[rfc] drm/ttm/memcg: simplest initial memcg/ttm integration
Christian König
christian.koenig at amd.com
Tue Apr 29 07:29:24 UTC 2025
On 4/28/25 21:31, Dave Airlie wrote:
> On Mon, 28 Apr 2025 at 20:43, Christian König <christian.koenig at amd.com> wrote:
>>
>> On 4/23/25 23:37, Dave Airlie wrote:
>>> Hey,
>>>
>>> I've been tasked to look into this, and I'm going to start from hopeless
>>> naivety and see how far I can get. This is an initial attempt to hook
>>> TTM system memory allocations into memcg and account for them.
>>
>> Yeah, this looks mostly like what we had already discussed.
>>
>>>
>>> It does:
>>> 1. Adds memcg GPU statistic,
>>> 2. Adds TTM memcg pointer for drivers to set on their user object
>>> allocation paths
>>> 3. Adds a singular path where we account for memory in TTM on cached
>>> non-pooled non-dma allocations. Cached memory allocations used to be
>>> pooled, but we dropped that a while back, which makes them the best
>>> target to start attacking this from.
>>
>> I think that should go into the resource like the existing dmem approach instead. That allows drivers to control the accounting through the placement which is far less error prone than the context.
>
> I'll reconsider this, but I'm not sure it'll work at that level,
> because we have to handle the fact that when something gets put back
> into the pool it gets removed from the cgroup kmem accounting and
> taken from the pool gets added to the cgroup kmem account, but
> otherwise we just use __GFP_ACCOUNT on allocations.
Especially for the user queue case a lot of those allocations are done from a background worker, where simply using __GFP_ACCOUNT doesn't work.
We need to track for each BO who created it and either switch to that group before allocations or just account to it directly.
> I've added cached
> pool support yesterday, which just leaves the dma paths which probably
> aren't too insane, but I'll re-evaluate this and see if higher level
> makes sense.
The DMA path is still used quite often, especially on laptops with APUs and limited addressing capabilities, as well as on basically all non-x86 architectures.
>
>>> 4. It only accounts for memory that is allocated directly from a userspace
>>> TTM operation (like page faults or validation). It *doesn't* account for
>>> memory allocated in eviction paths due to device memory pressure.
>>
>> Yeah, that's something I totally agree on.
>>
>> But the major show stopper is still that accounting to memcg will break existing userspace. E.g. display servers could be attacked with a denial of service that way.
>
> The thing with modern userspace is that I'm not sure this is a major
> problem out of the box. We usually run the display server and the user
> processes in the same cgroup, so they share limits. Most modern
> distros don't run X.org servers as root in a separate cgroup; even
> running X is usually in the same cgroup as its users. Android
> might have different opinions of course, but I'd probably suggest we
> Kconfig this stuff and let distros turn it on once we agree on a
> baseline.
>
>>>
>>> This seems to work for me here on my hacked up tests systems at least, I
>>> can see the GPU stats moving and they look sane.
>>>
>>> Future work:
>>> Account for pooled non-cached
>>> Account for pooled dma allocations (no idea how that looks)
>>> Figure out if accounting for eviction is possible, and what it might look
>>> like.
>>
>> T.J. suggested to account but don't limit the evictions and I think that should work.
>>
>
> I was going to introduce a gpu eviction stat counter as a start. I
> also got the idea, which might be a bit hard to pull off, that if a
> process needs to evict from VRAM but the original process has no
> space in its cgroup, we just fail the VRAM allocation for the current
> process, which didn't sound insane,
That is insane.
The problem is that you can't let the allocation of one process fail because another process has reached its limit.
That basically kills all reproducibility because userspace can't figure out why an allocation failed.
Regards,
Christian.
> but I haven't considered how
> implementing that in TTM might look.
>
> Dave.