[rfc] drm/ttm/memcg: simplest initial memcg/ttm integration (v2)
Christian König
christian.koenig at amd.com
Sun May 18 16:28:43 UTC 2025
On 5/16/25 22:25, Dave Airlie wrote:
> On Sat, 17 May 2025 at 06:04, Johannes Weiner <hannes at cmpxchg.org> wrote:
>>> The memory properties are similar to what GFP_DMA or GFP_DMA32
>>> provide.
>>>
>>> The reasons we haven't moved this into the core memory management is
>>> because it is completely x86 specific and only used by a rather
>>> specific group of devices.
>>
>> I fully understand that. It's about memory properties.
>>
>> What I think you're also saying is that the best solution would be
>> that you could ask the core MM for pages with a specific property, and
>> it would hand you pages that were previously freed with those same
>> properties. Or, if none such pages are on the freelists, it would grab
>> free pages with different properties and convert them on the fly.
>>
>> For all intents and purposes, this free memory would then be trivially
>> fungible between drm use, non-drm use, and different cgroups - except
>> for a few CPU cycles when converting but that's *probably* negligible?
>> And now you could get rid of the "hack" in drm and didn't have to hang
>> on to special-property pages and implement a shrinker at all.
>>
>> So far so good.
>>
>> But that just isn't the implementation of today. And the devil is very
>> much in the details with this:
>>
>> Your memory attribute conversions are currently tied to a *shrinker*.
>>
>> This means the conversion doesn't trivially happen in the allocator,
>> it happens from *reclaim context*.
Ah! At least I now understand your concern here.
>> Now *your* shrinker is fairly cheap to run, so I do understand when
>> you're saying in exasperation: We give this memory back if somebody
>> needs it for other purposes. What *is* the big deal?
>>
>> The *reclaim context* is the big deal. The problem is *all the other
>> shrinkers that run at this time as well*. Because you held onto those
>> pages long enough that they contributed to a bonafide, general memory
>> shortage situation. And *that* has consequences for other cgroups.
No, it doesn't, or at least not as much as you think.
We have gone back and forth on this multiple times already when discussing the shrinker implementations. See the DRM mailing list threads about both the TTM and the GEM shmem shrinker.
The TTM pool shrinker is basically just a nice-to-have feature which is there to avoid denial-of-service attacks and to kick in when use cases change, e.g. between installing software (gcc) and running software (Blender, ROCm, etc.).
In other words, the TTM shrinker is not even optimized and spends tons of extra CPU cycles, because the expectation is that it never really triggers in practice.
> I think this is where we have 2 options:
> (a) moving this stuff into core mm and out of shrinker context
> (b) fix our shrinker to be cgroup aware and solve that first.
(c) give better priorities to the shrinker API.
The shrinker API, for example, assumes that its users must scan their objects to be able to clean them up.
But implementations like the TTM pool can basically just throw away as many pages as necessary, without scanning anything.
So being able to tell the shrinker core "please ask us first on reclaim" would completely solve the problem.
That was considered before but never implemented, because it's basically just a nice to have and most likely not really important.
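To make that concrete, here is a rough sketch, not the actual TTM code: a pool shrinker whose scan callback simply drops cached pages instead of walking and inspecting objects. All my_pool_* names are made up for illustration, and the registration assumes the current shrinker_alloc()/shrinker_register() interface.

/*
 * Illustration only: a page pool that frees pages on reclaim without
 * scanning anything.  my_pool_* is a made-up example, not TTM code.
 */
#include <linux/list.h>
#include <linux/mm.h>
#include <linux/shrinker.h>
#include <linux/spinlock.h>

static LIST_HEAD(my_pool_pages);		/* e.g. write-combined pages */
static DEFINE_SPINLOCK(my_pool_lock);
static unsigned long my_pool_count;

static unsigned long my_pool_shrink_count(struct shrinker *shrink,
					  struct shrink_control *sc)
{
	return my_pool_count ? my_pool_count : SHRINK_EMPTY;
}

static unsigned long my_pool_shrink_scan(struct shrinker *shrink,
					 struct shrink_control *sc)
{
	unsigned long freed = 0;

	spin_lock(&my_pool_lock);
	while (freed < sc->nr_to_scan && !list_empty(&my_pool_pages)) {
		struct page *p = list_first_entry(&my_pool_pages,
						  struct page, lru);

		list_del(&p->lru);
		my_pool_count--;
		spin_unlock(&my_pool_lock);

		/* A real pool would restore the caching attributes
		 * (e.g. set_pages_array_wb() on x86) before handing the
		 * page back to the buddy allocator. */
		__free_page(p);
		freed++;

		spin_lock(&my_pool_lock);
	}
	spin_unlock(&my_pool_lock);

	return freed;
}

static struct shrinker *my_pool_shrinker;

static int my_pool_shrinker_init(void)
{
	my_pool_shrinker = shrinker_alloc(0, "my-pool");
	if (!my_pool_shrinker)
		return -ENOMEM;

	my_pool_shrinker->count_objects = my_pool_shrink_count;
	my_pool_shrinker->scan_objects = my_pool_shrink_scan;
	shrinker_register(my_pool_shrinker);
	return 0;
}

There is no "ask us first" priority in struct shrinker today; that hint would be the new part of option (c).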
> The main question I have for Christian, is can you give me a list of
> use cases that this will seriously negatively effect if we proceed
> with (b).
It would basically render the whole TTM pool useless, or at least massively limit its usefulness.
See, the main benefit is being able to quickly allocate buffers for HW use cases which need them, e.g. scanout on APUs, PSP for secure playback, etc.
The idea is that when you alt+tab or swipe between applications, the new application can just grab the memory the previous application has just released.
And yes, it is explicitly required that those applications can be in different cgroups.
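Roughly what that looks like, again just a sketch reusing the made-up my_pool_* pool from above: the fast path hands out a page that is already write-combined, no matter which process (or cgroup) freed it, while the slow path has to pay for an attribute change that flushes caches and shoots down TLBs on all CPUs.

/* Illustration only, building on the my_pool_* declarations above. */
#include <asm/set_memory.h>

static struct page *my_pool_alloc_wc(gfp_t gfp)
{
	struct page *p = NULL;

	spin_lock(&my_pool_lock);
	if (!list_empty(&my_pool_pages)) {
		/* Fast path: reuse a page the previous application just
		 * released, the caching attributes are already correct. */
		p = list_first_entry(&my_pool_pages, struct page, lru);
		list_del(&p->lru);
		my_pool_count--;
	}
	spin_unlock(&my_pool_lock);
	if (p)
		return p;

	/* Slow path: fresh page plus a global, expensive attribute
	 * change (cache flush + TLB shootdown on x86). */
	p = alloc_page(gfp);
	if (p && set_pages_array_wc(&p, 1)) {
		__free_page(p);
		p = NULL;
	}
	return p;
}

If the pool were split per cgroup, the fast path would only hit when the freeing and the allocating application are in the same cgroup, which is exactly not the alt+tab case.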
> From my naive desktop use case and HPC use case scenarios, I'm not
> seeing a massive hit, now maybe I see more consistency from an
> application overheads inside a cgroup.
Yeah, for HPC it is most likely completely irrelevant; for desktop it might have some minor use cases.
But the killer argument is that we do have some cloud gaming and embedded use cases where it is really important to get this right.
> Android? I've no idea.
Mainline Android currently has its own complete way of doing mostly the same thing cgroups do, but in userspace.
The problem is that this doesn't account for memory allocated in kernel space. See the discussion on DMA-buf accounting with T.J. and the older discussion with Greg (sysfs) about that.
In my opinion it would make sense to just use cgroups for a lot of that as well, but we would need to convince Google of that.
Regards,
Christian.
> Like what can we live with here, vs what needs to be a Kconfig option
> vs what needs to be a kernel command line option,
>
> I'm also happy to look at (a) but I think for (a) it's not just
> uncached pool that is the problem, the dma pools will be harder to
> deal with.
>
> Dave.