[rfc] drm/ttm/memcg: simplest initial memcg/ttm integration (v2)
Johannes Weiner
hannes at cmpxchg.org
Fri May 16 16:41:50 UTC 2025
On Fri, May 16, 2025 at 05:35:12PM +0200, Christian König wrote:
> On 5/16/25 16:53, Johannes Weiner wrote:
> > On Fri, May 16, 2025 at 08:53:07AM +0200, Christian König wrote:
> >> The cgroup who originally allocated it has no reference to the
> >> memory any more and also no way of giving it back to the core
> >> system.
> >
> > Of course it does, the shrinker LRU.
>
> No it doesn't. The LRU handling here is global and not per cgroup.

Well, the discussion at hand is that it should be.

> > Listen, none of this is even remotely new. This isn't the first cache
> > we're tracking, and it's not the first consumer that can outlive the
> > controlling cgroup.
>
> Yes, I knew about all of that and I find that extremely questionable
> on existing handling as well.

This code handles billions of containers every day, but we'll be sure
to consult you on the next redesign.

> Memory pools which are only used to improve allocation performance
> are something the kernel handles transparently and are completely
> outside of any cgroup tracking whatsoever.

You're describing a cache. It doesn't matter whether it's caching CPU
work, IO work or network packets.

What matters is what it takes to recycle those pages for other
purposes - especially non-GPU purposes.

And more importantly, *what other memory in other cgroups they
displace in the meantime*.

It's really not that difficult to see an isolation issue here.

Anyway, it doesn't look like there is a lot of value in continuing
this conversation, so I'm going to check out of this subthread.