[PATCH RFC v4 00/16] new cgroup controller for gpu/drm subsystem

Michal Hocko mhocko at kernel.org
Tue Sep 10 11:54:48 UTC 2019


On Fri 06-09-19 08:45:39, Tejun Heo wrote:
> Hello, Daniel.
> 
> On Fri, Sep 06, 2019 at 05:34:16PM +0200, Daniel Vetter wrote:
> > > Hmm... what'd be the fundamental difference from slab or socket memory
> > > which are handled through memcg?  Is system memory used by GPUs have
> > > further global restrictions in addition to the amount of physical
> > > memory used?
> > 
> > Sometimes, but that would be specific resources (kinda like vram),
> > e.g. CMA regions used by a gpu. But probably not something you'll run
> > in a datacenter and want cgroups for ...
> > 
> > I guess we could try to integrate with the memcg group controller. One
> > trouble is that aside from i915 most gpu drivers do not really have a
> > full shrinker, so not sure how that would all integrate.
> 
> So, while it'd great to have shrinkers in the longer term, it's not a
> strict requirement to be accounted in memcg.  It already accounts a
> lot of memory which isn't reclaimable (a lot of slabs and socket
> buffer).

Yeah, having a shrinker is preferred but the memory should better be
reclaimable in some form. If not by any other means then at least bound
to a user process context so that it goes away with a task being killed
by the OOM killer. If that is not the case then we cannot really charge
it because then the memcg controller is of no use. We can tolerate it to
some degree if the amount of memory charged like that is negligible to
the overall size. But from the discussion it seems that these buffers
are really large.
-- 
Michal Hocko
SUSE Labs


More information about the amd-gfx mailing list