DRM cgroups integration (Was: Re: [PATCH v4 0/8] cgroup private data and DRM/i915 integration)

Thu Apr 5 13:46:51 UTC 2018

+ Dave for commenting from DRM subsystem perspective. I strongly believe
there would be benefit from agreeing on some foundation of DRM subsystem
level program GPU niceness [-20,19] and memory limit [0,N] pages.

Quoting Matt Roper (2018-03-30 03:43:13)
> On Mon, Mar 26, 2018 at 10:30:23AM +0300, Joonas Lahtinen wrote:
> > Quoting Matt Roper (2018-03-23 17:46:16)
> > > On Fri, Mar 23, 2018 at 02:15:38PM +0200, Joonas Lahtinen wrote:
> > > > Quoting Matt Roper (2018-03-17 02:08:57)
> > > > > This is the fourth iteration of the work previously posted here:
> > > > >   (v1) https://lists.freedesktop.org/archives/intel-gfx/2018-January/153156.html
> > > > >   (v2) https://www.mail-archive.com/dri-devel@lists.freedesktop.org/msg208170.html
> > > > >   (v3) https://lists.freedesktop.org/archives/intel-gfx/2018-March/157928.html
> > > > > 
> > > > > The high level goal of this work is to allow non-cgroup-controller parts
> > > > > of the kernel (e.g., device drivers) to register their own private
> > > > > policy data for specific cgroups.  That mechanism is then made use of in
> > > > > the i915 graphics driver to allow GPU priority to be assigned according
> > > > > to the cgroup membership of the owning process.  Please see the v1 cover
> > > > > letter linked above for a more in-depth explanation and justification.
> > > > 
> > > > Hi Matt,
> > > > 
> > > > For cross-subsystem changes such as this, it makes sense to Cc all
> > > > relevant maintainers, especially if there have been previous comments to
> > > > earlier revisions.
> > > > 
> > > > Please, do include and keep a reference to the userspace portion of the
> > > > changes when you suggest new uAPI to be added. At least I have some trouble
> > > > trying to track down the relevant interface consumer here.
> > > > 
> > > > I'm unsure how much sense it makes to commence with detailed i915 review
> > > > if we will be blocked by lack of userspace after that? I'm assuming
> > > > you've read through [1] already.
> > > 
> > > Hi Joonas,
> > > 
> > > I've sent the userspace code out a few times, but it looks like I forgot
> > > to include a copy with v4/v4.5.  Here's the version I provided with v3:
> > >   https://lists.freedesktop.org/archives/intel-gfx/2018-March/157935.html
> > 
> > Thanks. Keeping that in the relevant commit message of the patch that
> > introduces the new uAPI will make it harder to forget and easiest for
> > git blame, too.
> > 
> > > 
> > > Usually we don't consider things like i-g-t to be sufficient userspace
> > > consumers because we need a real-world consumer rather than a "toy"
> > > userspace.  However in this case, the i-g-t tool, although very simple,
> > > is really the only userspace consumer I expect there to ever be.
> > > Ultimately the true consumer of this cgroups work are bash scripts, sysv
> > > init scripts, systemd recipes, etc.  that just need a very simple tool
> > > to assign the specific values that make sense on a given system.
> > > There's no expectation that graphics clients or display servers would
> > > ever need to make use of these interfaces.
> > 
> > I was under the impression that a bit more generic GPU cgroups support
> > was receiving a lot of support in the early discussion? A dedicated
> > intel_cgroup sounds underwhelming, when comparing to idea of "gpu_nice",
> > for user adoption :)
> 
> I'm open to moving the cgroup_priv registration/lookup to the DRM core
> if other drivers are interested in using this mechanism and if we can
> come to an agreement on a standard priority offset range to support, how
> display boost should work for all drivers, etc.  There might be some
> challenges mapping a DRM-defined priority range down to a different
> range that makes sense for individual driver schedulers, especially
> since some drivers already expose a different priority scheme to
> userspace via other interfaces like i915 does with GEM context priority.
> 
> So far I haven't really heard any interest outside the Intel camp, but
> hopefully other driver teams can speak up if they're for/against this.
> I don't want to try to artificially standardize this if other drivers
> want to go a different direction with priority/scheduling that's too
> different from the current Intel-driven design.

I don't think there are that many directions to go about GPU context
priority, considering we have the EGL_IMG_context_priority extension, so
it'll only be about granularity of the scale.

I would suggest to go with the nice like scale for easy user adoption,
then just apply that as the N most significant bits.

The contexts could then of course further adjust their priority from what
is set by the "gpu_nice" application with the remaining bits.

I'm strongly feeling this should be a DRM level "gpu_nice". And the
binding to cgroups should come through DRM core. If it doesn't, limiting
the amount of memory used becomes awkward as the allocation is
centralized to DRM core.

> > Also, I might not be up-to-date about all things cgroups, but the way
> > intel_cgroup works, feels bit forced. We create a userspace context just
> > to communicate with the driver and the IOCTL will still have global
> > effects. I can't but think that i915 reading from the cgroups subsystem
> > for the current process would feel more intuitive to me.
> 
> I think you're referring to the earlier discussion about exposing
> priority directly via the cgroups filesystem?  That would certainly be
> simpler from a userspace perspective, but it's not the direction that
> the cgroups maintainer wants to see things go.  Adding files directly to
> the cgroups filesystem is supposed to be something that's reserved for
> official cgroups controllers.  The GPU priority concept we're trying to
> add here doesn't align with the requirements for creating a controller,
> so the preferred approach is to create a custom interface (syscall or
> ioctl) that simply takes a cgroup as a parameter.  There's precendent
> with similar interfaces in areas like BPF (where the bpf() system call
> can accept a cgroup as a parameter and then perform its own private
> policy changes as it sees fit).
> 
> Using a true cgroups controller and exposing settings via the filesystem
> is likely still the way we'll want to go for some other types of
> cgroups-based policy in the future (e.g., limiting GPU memory usage); it
> just isn't the appropriate direction for priority.

Might be just me but feels bit crazy to be setting GPU memory usage
through another interface and then doing i915 specific IOCTLs to control
the priority of that same cgroup.

I don't feel comfortable adding custom cgroups dependent IOCTLs to i915
where cgroups is only working as the variable carrier in background. We
should really just be consuming a variable from cgroups and it should be
set outside of of the i915 IOCTL interface.

I'm still seeing that we should have a DRM cgroups controller and a DRM
subsystem wide application to control the priority and memory usage
to be fed to the drivers.

If we end up just supporting i915 apps, we could as well use LD_PRELOAD
wrapper and alter the context priority at creation time for exactly the
same effect and no extra interfaces to maintain.

> > Does the implementation mimic some existing cgroups tool or de-facto way
> > of doing things in cgroups world?
> 
> The ioctl approach I took is similar to syscall approach that the BPF
> guys use to attach BPF programs to a cgroup.  I'm not very familiar with
> BPF or how it gets used from userspace, so I'm not sure whether the
> interface is intended for one specific tool (like ours is), or whether
> there's more variety for userspace consumers.

Is the proposal to set the memory usage from similar interface, or is
that still not implemented?

I'm seeing a very close relation between time-slicing GPU time and
allowed GPU buffer allocations, so having two completely different
interfaces does just feel very hackish way of implementing this.

Regards, Joonas

> 
> 
> Matt
> 
> > 
> > Regards, Joonas
> > --
> > To unsubscribe from this list: send the line "unsubscribe cgroups" in
> > the body of a message to majordomo at vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> -- 
> Matt Roper
> Graphics Software Engineer
> IoTG Platform Enabling & Development
> Intel Corporation
> (916) 356-2795