[PATCH 09/11] drm, cgroup: Introduce lgpu as DRM cgroup resource

Wed Feb 19 16:18:50 UTC 2020

On Fri, Feb 14, 2020 at 02:17:54PM -0500, Tejun Heo wrote:
> Hello, Kenny, Daniel.
> 
> (cc'ing Johannes)
> 
> On Fri, Feb 14, 2020 at 01:51:32PM -0500, Kenny Ho wrote:
> > On Fri, Feb 14, 2020 at 1:34 PM Daniel Vetter <daniel at ffwll.ch> wrote:
> > >
> > > I think guidance from Tejun in previos discussions was pretty clear that
> > > he expects cgroups to be both a) standardized and c) sufficient clear
> > > meaning that end-users have a clear understanding of what happens when
> > > they change the resource allocation.
> > >
> > > I'm not sure lgpu here, at least as specified, passes either.
> > 
> > I disagree (at least on the characterization of the feedback
> > provided.)  I believe this series satisfied the sprite of Tejun's
> > guidance so far (the weight knob for lgpu, for example, was
> > specifically implemented base on his input.)  But, I will let Tejun
> > speak for himself after he considered the implementation in detail.
> 
> I have to agree with Daniel here. My apologies if I weren't clear
> enough. Here's one interface I can think of:
> 
>  * compute weight: The same format as io.weight. Proportional control
>    of gpu compute.
> 
>  * memory low: Please see how the system memory.low behaves. For gpus,
>    it'll need per-device entries.
> 
> Note that for both, there one number to configure and conceptually
> it's pretty clear to everybody what that number means, which is not to
> say that it's clear to implement but it's much better to deal with
> that on this side of the interface than the other.
> 
> cc'ing Johannes. Do you have anything on mind regarding how gpu memory
> configuration should look like? e.g. should it go w/ weights rather
> than absoulte units (I don't think so given that it'll most likely
> need limits at some point too but still and there are benefits from
> staying consistent with system memory).

Yes, I'd go with absolute units when it comes to memory, because it's
not a renewable resource like CPU and IO, and so we do have cliff
behavior around the edge where you transition from ok to not-enough.

memory.low is a bit in flux right now, so if anything is unclear
around its semantics, please feel free to reach out.