[RFC PATCH v2 4/5] drm, cgroup: Add total GEM buffer allocation limit

Thu May 16 07:16:48 UTC 2019

Am 16.05.19 um 04:29 schrieb Kenny Ho:
> [CAUTION: External Email]
>
> On Wed, May 15, 2019 at 5:26 PM Welty, Brian <brian.welty at intel.com> wrote:
>> On 5/9/2019 2:04 PM, Kenny Ho wrote:
>>> There are four control file types,
>>> stats (ro) - display current measured values for a resource
>>> max (rw) - limits for a resource
>>> default (ro, root cgroup only) - default values for a resource
>>> help (ro, root cgroup only) - help string for a resource
>>>
>>> Each file is multi-lined with one entry/line per drm device.
>> Multi-line is correct for multiple devices, but I believe you need
>> to use a KEY to denote device for both your set and get routines.
>> I didn't see your set functions reading a key, or the get functions
>> printing the key in output.
>> cgroups-v2 conventions mention using KEY of major:minor, but I think
>> you can use drm_minor as key?
> Given this controller is specific to the drm kernel subsystem which
> uses minor to identify drm device,

Wait a second, using the DRM minor is a good idea in the first place.

I have a test system with a Vega10 and a Vega20. Which device gets which 
minor is not stable, but rather defined by the scan order of the PCIe bus.

Normally the scan order is always the same, but adding or removing 
devices or delaying things just a little bit during init is enough to 
change this.

We need something like the Linux sysfs location or similar to have a 
stable implementation.

Regards,
Christian.

>   I don't see a need to complicate
> the interfaces more by having major and a key.  As you can see in the
> examples below, the drm device minor corresponds to the line number.
> I am not sure how strict cgroup upstream is about the convention but I
> am hoping there are flexibility here to allow for what I have
> implemented.  There are a couple of other things I have done that is
> not described in the convention: 1) inclusion of read-only *.help file
> at the root cgroup, 2) use read-only (which I can potentially make rw)
> *.default file instead of having a default entries (since the default
> can be different for different devices) inside the control files (this
> way, the resetting of cgroup values for all the drm devices, can be
> done by a simple 'cp'.)
>
>>> Usage examples:
>>> // set limit for card1 to 1GB
>>> sed -i '2s/.*/1073741824/' /sys/fs/cgroup/<cgroup>/drm.buffer.total.max
>>>
>>> // set limit for card0 to 512MB
>>> sed -i '1s/.*/536870912/' /sys/fs/cgroup/<cgroup>/drm.buffer.total.max
>
>>>   /** @file drm_gem.c
>>> @@ -154,6 +156,9 @@ void drm_gem_private_object_init(struct drm_device *dev,
>>>        obj->handle_count = 0;
>>>        obj->size = size;
>>>        drm_vma_node_reset(&obj->vma_node);
>>> +
>>> +     obj->drmcgrp = get_drmcgrp(current);
>>> +     drmcgrp_chg_bo_alloc(obj->drmcgrp, dev, size);
>> Why do the charging here?
>> There is no backing store yet for the buffer, so this is really tracking something akin to allowed virtual memory for GEM objects?
>> Is this really useful for an administrator to control?
>> Isn't the resource we want to control actually the physical backing store?
> That's correct.  This is just the first level of control since the
> backing store can be backed by different type of memory.  I am in the
> process of adding at least two more resources.  Stay tuned.  I am
> doing the charge here to enforce the idea of "creator is deemed owner"
> at a place where the code is shared by all (the init function.)
>
>>> +     while (i <= max_minor && limits != NULL) {
>>> +             sval =  strsep(&limits, "\n");
>>> +             rc = kstrtoll(sval, 0, &val);
>> Input should be "KEY VALUE", so KEY will determine device to apply this to.
>> Also, per cgroups-v2 documentation of limits, I believe need to parse and handle the special "max" input value.
>>
>> parse_resources() in rdma controller is example for both of above.
> Please see my previous reply for the rationale of my hope to not need
> a key.  I can certainly add handling of "max" and "default".
>
>
>>> +void drmcgrp_chg_bo_alloc(struct drmcgrp *drmcgrp, struct drm_device *dev,
>>> +             size_t size)
>> Shouldn't this return an error and be implemented with same semantics as the
>> try_charge() functions of other controllers?
>> Below will allow stats_total_allocated to overrun limits_total_allocated.
> This is because I am charging the buffer at the init of the buffer
> which does not fail so the "try" (drmcgrp_bo_can_allocate) is separate
> and placed earlier and nearer other condition where gem object
> allocation may fail.  In other words, there are multiple possibilities
> for which gem allocation may fail (cgroup limit being one of them) and
> satisfying cgroup limit does not mean a charge is needed.  I can
> certainly combine the two functions to have an additional try_charge
> semantic as well if that is really needed.
>
> Regards,
> Kenny