[PATCH RFC v4 07/16] drm, cgroup: Add total GEM buffer allocation limit

Fri Nov 29 07:18:15 UTC 2019

On Tue, Oct 1, 2019 at 10:30 AM Michal Koutný <mkoutny at suse.com> wrote:
> On Thu, Aug 29, 2019 at 02:05:24AM -0400, Kenny Ho <Kenny.Ho at amd.com> wrote:
> > drm.buffer.default
> >         A read-only flat-keyed file which exists on the root cgroup.
> >         Each entry is keyed by the drm device's major:minor.
> >
> >         Default limits on the total GEM buffer allocation in bytes.
> What is the purpose of this attribute (and alikes for other resources)?
> I can't see it being set differently but S64_MAX in
> drmcg_device_early_init.

cgroup has a number of conventions and one of which is the idea of a
default.  The idea here is to allow for device specific defaults.  For
this specific resource, I can probably not expose it since it's not
particularly useful, but for other resources (such as the lgpu
resource) the concept of a default is useful (for example, different
devices can have different number of lgpu.)

> > +static ssize_t drmcg_limit_write(struct kernfs_open_file *of, char *buf,
> > [...]
> > +             switch (type) {
> > +             case DRMCG_TYPE_BO_TOTAL:
> > +                     p_max = parent == NULL ? S64_MAX :
> > +                             parent->dev_resources[minor]->
> > +                             bo_limits_total_allocated;
> > +
> > +                     rc = drmcg_process_limit_s64_val(sattr, true,
> > +                             props->bo_limits_total_allocated_default,
> > +                             p_max,
> > +                             &val);
> IIUC, this allows initiating the particular limit value based either on
> parent or the default per-device value. This is alas rather an
> antipattern. The most stringent limit on the path from a cgroup to the
> root should be applied at the charging time. However, the child should
> not inherit the verbatim value from the parent (may race with parent and
> it won't be updated upon parent change).
I think this was a mistake during one of my refactor and I shrunk the
critical section protected by a mutex a bit too much.  But you are
right in the sense that I don't propagate the limits downward to the
children when the parent's limit is updated.  But from the user
interface perspective, wouldn't this be confusing?  When a sysadmin
sets a limit using the 'max' keyword, the value would be a global one
even though the actual allowable maximum for the particular cgroup is
less in reality because of the ancestor cgroups?  (If this is the
established norm, I am ok to go along but seems confusing to me.)  I
am probably missing something because as I implemented this, the 'max'
and 'default' semantic has been confusing to me especially for the
children cgroups due to the context of the ancestors.

> You already do the appropriate hierarchical check in
> drmcg_try_chb_bo_alloc, so the parent propagation could be simply
> dropped if I'm not mistaken.
I will need to double check.  But I think interaction between parent
and children (or perhaps between siblings) will be needed eventually
because there seems to be a desire to implement "weight" type of
resource.  Also, from performance perspective, wouldn't it make more
sense to make sure the limits are set correctly during configuration
than to have to check all the cgroups up through the parents?  I don't
have comprehensive knowledge of the implementation of other cgroup
controllers so if more experience folks can comment that would be
great.  (Although, I probably should just do one approach instead of
doing both... or 1.5.)

>
> Also, I can't find how the read of
> parent->dev_resources[minor]->bo_limits_total_allocated and its
> concurrent update are synchronized (i.e. someone writing
> buffer.total.max for parent and child in parallel). (It may just my
> oversight.)
This is probably the refactor mistake I mentioned earlier.

Regards,
Kenny