[Intel-gfx] [PATCH 09/21] drm/i915/gem: Disallow creating contexts with too many engines

Jason Ekstrand jason at jlekstrand.net
Thu Apr 29 19:16:54 UTC 2021


On Thu, Apr 29, 2021 at 3:01 AM Tvrtko Ursulin
<tvrtko.ursulin at linux.intel.com> wrote:
>
>
> On 28/04/2021 18:09, Jason Ekstrand wrote:
> > On Wed, Apr 28, 2021 at 9:26 AM Tvrtko Ursulin
> > <tvrtko.ursulin at linux.intel.com> wrote:
> >> On 28/04/2021 15:02, Daniel Vetter wrote:
> >>> On Wed, Apr 28, 2021 at 11:42:31AM +0100, Tvrtko Ursulin wrote:
> >>>>
> >>>> On 28/04/2021 11:16, Daniel Vetter wrote:
> >>>>> On Fri, Apr 23, 2021 at 05:31:19PM -0500, Jason Ekstrand wrote:
> >>>>>> There's no sense in allowing userspace to create more engines than it
> >>>>>> can possibly access via execbuf.
> >>>>>>
> >>>>>> Signed-off-by: Jason Ekstrand <jason at jlekstrand.net>
> >>>>>> ---
> >>>>>>     drivers/gpu/drm/i915/gem/i915_gem_context.c | 7 +++----
> >>>>>>     1 file changed, 3 insertions(+), 4 deletions(-)
> >>>>>>
> >>>>>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> >>>>>> index 5f8d0faf783aa..ecb3bf5369857 100644
> >>>>>> --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> >>>>>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> >>>>>> @@ -1640,11 +1640,10 @@ set_engines(struct i915_gem_context *ctx,
> >>>>>>                     return -EINVAL;
> >>>>>>             }
> >>>>>> -  /*
> >>>>>> -   * Note that I915_EXEC_RING_MASK limits execbuf to only using the
> >>>>>> -   * first 64 engines defined here.
> >>>>>> -   */
> >>>>>>             num_engines = (args->size - sizeof(*user)) / sizeof(*user->engines);
> >>>>>
> >>>>> Maybe add a comment like /* RING_MASK has no shift, so can be used
> >>>>> directly here */ since I had to check that :-)
> >>>>>
> >>>>> Same story about igt testcases needed, just to be sure.
> >>>>>
> >>>>> Reviewed-by: Daniel Vetter <daniel.vetter at ffwll.ch>
> >>>>
> >>>> I am not sure about the churn vs benefit ratio here. There are also patches
> >>>> which extend the engine selection field in execbuf2 over the unused
> >>>> constants bits (with an explicit flag). So churn upstream and churn in
> >>>> internal (if interesting) for not much benefit.
> >>>
> >>> This isn't churn.
> >>>
> >>> This is "lock down uapi properly".
> >
> > Pretty much.
>
> Still haven't heard what concrete problems it solves.
>
> >> IMO it is a "meh" patch. Doesn't fix any problems and will create work
> >> for other people and man hours spent which no one will ever properly
> >> account against.
> >>
> >> Number of contexts in the engine map should not really be tied to
> >> execbuf2. As is demonstrated by the incoming work to address more than
> >> 63 engines, either as an extension to execbuf2 or future execbuf3.
> >
> > Which userspace driver has requested more than 64 engines in a single context?
>
> No need to artificially limit hardware capabilities in the uapi by
> implementing a policy in the kernel. Which will need to be
> removed/changed shortly anyway. This particular patch is work and
> creates more work (which other people who will get to fix the fallout
> will spend man hours to figure out what and why broke) for no benefit.
> Or you are yet to explain what the benefit is in concrete terms.

You keep complaining about how much work it takes and yet I've spent
more time replying to your e-mails on this patch than I spent writing
the patch and the IGT test.  Also, if it takes so much time to add a
restriction, then why are we spending time figuring out how to modify
the uAPI to allow you to execbuf on a context with more than 64
engines?  If we're worried about engineering man-hours, then limiting
to 64 IS the pragmatic solution.

> Why don't you limit it to number of physical engines then? Why don't you
> filter out duplicates? Why not limit the number of buffer objects per
> client or global based on available RAM + swap relative to minimum
> object size? Reductio ad absurdum yes, but illustrating the, in this
> case, thin line between "locking down uapi" and adding too much policy
> where it is not appropriate.

All this patch does is say that you're not allowed to create a
context with more engines than the execbuf API will let you use.  We
already have an artificial limit.  All this does is push the error
handling further up the stack.  If someone comes up with a mechanism
to execbuf on engine 65 (they'd better have an open-source user if it
involves changing API), I'm very happy for them to bump this limit at
the same time.  It'll take them 5 minutes and it'll be something they
find while writing the IGT test.
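
To make the restriction concrete, here is a minimal standalone sketch of
the kind of check being discussed. It is not the code from the patch: the
struct layouts, the EXEC_RING_MASK name and count_engines() are
hypothetical stand-ins, and the only values taken from the real uAPI are
the 0x3f ring mask and the size arithmetic quoted in the diff above.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Same value as I915_EXEC_RING_MASK in the i915 uAPI header: the low six
 * bits of the execbuf flags select the engine index, so at most 64 engines
 * are addressable. The mask has no shift, so it can be compared directly. */
#define EXEC_RING_MASK 0x3fu

/* Hypothetical stand-ins for the uAPI structs; only their sizes matter. */
struct engine_class_instance {
	uint16_t engine_class;
	uint16_t engine_instance;
};

struct engines_param {
	uint64_t extensions;
	struct engine_class_instance engines[];
};

/* Returns the number of engines described by a user buffer of `size`
 * bytes, or -EINVAL if the size is malformed or names more engines than
 * execbuf could ever select. */
static int count_engines(size_t size)
{
	size_t payload, num_engines;

	if (size < sizeof(struct engines_param))
		return -EINVAL;

	payload = size - sizeof(struct engines_param);
	if (payload % sizeof(struct engine_class_instance))
		return -EINVAL;

	num_engines = payload / sizeof(struct engine_class_instance);

	/* Reject at context setup rather than at execbuf time. */
	if (num_engines > EXEC_RING_MASK + 1)
		return -EINVAL;

	return (int)num_engines;
}
```

If the selector field is ever widened, the only line that would need to
change in a sketch like this is the comparison against EXEC_RING_MASK + 1.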

> > Also, for execbuf3, I'd like to get rid of contexts entirely and have
> > engines be their own userspace-visible object.  If we go this
> > direction, you can have UINT32_MAX of them.  Problem solved.
>
> Not the problem I am pointing at though.

You listed two ways that accessing engine 65 can happen: Extending
execbuf2 and adding a new execbuf3.  When/if execbuf3 happens, as I
pointed out above, it'll hopefully be a non-issue.  If someone extends
execbuf2 to support more than 64 engines and does not have a userspace
customer that wants said new API change, I will NAK the patch.  If
you've got a 3rd way that someone can get at engine 65 such that this
is a problem, I'd love to hear about it.

--Jason
