[Intel-gfx] [PATCH 16/39] drm/i915/execlists: Virtual engine bonding

Fri Mar 15 09:45:17 UTC 2019

Quoting Tvrtko Ursulin (2019-03-14 17:26:19)
> 
> On 13/03/2019 14:43, Chris Wilson wrote:
> > Some users require that when a master batch is executed on one particular
> > engine, a companion batch is run simultaneously on a specific slave
> > engine. For this purpose, we introduce virtual engine bonding, allowing
> > maps of master:slaves to be constructed to constrain which physical
> > engines a virtual engine may select given a fence on a master engine.
> > 
> > For the moment, we continue to ignore the issue of preemption deferring
> > the master request for later. Ideally, we would like to then also remove
> > the slave and run something else rather than have it stall the pipeline.
> > With load balancing, we should be able to move workload around it, but
> > there is a similar stall on the master pipeline while it may wait for
> > the slave to be executed. At the cost of more latency for the bonded
> > request, it may be interesting to launch both on their engines in
> > lockstep. (Bubbles abound.)
> > 
> > Opens: Also what about bonding an engine as its own master? It doesn't
> > break anything internally, so allow the silliness.
> > 
> > v2: Emancipate the bonds
> > v3: Couple in delayed scheduling for the selftests
> > v4: Handle invalid mutually exclusive bonding
> > v5: Mention what the uapi does
> > 
> > Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> > ---
> >   drivers/gpu/drm/i915/i915_gem_context.c       |  50 +++++
> >   drivers/gpu/drm/i915/i915_request.c           |   1 +
> >   drivers/gpu/drm/i915/i915_request.h           |   1 +
> >   drivers/gpu/drm/i915/intel_engine_types.h     |   7 +
> >   drivers/gpu/drm/i915/intel_lrc.c              | 143 ++++++++++++++
> >   drivers/gpu/drm/i915/intel_lrc.h              |   4 +
> >   drivers/gpu/drm/i915/selftests/intel_lrc.c    | 185 ++++++++++++++++++
> >   drivers/gpu/drm/i915/selftests/lib_sw_fence.c |   3 +
> >   include/uapi/drm/i915_drm.h                   |  33 ++++
> >   9 files changed, 427 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> > index 98763d3f1b12..0ec78c386473 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_context.c
> > +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> > @@ -1513,8 +1513,58 @@ set_engines__load_balance(struct i915_user_extension __user *base, void *data)
> >       return 0;
> >   }
> >   
> > +static int
> > +set_engines__bond(struct i915_user_extension __user *base, void *data)
> > +{
> > +     struct i915_context_engines_bond __user *ext =
> > +             container_of_user(base, typeof(*ext), base);
> > +     const struct set_engines *set = data;
> > +     struct intel_engine_cs *master;
> > +     u32 class, instance, siblings;
> 
> u16 class, instance for no real gain.

Ah, forgot to change types after we consolidated on using u16 for the
engine class/instance. But as I've discovered, we can just use unsigned
int, the type doesn't have to exactly match the __user :)

> > @@ -3218,12 +3251,35 @@ static void virtual_submission_tasklet(unsigned long data)
> >               return;
> >   
> >       local_irq_disable();
> > +
> > +     mask = 0;
> > +     spin_lock(&ve->base.timeline.lock);
> > +     if (ve->request) {
> > +             mask = ve->request->execution_mask;
> > +             if (unlikely(!mask))
> > +                     virtual_submit_error(ve);
> 
> What clears the mask? And virtual_submit_error fails it then?

The user may over-constrain the request with multiple submit-fences, and
the intersection of those master:slave maps may be empty (mask == 0).
virtual_submit_error() marks the request as invalid, and puts it on a
random queue.
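
As a rough illustration of that over-constraint case, here is a userspace
sketch, not the i915 code: struct bond, constrain_mask(),
apply_submit_fences() and the engine indices are all made up for the
example, and it assumes each submit-fence simply ANDs the bonded sibling
mask for its master into the request's execution mask.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a master:slaves bond; not the i915 struct. */
struct bond {
	unsigned int master;	/* index of the master engine */
	uint32_t sibling_mask;	/* physical engines allowed for the slave */
};

/*
 * Narrow @mask by the bond registered for @master, if any. A master
 * without a bond leaves the mask untouched.
 */
static uint32_t constrain_mask(uint32_t mask, const struct bond *bonds,
			       size_t count, unsigned int master)
{
	size_t i;

	assert(bonds || !count);

	for (i = 0; i < count; i++) {
		if (bonds[i].master == master)
			return mask & bonds[i].sibling_mask;
	}

	return mask;
}

/*
 * Apply one submit-fence per entry in @masters, in order. A result of 0
 * means the bonds were mutually exclusive and no engine can run the
 * request.
 */
static uint32_t apply_submit_fences(uint32_t mask, const struct bond *bonds,
				    size_t count, const unsigned int *masters,
				    size_t nmasters)
{
	size_t i;

	for (i = 0; i < nmasters; i++)
		mask = constrain_mask(mask, bonds, count, masters[i]);

	return mask;
}
```

With a bond of master 0 to engine 1 and a bond of master 3 to engine 2, a
request waiting on submit-fences from both masters ends up with an empty
mask, which is the case the !mask check above is catching.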

(Need to go back and make it an actual skip request so that dependencies
remain correct; as it stands there is just the danger of a user deadlock
causing a GPU hang, but that's already beyond repair.)
-Chris