[Intel-gfx] [PATCH] drm/i915/gt: Prevent queuing retire workers on the virtual engine

Chris Wilson chris at chris-wilson.co.uk
Thu Feb 6 16:48:21 UTC 2020


Quoting Tvrtko Ursulin (2020-02-06 16:44:34)
> 
> On 06/02/2020 16:32, Chris Wilson wrote:
> > Virtual engines are fleeting. They carry a reference count and may be freed
> > when their last request is retired. This makes them unsuitable for the
> > task of housing engine->retire.work so assert that it is not used.
> > 
> > Tvrtko tracked down an instance where we did indeed violate this rule.
> > In virtual_submit_request, we flush a completed request directly with
> > __i915_request_submit and this causes us to queue that request on the
> > veng's breadcrumb list and signal it. Leading us down a path where we
> > should not attach the retire.
> > 
> > v2: Always select a physical engine before submitting, and so avoid
> > using the veng as a signaler.
> > 
> > Reported-by: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> > Fixes: dc93c9b69315 ("drm/i915/gt: Schedule request retirement when signaler idles")
> > Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> > Cc: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> > ---
> >   drivers/gpu/drm/i915/gt/intel_engine.h      |  1 +
> >   drivers/gpu/drm/i915/gt/intel_gt_requests.c |  3 +++
> >   drivers/gpu/drm/i915/gt/intel_lrc.c         | 21 ++++++++++++++++++---
> >   drivers/gpu/drm/i915/i915_request.c         |  2 ++
> >   4 files changed, 24 insertions(+), 3 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/gt/intel_engine.h b/drivers/gpu/drm/i915/gt/intel_engine.h
> > index b36ec1fddc3d..5b21ca5478c2 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_engine.h
> > +++ b/drivers/gpu/drm/i915/gt/intel_engine.h
> > @@ -217,6 +217,7 @@ void intel_engine_disarm_breadcrumbs(struct intel_engine_cs *engine);
> >   static inline void
> >   intel_engine_signal_breadcrumbs(struct intel_engine_cs *engine)
> >   {
> > +     GEM_BUG_ON(!engine->breadcrumbs.irq_work.func);
> >       irq_work_queue(&engine->breadcrumbs.irq_work);
> >   }
> >   
> > diff --git a/drivers/gpu/drm/i915/gt/intel_gt_requests.c b/drivers/gpu/drm/i915/gt/intel_gt_requests.c
> > index 7ef1d37970f6..8a5054f21bf8 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_gt_requests.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_gt_requests.c
> > @@ -99,6 +99,9 @@ static bool add_retire(struct intel_engine_cs *engine,
> >   void intel_engine_add_retire(struct intel_engine_cs *engine,
> >                            struct intel_timeline *tl)
> >   {
> > +     /* We don't deal well with the engine disappearing beneath us */
> > +     GEM_BUG_ON(intel_engine_is_virtual(engine));
> > +
> >       if (add_retire(engine, tl))
> >               schedule_work(&engine->retire_work);
> >   }
> > diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
> > index c196fb90c59f..639b5be56026 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
> > @@ -4883,6 +4883,22 @@ static void virtual_submission_tasklet(unsigned long data)
> >       local_irq_enable();
> >   }
> >   
> > +static void __ve_request_submit(const struct virtual_engine *ve,
> > +                             struct i915_request *rq)
> > +{
> > +     struct intel_engine_cs *engine = ve->siblings[0]; /* totally random! */
> 
> We don't preserve the execution engine in ce->inflight? No.. Will random 
> engine have any effect? Will proper waiters get signaled?

Ok, it's not totally random ;) it's the engine on which we last
executed, so it's a match wrt the previous breadcrumbs/waiters. It's a
good choice :)

> > +     /*
> > +      * Select a real engine to act as our permanent storage
> > +      * and signaler for the stale request, and prevent
> > +      * this virtual engine from leaking into the execution state.
> > +      */
> > +     spin_lock(&engine->active.lock);
> 
> Nesting phys lock under veng lock will be okay?

No. Far from it.
-Chris
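
For readers following the locking question above: one common reason a nested acquisition is unsafe is lock-order inversion, where two code paths take the same pair of locks in opposite orders. The thread does not spell out the reason, so the following is only a minimal sketch under that assumption. The ranks, the struct, and the helper names are all hypothetical and are not part of i915; the sketch further assumes that the physical engine lock is the outermost lock elsewhere in the driver.

```c
#include <assert.h>   /* for callers that want to assert on the results */
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical lock ranks: a lower rank must be taken first (outermost).
 * ASSUMPTION: physical engine lock outside, virtual engine lock inside.
 */
enum lock_rank {
	RANK_PHYS_ENGINE = 1,
	RANK_VIRTUAL_ENGINE = 2,
};

#define MAX_HELD 8

/* Per-context record of the ranks currently held, innermost last. */
struct lock_tracker {
	enum lock_rank held[MAX_HELD];
	size_t depth;
};

/*
 * Record an acquisition. Returns false, recording nothing, if the new
 * lock would nest inside a lock of equal or higher rank, i.e. if the
 * acquisition would invert the declared order.
 */
bool tracker_acquire(struct lock_tracker *t, enum lock_rank rank)
{
	if (t->depth == MAX_HELD)
		return false;
	if (t->depth > 0 && t->held[t->depth - 1] >= rank)
		return false;
	t->held[t->depth++] = rank;
	return true;
}

/* Record a release. Only the innermost held lock may be released. */
bool tracker_release(struct lock_tracker *t, enum lock_rank rank)
{
	if (t->depth == 0 || t->held[t->depth - 1] != rank)
		return false;
	t->depth--;
	return true;
}
```

Under these assumed ranks, acquiring RANK_PHYS_ENGINE and then RANK_VIRTUAL_ENGINE is accepted, whereas acquiring RANK_PHYS_ENGINE while already holding RANK_VIRTUAL_ENGINE is rejected, which is the shape of nesting asked about above. In the kernel itself, lockdep performs this kind of checking dynamically.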

