[Intel-gfx] [PATCH 04/31] drm/i915/execlists: Pull CSB reset under the timeline.lock

Tue Jun 26 11:04:43 UTC 2018

Quoting Tvrtko Ursulin (2018-06-26 11:59:05)
> 
> On 25/06/2018 10:48, Chris Wilson wrote:
> > In the following patch, we will process the CSB events under the
> > timeline.lock and not serailiased by the tasklet. This also means that we
> 
> Typo in serialised.
> 
> > will need to protect access to common variables such as
> > execlists->csb_head with the timeline.lock during reset.
> > 
> > v2: Move sync_irq to avoid deadlocks between taking timeline.lock from
> > our interrupt handler.
> > 
> > Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> > ---
> >   drivers/gpu/drm/i915/intel_lrc.c | 11 +++++------
> >   1 file changed, 5 insertions(+), 6 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> > index b5c809201c7a..2cbb293fb409 100644
> > --- a/drivers/gpu/drm/i915/intel_lrc.c
> > +++ b/drivers/gpu/drm/i915/intel_lrc.c
> > @@ -871,7 +871,6 @@ static void reset_irq(struct intel_engine_cs *engine)
> >   {
> >       /* Mark all CS interrupts as complete */
> >       smp_store_mb(engine->execlists.active, 0);
> > -     synchronize_hardirq(engine->i915->drm.irq);
> >   
> >       clear_gtiir(engine);
> >   
> > @@ -912,6 +911,8 @@ static void execlists_cancel_requests(struct intel_engine_cs *engine)
> >   
> >       /* Cancel the requests on the HW and clear the ELSP tracker. */
> >       execlists_cancel_port_requests(execlists);
> > +
> > +     synchronize_hardirq(engine->i915->drm.irq);
> 
> This is why I hate trickery. It used to be smp_store_mb then 
> synchronize_hardirq, and now it would be the opposite. I have no idea 
> what's broken before, and what's after, or if all is just pointless. 
> Hmmm... even funnier, it seems that in the current code we already have 
> synchronize_hardirq from the irqdisabled section. Should that be fixed 
> with an explicit patch first?
> 
> >       reset_irq(engine);
> >   
> >       spin_lock(&engine->timeline.lock);
> > @@ -1969,8 +1970,8 @@ static void execlists_reset(struct intel_engine_cs *engine,
> >                 engine->name, request ? request->global_seqno : 0,
> >                 intel_engine_get_seqno(engine));
> >   
> > -     /* See execlists_cancel_requests() for the irq/spinlock split. */
> > -     local_irq_save(flags);
> > +     synchronize_hardirq(engine->i915->drm.irq);
> > +     spin_lock_irqsave(&engine->timeline.lock, flags);
> 
> What is the point of synchronize_hardirq here? If it was running the 
> spinlock would wait for it, if it was pending the opposite or nothing, 
> but if we wait for it before the spinlock what says new one cannot 
> become pending between synchronize_hardirq and the spinlock?

It's ok from irqoff. We can just kill it, as it was just icing on the
cake. Or lipstick on a pig. I guess it should die in case anyone does a
git grep sychronize_hardirq...

It's the serialisation of CSB processing with the reset that is
important to make sure we don't replay stale events afterwards.
-Chris