[Intel-gfx] [PATCH] drm/i915/execlists: Use a locked clear_bit() for synchronisation with interrupt
Chris Wilson
chris at chris-wilson.co.uk
Wed Mar 21 17:07:29 UTC 2018
Quoting Chris Wilson (2018-03-21 17:05:06)
> Quoting Michel Thierry (2018-03-21 17:01:12)
> > On 3/21/2018 3:46 AM, Mika Kuoppala wrote:
> > > Chris Wilson <chris at chris-wilson.co.uk> writes:
> > >
> > >> We were relying on the uncached reads when processing the CSB to provide
> > >> ourselves with the serialisation with the interrupt handler (so we could
> > >> detect new interrupts in the middle of processing the old one). However,
> > >> in commit 767a983ab255 ("drm/i915/execlists: Read the context-status HEAD
> > >> from the HWSP") those uncached reads were eliminated (on one path at
> > >> least) and along with them our serialisation. The result is that we
> > >> would very rarely miss notification of a new interrupt and leave a
> > >> context-switch unprocessed, hanging the GPU.
> > >>
> > >> Fixes: 767a983ab255 ("drm/i915/execlists: Read the context-status HEAD from the HWSP")
> > >> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> > >> Cc: Michel Thierry <michel.thierry at intel.com>
> > >> Cc: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> > >> Cc: Mika Kuoppala <mika.kuoppala at intel.com>
> > >> ---
> > >> drivers/gpu/drm/i915/intel_lrc.c | 21 ++++++++-------------
> > >> 1 file changed, 8 insertions(+), 13 deletions(-)
> > >>
> > >> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> > >> index 53f1c009ed7b..67b6a0f658d6 100644
> > >> --- a/drivers/gpu/drm/i915/intel_lrc.c
> > >> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> > >> @@ -831,7 +831,8 @@ static void execlists_submission_tasklet(unsigned long data)
> > >> struct drm_i915_private *dev_priv = engine->i915;
> > >> bool fw = false;
> > >>
> > >> - /* We can skip acquiring intel_runtime_pm_get() here as it was taken
> > >> + /*
> > >> + * We can skip acquiring intel_runtime_pm_get() here as it was taken
> > >> * on our behalf by the request (see i915_gem_mark_busy()) and it will
> > >> * not be relinquished until the device is idle (see
> > >> * i915_gem_idle_work_handler()). As a precaution, we make sure
> > >> @@ -840,7 +841,8 @@ static void execlists_submission_tasklet(unsigned long data)
> > >> */
> > >> GEM_BUG_ON(!dev_priv->gt.awake);
> > >>
> > >> - /* Prefer doing test_and_clear_bit() as a two stage operation to avoid
> > >> + /*
> > >> + * Prefer doing test_and_clear_bit() as a two stage operation to avoid
> > >> * imposing the cost of a locked atomic transaction when submitting a
> > >> * new request (outside of the context-switch interrupt).
> > >> */
> > >> @@ -856,17 +858,10 @@ static void execlists_submission_tasklet(unsigned long data)
> > >> execlists->csb_head = -1; /* force mmio read of CSB ptrs */
> > >> }
> > >>
> > >> - /* The write will be ordered by the uncached read (itself
> > >> - * a memory barrier), so we do not need another in the form
> > >> - * of a locked instruction. The race between the interrupt
> > >> - * handler and the split test/clear is harmless as we order
> > >> - * our clear before the CSB read. If the interrupt arrived
> > >> - * first between the test and the clear, we read the updated
> > >> - * CSB and clear the bit. If the interrupt arrives as we read
> > >> - * the CSB or later (i.e. after we had cleared the bit) the bit
> > >> - * is set and we do a new loop.
> > >> - */
> > >> - __clear_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted);
> > >> + /* Clear before reading to catch new interrupts */
> > >> + clear_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted);
> > >> + smp_mb__after_atomic();
> >
> > Checkpatch wants a comment for the memory barrier... Are we being strict
> > about it? (https://patchwork.freedesktop.org/series/40359/)
>
> There's a comment for it not two lines above! Silly perl script.

Besides, it is only a simulacrum of a mb anyway. Silly perl script :)
-Chris
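
For readers following the thread, the ordering the patch relies on can be sketched in userspace C11. This is only an illustrative model, not the driver code: struct engine_model, model_init(), model_irq() and model_tasklet() are hypothetical names, a plain counter stands in for the CSB, and a sequentially consistent atomic_fetch_and() stands in for the kernel's clear_bit() followed by smp_mb__after_atomic().

```c
#include <stdatomic.h>

#define IRQ_EXECLIST_BIT 1u /* models ENGINE_IRQ_EXECLIST (hypothetical value) */

/* Hypothetical stand-in for the engine state touched by the patch. */
struct engine_model {
	atomic_uint irq_posted; /* models engine->irq_posted */
	atomic_uint csb_tail;   /* events published by the "hardware" */
	unsigned int csb_head;  /* events consumed by the tasklet */
};

static void model_init(struct engine_model *e)
{
	atomic_init(&e->irq_posted, 0);
	atomic_init(&e->csb_tail, 0);
	e->csb_head = 0;
}

/* Models the interrupt handler: publish an event, then post the bit. */
static void model_irq(struct engine_model *e)
{
	atomic_fetch_add_explicit(&e->csb_tail, 1, memory_order_relaxed);
	atomic_fetch_or_explicit(&e->irq_posted, IRQ_EXECLIST_BIT,
				 memory_order_release);
}

/* Models the tasklet; returns the number of events it processed. */
static unsigned int model_tasklet(struct engine_model *e)
{
	unsigned int processed = 0;

	/* Cheap unlocked test first, as in the two-stage test/clear. */
	while (atomic_load_explicit(&e->irq_posted, memory_order_relaxed) &
	       IRQ_EXECLIST_BIT) {
		unsigned int tail;

		/*
		 * Clear before reading to catch new interrupts: a fully
		 * ordered atomic RMW, so the read of csb_tail below cannot
		 * be reordered ahead of the clear.
		 */
		atomic_fetch_and_explicit(&e->irq_posted, ~IRQ_EXECLIST_BIT,
					  memory_order_seq_cst);

		tail = atomic_load_explicit(&e->csb_tail, memory_order_acquire);
		while (e->csb_head != tail) {
			e->csb_head++;
			processed++;
		}
		/* An interrupt arriving from here on sets the bit again. */
	}

	return processed;
}
```

If the clear were allowed to drift after the read of csb_tail, an interrupt landing in between could have its bit wiped without its event being seen, which is the missed context-switch described in the commit message.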