[Intel-gfx] [PATCH] drm/i915: Check against the signaled bit for fences/requests
Chris Wilson
chris at chris-wilson.co.uk
Wed Feb 22 17:40:34 UTC 2017
On Wed, Feb 22, 2017 at 05:30:59PM +0000, Tvrtko Ursulin wrote:
>
> On 22/02/2017 16:42, Chris Wilson wrote:
> >When dma_fence_signal() is called, it sets a flag to indicate the fenece
> >is complete. Before the dma_fence is signaled, the seqno check will
> >first be passed. However, if the request is being held by an observer
> >across a global seqno reset, it is possible for the observer to not
> >notice the seqno completion (as the HWS is now zero), but instead we
> >rely on the signaled bit. External holders of the request should be
> >using the fence interface which does first check the signaled bit - it
> >will only be an unlocked waiter that may not notice the request
> >completion.
> >
> >By that analysis, the problem is old as our unlocked waits, but this
> >patch depends upon subclassing dma-fence, so use that as the base
> >commit.
> >
> >Fixes: 04769652c8c7 ("drm/i915: Derive GEM requests from dma-fence")
> >Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> >Cc: Tvrtko Ursulin <tvrtko.ursulin at linux.intel.com>
> >Cc: <stable at vger.kernel.org> # v4.9+
> >---
> > drivers/gpu/drm/i915/i915_drv.h | 3 +++
> > 1 file changed, 3 insertions(+)
> >
> >diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> >index 994929660027..dcc8f73b26bf 100644
> >--- a/drivers/gpu/drm/i915/i915_drv.h
> >+++ b/drivers/gpu/drm/i915/i915_drv.h
> >@@ -4026,6 +4026,9 @@ __i915_request_irq_complete(struct drm_i915_gem_request *req)
> > {
> > struct intel_engine_cs *engine = req->engine;
> >
> >+ if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &req->fence.flags))
> >+ return true;
> >+
> > /* Before we do the heavier coherent read of the seqno,
> > * check the value (hopefully) in the CPU cacheline.
> > */
> >
>
> It is obviously correct in a way that it can never give a false
> positive, even though I haven't quite figured how we had false
> negatives without it. I'll leave that for a new day. Hopefully there
> are no other places in our code which could be affected in the same
> way?
For starters, we can exclude all checks that hold struct_mutex (from the
time it gets the request from the source to the
i915_gem_request_completed, they are protected by the init_global_seqno
requiring the struct_mutex to do the rollover, and by retirement
removing the state trackers). That rules out the majority.
For the unlocked accessors, we can remove all the dma-fence clients as
they already have a check against the signaled bit before checking
against the seqno.
The only place I found that was susceptible was the unlocked waiter....
And actually that is protected by init_global_seqno clearing the waiters
before doing the rollover.
Okay, no need to panic. I'll mollify the changelog and drop stable - I
still want the patch for the later changes.
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
More information about the Intel-gfx
mailing list