[Intel-gfx] [PATCH 2/5] drm/i915/selftests: Flush the active callbacks

Chris Wilson chris at chris-wilson.co.uk
Fri Nov 22 13:09:00 UTC 2019


Quoting Tvrtko Ursulin (2019-11-22 13:01:56)
> 
> On 22/11/2019 11:21, Chris Wilson wrote:
> > Before checking the current i915_active state for the asynchronous work
> > we submitted, flush any ongoing callback. This ensures that our sampling
> > is robust and does not sporadically fail due to bad timing as the work
> > is running on another cpu.
> > 
> > Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> > ---
> >   drivers/gpu/drm/i915/gt/selftest_context.c | 19 +++++++++++++------
> >   1 file changed, 13 insertions(+), 6 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/gt/selftest_context.c b/drivers/gpu/drm/i915/gt/selftest_context.c
> > index 3586af636304..939798338242 100644
> > --- a/drivers/gpu/drm/i915/gt/selftest_context.c
> > +++ b/drivers/gpu/drm/i915/gt/selftest_context.c
> > @@ -48,20 +48,25 @@ static int context_sync(struct intel_context *ce)
> >   
> >       mutex_lock(&tl->mutex);
> >       do {
> > -             struct dma_fence *fence;
> > +             struct i915_request *rq;
> >               long timeout;
> >   
> > -             fence = i915_active_fence_get(&tl->last_request);
> > -             if (!fence)
> > +             if (list_empty(&tl->requests))
> >                       break;
> >   
> > -             timeout = dma_fence_wait_timeout(fence, false, HZ / 10);
> > +             rq = list_last_entry(&tl->requests, typeof(*rq), link);
> > +             i915_request_get(rq);
> > +
> > +             timeout = i915_request_wait(rq, 0, HZ / 10);
> >               if (timeout < 0)
> >                       err = timeout;
> >               else
> > -                     i915_request_retire_upto(to_request(fence));
> > +                     i915_request_retire_upto(rq);
> >   
> > -             dma_fence_put(fence);
> > +             spin_lock_irq(&rq->lock);
> > +             spin_unlock_irq(&rq->lock);
> 
> Ugh.. this at least needs a comment.
> 
> What is the actual race?

We complete the sync before the interrupt handler's irq work has
finished executing the callback to mark the barrier as completed.
So when we look at whether the engine is parked or not, we see the state
before the request has finished processing and find it still active.
-Chris


More information about the Intel-gfx mailing list