[Intel-gfx] [PATCH 2/5] drm/i915/selftests: Flush the active callbacks

Tvrtko Ursulin tvrtko.ursulin at linux.intel.com
Fri Nov 22 13:43:17 UTC 2019


On 22/11/2019 13:09, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2019-11-22 13:01:56)
>>
>> On 22/11/2019 11:21, Chris Wilson wrote:
>>> Before checking the current i915_active state for the asynchronous work
>>> we submitted, flush any ongoing callback. This ensures that our sampling
>>> is robust and does not sporadically fail due to bad timing as the work
>>> is running on another cpu.
>>>
>>> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
>>> ---
>>>    drivers/gpu/drm/i915/gt/selftest_context.c | 19 +++++++++++++------
>>>    1 file changed, 13 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/gt/selftest_context.c b/drivers/gpu/drm/i915/gt/selftest_context.c
>>> index 3586af636304..939798338242 100644
>>> --- a/drivers/gpu/drm/i915/gt/selftest_context.c
>>> +++ b/drivers/gpu/drm/i915/gt/selftest_context.c
>>> @@ -48,20 +48,25 @@ static int context_sync(struct intel_context *ce)
>>>    
>>>        mutex_lock(&tl->mutex);
>>>        do {
>>> -             struct dma_fence *fence;
>>> +             struct i915_request *rq;
>>>                long timeout;
>>>    
>>> -             fence = i915_active_fence_get(&tl->last_request);
>>> -             if (!fence)
>>> +             if (list_empty(&tl->requests))
>>>                        break;
>>>    
>>> -             timeout = dma_fence_wait_timeout(fence, false, HZ / 10);
>>> +             rq = list_last_entry(&tl->requests, typeof(*rq), link);
>>> +             i915_request_get(rq);
>>> +
>>> +             timeout = i915_request_wait(rq, 0, HZ / 10);
>>>                if (timeout < 0)
>>>                        err = timeout;
>>>                else
>>> -                     i915_request_retire_upto(to_request(fence));
>>> +                     i915_request_retire_upto(rq);
>>>    
>>> -             dma_fence_put(fence);
>>> +             spin_lock_irq(&rq->lock);
>>> +             spin_unlock_irq(&rq->lock);
>>
>> Ugh.. this at least needs a comment.
>>
>> What is the actual race?
> 
> We complete the sync before the interrupt handler's irq work has
> finished executing the callback to mark the barrier as completed.
> So when we look at whether the engine is parked or not, we see the state
> before the request has finished processing and find it still active.

You mean i915_active_fence_get returns not really last request? How can 
that be? Or the key is just intel_engine_pm_flush?

Regards,

Tvrtko


More information about the Intel-gfx mailing list