[Intel-gfx] [PATCH 1/2] drm/i915: Recheck breadcrumb seqno after an interrupt

Thu Feb 16 12:47:41 UTC 2017

On 16/02/2017 11:21, Chris Wilson wrote:
> On Thu, Feb 16, 2017 at 10:45:56AM +0000, Tvrtko Ursulin wrote:
>>
>> On 16/02/2017 10:36, Chris Wilson wrote:
>>> On Thu, Feb 16, 2017 at 10:21:17AM +0000, Tvrtko Ursulin wrote:
>>>>
>>>> On 16/02/2017 09:29, Chris Wilson wrote:
>>>>> If an interrupt arrives whilst we are performing the irq-seqno barrier,
>>>>> recheck the seqno again before returning.
>>>>>
>>>>> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
>>>>> Cc: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
>>>>> ---
>>>>> drivers/gpu/drm/i915/i915_drv.h | 6 ++++--
>>>>> 1 file changed, 4 insertions(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
>>>>> index c1b400f1ede4..ecb8b414bdd2 100644
>>>>> --- a/drivers/gpu/drm/i915/i915_drv.h
>>>>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>>>>> @@ -4102,6 +4102,9 @@ __i915_request_irq_complete(const struct drm_i915_gem_request *req)
>>>>> 	if (__i915_gem_request_completed(req, seqno))
>>>>> 		return true;
>>>>>
>>>>> +	if (!engine->irq_seqno_barrier)
>>>>> +		return false;
>>>>> +
>>>>> 	/* Ensure our read of the seqno is coherent so that we
>>>>> 	 * do not "miss an interrupt" (i.e. if this is the last
>>>>> 	 * request and the seqno write from the GPU is not visible
>>>>> @@ -4113,8 +4116,7 @@ __i915_request_irq_complete(const struct drm_i915_gem_request *req)
>>>>> 	 * but it is easier and safer to do it every time the waiter
>>>>> 	 * is woken.
>>>>> 	 */
>>>>> -	if (engine->irq_seqno_barrier &&
>>>>> -	    test_and_clear_bit(ENGINE_IRQ_BREADCRUMB, &engine->irq_posted)) {
>>>>> +	while (test_and_clear_bit(ENGINE_IRQ_BREADCRUMB, &engine->irq_posted)) {
>>>>> 		unsigned long flags;
>>>>>
>>>>> 		/* The ordering of irq_posted versus applying the barrier
>>>>>
>>>>
>>>> Hmmm.. if this helps it feels that there is a race somewhere.
>>>> Because we have processed one interrupt, but it wasn't for us.
>>>
>>> There may be many interrupts between now and the one we actually want.
>>> The caller of this function is (or was at the time) has the first seqno
>>> to be signaled, it is just not necessarily the next interrupt.
>>>
>>>> That
>>>> means there will be another one coming. Why handle that at this
>>>> level? Why check twice and not three, four, five times? :)
>>>
>>> Because they are all for us! This only saves a trip through schedule. If
>>> we don't loop here, we just loop at the next level.
>>
>> Yes, which looks more correct to me. Because the caller then do what
>> they want. Signaller just sleeps and i915_wait_request potentially
>> busy waits for a bit.
>
> A persuasive counter argument would be:
>
> Interrupts can be arriving frequently for an indefinite period before we
> see our seqno that breaks the loop. We should not allow this to spin
> indefinitely and at the very least need to be checking for pending
> interrupts or timeslice exhaustion. Both of which are handled in the
> callers.

I got lost in this branch of the thread. Are you making an argument for 
me doubting the patch, or for you supporting the patch? :)

Regards,

Tvrtko