[Intel-gfx] [PATCH] drm/i915: fix remaining_timeout in intel_gt_retire_requests_timeout

Das, Nirmoy nirmoy.das at linux.intel.com
Mon Mar 28 09:16:06 UTC 2022


On 3/25/2022 9:33 PM, Ceraolo Spurio, Daniele wrote:
>
>
> On 3/25/2022 11:37 AM, Das, Nirmoy wrote:
>>
>> On 3/25/2022 6:58 PM, Daniele Ceraolo Spurio wrote:
>>> In intel_gt_wait_for_idle, we use the remaining timeout returned from
>>> intel_gt_retire_requests_timeout to wait on the GuC being idle. However,
>>> the returned variable can have a negative value if something goes wrong
>>> during the wait, leading to us hitting a GEM_BUG_ON in the GuC wait
>>> function.
>>> To fix this, make sure to only return the timeout if it is positive.
>>>
>>> Fixes: b97060a99b01b ("drm/i915/guc: Update intel_gt_wait_for_idle to work with GuC")
>>> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio at intel.com>
>>> Cc: Matthew Brost <matthew.brost at intel.com>
>>> Cc: John Harrison <john.c.harrison at intel.com>
>>> ---
>>>   drivers/gpu/drm/i915/gt/intel_gt_requests.c | 2 +-
>>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/gt/intel_gt_requests.c b/drivers/gpu/drm/i915/gt/intel_gt_requests.c
>>> index edb881d756309..ef70c209976d8 100644
>>> --- a/drivers/gpu/drm/i915/gt/intel_gt_requests.c
>>> +++ b/drivers/gpu/drm/i915/gt/intel_gt_requests.c
>>> @@ -197,7 +197,7 @@ out_active: spin_lock(&timelines->lock);
>>>           active_count++;
>>>         if (remaining_timeout)
>>> -        *remaining_timeout = timeout;
>>> +        *remaining_timeout = timeout > 0 ? timeout : 0;
>>
>>
>> Should the last flush_submission() be
>> "if (timeout > 0 && flush_submission(gt, timeout))"?
>
> I considered it, but flush_submission only checks for timeout != 0 so 
> it won't accidentally make use of a negative value thinking it's 
> positive. I don't know if the flush is purposely done even if timeout 
> is negative or if that's a mistake, but that code has been there long 
> before we modified the function to return the remaining timeout and 
> never seems to have caused issues, so I decided not to change it.


Yes, we need to clarify whether we really need the final flush if the
timeout is negative.

But this patch is Acked-by: Nirmoy Das <nirmoy.das at intel.com>
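
For reference, the two behaviours discussed above can be modelled in a small
standalone sketch. This is not the i915 code: the helper names below
(clamp_remaining, flush_if_nonzero, flush_if_positive) are hypothetical and
only mirror the conditions mentioned in this thread, assuming the timeout is
a signed long.

```c
#include <stdbool.h>

/*
 * Mirrors the expression added by the patch: a negative timeout is
 * reported to the caller as 0 instead of being passed through.
 * Hypothetical helper, not a function in the driver.
 */
long clamp_remaining(long timeout)
{
	return timeout > 0 ? timeout : 0;
}

/*
 * Models the check described in the thread for the final flush:
 * any non-zero timeout, including a negative one, lets the flush run.
 * Hypothetical helper.
 */
bool flush_if_nonzero(long timeout)
{
	return timeout != 0;
}

/*
 * Models the stricter condition suggested in review: flush only
 * when there is still time left. Hypothetical helper.
 */
bool flush_if_positive(long timeout)
{
	return timeout > 0;
}
```

With a negative timeout such as -5, clamp_remaining() yields 0, so the GuC
wait never sees a negative value, while flush_if_nonzero() still allows the
flush and flush_if_positive() would skip it. That difference is the open
question about the final flush.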

Nirmoy

>
> Daniele
>
>>
>>
>> Nirmoy
>>
>>>         return active_count ? timeout : 0;
>>>   }
>

