[Intel-gfx] [PATCH] drm/i915: fix remaining_timeout in intel_gt_retire_requests_timeout
Das, Nirmoy
nirmoy.das at linux.intel.com
Mon Mar 28 09:16:06 UTC 2022
On 3/25/2022 9:33 PM, Ceraolo Spurio, Daniele wrote:
>
>
> On 3/25/2022 11:37 AM, Das, Nirmoy wrote:
>>
>> On 3/25/2022 6:58 PM, Daniele Ceraolo Spurio wrote:
>>> In intel_gt_wait_for_idle, we use the remaining timeout returned from
>>> intel_gt_retire_requests_timeout to wait on the GuC being idle.
>>> However,
>>> the returned variable can have a negative value if something goes wrong
>>> during the wait, leading to us hitting a GEM_BUG_ON in the GuC wait
>>> function.
>>> To fix this, make sure to only return the timeout if it is positive.
>>>
>>> Fixes: b97060a99b01b ("drm/i915/guc: Update intel_gt_wait_for_idle
>>> to work with GuC")
>>> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio at intel.com>
>>> Cc: Matthew Brost <matthew.brost at intel.com>
>>> Cc: John Harrison <john.c.harrison at intel.com>
>>> ---
>>> drivers/gpu/drm/i915/gt/intel_gt_requests.c | 2 +-
>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/gt/intel_gt_requests.c
>>> b/drivers/gpu/drm/i915/gt/intel_gt_requests.c
>>> index edb881d756309..ef70c209976d8 100644
>>> --- a/drivers/gpu/drm/i915/gt/intel_gt_requests.c
>>> +++ b/drivers/gpu/drm/i915/gt/intel_gt_requests.c
>>> @@ -197,7 +197,7 @@ out_active: spin_lock(&timelines->lock);
>>> active_count++;
>>> if (remaining_timeout)
>>> - *remaining_timeout = timeout;
>>> + *remaining_timeout = timeout > 0 ? timeout : 0;
>>
>>
>> Should the last flush_submission() be "if ( timeout > 0
>> &&flush_submission(gt, timeout))" ?
>
> I considered it, but flush_submission only checks for timeout != 0 so
> it won't accidentally make use of a negative value thinking it's
> positive. I don't know if the flush is purposely done even if timeout
> is negative or if that's a mistake, but that code has been there long
> before we modified the function to return the remaining timeout and
> never seems to have caused issues, so I decided not to change it.
Yes, we need clarify if we really need the final flush if the timeout is
negative.
But this patch is Acked-by: Nirmoy Das <nirmoy.das at intel.com>
Nirmoy
>
> Daniele
>
>>
>>
>> Nirmoy
>>
>>> return active_count ? timeout : 0;
>>> }
>
More information about the Intel-gfx
mailing list