[Intel-gfx] [PATCH v3 1/2] drm/i915: Fix negative value passed as remaining time

Janusz Krzysztofik janusz.krzysztofik at linux.intel.com
Wed Nov 23 11:29:19 UTC 2022


On Tuesday, 22 November 2022 11:41:29 CET Tvrtko Ursulin wrote:
> 
> On 21/11/2022 23:19, Janusz Krzysztofik wrote:
> > Hi Andrzej,
> > 
> > Thanks for providing your R-b.
> > 
> > On Monday, 21 November 2022 18:40:51 CET Andrzej Hajda wrote:
> >> On 21.11.2022 15:56, Janusz Krzysztofik wrote:
> >>> Commit b97060a99b01 ("drm/i915/guc: Update intel_gt_wait_for_idle to work
> >>> with GuC") extended the API of intel_gt_retire_requests_timeout() with an
> >>> extra argument 'remaining_timeout', intended for passing back unconsumed
> >>> portion of requested timeout when 0 (success) is returned.  However, when
> >>> request retirement happens to succeed despite an error returned by a call
> >>> to dma_fence_wait_timeout(), that error code (a negative value) is passed
> >>> back instead of remaining time.  If we then pass that negative value
> >>> forward as requested timeout to intel_uc_wait_for_idle(), an explicit BUG
> >>> will be triggered.
> 
> Right, AFAICT a GEM_BUG_ON in debug builds, but in production builds 
> negative timeout will get passed along all the way to schedule_timeout 
> where error and call trace will be dumped. So fix appears warranted indeed.
> 
> >>> If request retirement succeeds but an error code is passed back via
> >>> remaininig_timeout, we may have no clue on how much of the initial timeout
> >>> might have been left for spending it on waiting for GuC to become idle.
> >>> OTOH, since all pending requests have been successfully retired, that
> >>> error code has been already ignored by intel_gt_retire_requests_timeout(),
> >>> then we shouldn't fail.
> >>>
> >>> Assume no more time has been left on error and pass 0 timeout value to
> >>> intel_uc_wait_for_idle() to give it a chance to return success if GuC is
> >>> already idle.
> >>>
> >>> v3: Don't fail on any error passed back via remaining_timeout.
> >>>
> >>> v2: Fix the issue on the caller side, not the provider.
> >>>
> >>> Fixes: b97060a99b01 ("drm/i915/guc: Update intel_gt_wait_for_idle to work
> > with GuC")
> >>> Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik at linux.intel.com>
> >>> Cc: stable at vger.kernel.org # v5.15+
> >>
> >> Reviewed-by: Andrzej Hajda <andrzej.hajda at intel.com>
> > 
> > While still open for comments from others, I'm now looking for potential
> > committer.
> 
> Both patches are considered good to go?

Yes, I hope so.  The actual diff of 2/2 v3 has received R-b from Andrzej still 
under discussion of v2, then I decided to add that tag to v3.

Andrzej, can you please respond to 2/2 v3 and confirm your R-b applies?

Thanks,
Janusz

> 
> Regards,
> 
> Tvrtko
> 
> > 
> > Thanks,
> > Janusz
> > 
> > 
> >>
> >> Regards
> >> Andrzej
> >>
> >>> ---
> >>>    drivers/gpu/drm/i915/gt/intel_gt.c | 9 +++++++--
> >>>    1 file changed, 7 insertions(+), 2 deletions(-)
> >>>
> >>> diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/
> > intel_gt.c
> >>> index b5ad9caa55372..7ef0edb2e37cd 100644
> >>> --- a/drivers/gpu/drm/i915/gt/intel_gt.c
> >>> +++ b/drivers/gpu/drm/i915/gt/intel_gt.c
> >>> @@ -677,8 +677,13 @@ int intel_gt_wait_for_idle(struct intel_gt *gt, long
> > timeout)
> >>>    			return -EINTR;
> >>>    	}
> >>>    
> >>> -	return timeout ? timeout : intel_uc_wait_for_idle(&gt->uc,
> >>> -							
> > remaining_timeout);
> >>> +	if (timeout)
> >>> +		return timeout;
> >>> +
> >>> +	if (remaining_timeout < 0)
> >>> +		remaining_timeout = 0;
> >>> +
> >>> +	return intel_uc_wait_for_idle(&gt->uc, remaining_timeout);
> >>>    }
> >>>    
> >>>    int intel_gt_init(struct intel_gt *gt)
> >>
> >>
> > 
> > 
> > 
> > 
> 






More information about the Intel-gfx mailing list