[Intel-gfx] [PATCH v3 1/2] drm/i915: Fix negative value passed as remaining time
Janusz Krzysztofik
janusz.krzysztofik at linux.intel.com
Wed Nov 23 11:29:19 UTC 2022
On Tuesday, 22 November 2022 11:41:29 CET Tvrtko Ursulin wrote:
>
> On 21/11/2022 23:19, Janusz Krzysztofik wrote:
> > Hi Andrzej,
> >
> > Thanks for providing your R-b.
> >
> > On Monday, 21 November 2022 18:40:51 CET Andrzej Hajda wrote:
> >> On 21.11.2022 15:56, Janusz Krzysztofik wrote:
> >>> Commit b97060a99b01 ("drm/i915/guc: Update intel_gt_wait_for_idle to work
> >>> with GuC") extended the API of intel_gt_retire_requests_timeout() with an
> >>> extra argument 'remaining_timeout', intended for passing back unconsumed
> >>> portion of requested timeout when 0 (success) is returned. However, when
> >>> request retirement happens to succeed despite an error returned by a call
> >>> to dma_fence_wait_timeout(), that error code (a negative value) is passed
> >>> back instead of remaining time. If we then pass that negative value
> >>> forward as requested timeout to intel_uc_wait_for_idle(), an explicit BUG
> >>> will be triggered.
>
> Right, AFAICT a GEM_BUG_ON in debug builds, but in production builds
> negative timeout will get passed along all the way to schedule_timeout
> where error and call trace will be dumped. So fix appears warranted indeed.
>
> >>> If request retirement succeeds but an error code is passed back via
> >>> remaininig_timeout, we may have no clue on how much of the initial timeout
> >>> might have been left for spending it on waiting for GuC to become idle.
> >>> OTOH, since all pending requests have been successfully retired, that
> >>> error code has been already ignored by intel_gt_retire_requests_timeout(),
> >>> then we shouldn't fail.
> >>>
> >>> Assume no more time has been left on error and pass 0 timeout value to
> >>> intel_uc_wait_for_idle() to give it a chance to return success if GuC is
> >>> already idle.
> >>>
> >>> v3: Don't fail on any error passed back via remaining_timeout.
> >>>
> >>> v2: Fix the issue on the caller side, not the provider.
> >>>
> >>> Fixes: b97060a99b01 ("drm/i915/guc: Update intel_gt_wait_for_idle to work
> > with GuC")
> >>> Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik at linux.intel.com>
> >>> Cc: stable at vger.kernel.org # v5.15+
> >>
> >> Reviewed-by: Andrzej Hajda <andrzej.hajda at intel.com>
> >
> > While still open for comments from others, I'm now looking for potential
> > committer.
>
> Both patches are considered good to go?
Yes, I hope so. The actual diff of 2/2 v3 has received R-b from Andrzej still
under discussion of v2, then I decided to add that tag to v3.
Andrzej, can you please respond to 2/2 v3 and confirm your R-b applies?
Thanks,
Janusz
>
> Regards,
>
> Tvrtko
>
> >
> > Thanks,
> > Janusz
> >
> >
> >>
> >> Regards
> >> Andrzej
> >>
> >>> ---
> >>> drivers/gpu/drm/i915/gt/intel_gt.c | 9 +++++++--
> >>> 1 file changed, 7 insertions(+), 2 deletions(-)
> >>>
> >>> diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/
> > intel_gt.c
> >>> index b5ad9caa55372..7ef0edb2e37cd 100644
> >>> --- a/drivers/gpu/drm/i915/gt/intel_gt.c
> >>> +++ b/drivers/gpu/drm/i915/gt/intel_gt.c
> >>> @@ -677,8 +677,13 @@ int intel_gt_wait_for_idle(struct intel_gt *gt, long
> > timeout)
> >>> return -EINTR;
> >>> }
> >>>
> >>> - return timeout ? timeout : intel_uc_wait_for_idle(>->uc,
> >>> -
> > remaining_timeout);
> >>> + if (timeout)
> >>> + return timeout;
> >>> +
> >>> + if (remaining_timeout < 0)
> >>> + remaining_timeout = 0;
> >>> +
> >>> + return intel_uc_wait_for_idle(>->uc, remaining_timeout);
> >>> }
> >>>
> >>> int intel_gt_init(struct intel_gt *gt)
> >>
> >>
> >
> >
> >
> >
>
More information about the dri-devel
mailing list