[Intel-gfx] [PATCH] drm/i915/guc: clear stalled request after a reset

Tvrtko Ursulin tvrtko.ursulin at linux.intel.com
Fri Aug 12 07:29:11 UTC 2022


On 11/08/2022 22:08, Daniele Ceraolo Spurio wrote:
> If the GuC CTs are full and we need to stall the request submission
> while waiting for space, we save the stalled request and where the stall
> occurred; when the CTs have space again we pick up the request submission
> from where we left off.

How serious is it? Statement always was CT buffers can never get full 
outside the pathological IGT test cases. So I am wondering if this is in 
the category of fix for correctness or actually the CT buffers can get 
full in normal use so it is imperative to fix.

Regards,

Tvrtko

> If a full GT reset occurs, the state of all contexts is cleared and all
> non-guilty requests are unsubmitted, therefore we need to restart the
> stalled request submission from scratch. To make sure that we do so,
> clear the saved request after a reset.
> 
> Fixes note: the patch that introduced the bug is in 5.15, but no
> officially supported platform had GuC submission enabled by default
> in that kernel, so the backport to that particular version (and only
> that one) can potentially be skipped.
> 
> Fixes: 925dc1cf58ed ("drm/i915/guc: Implement GuC submission tasklet")
> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio at intel.com>
> Cc: Matthew Brost <matthew.brost at intel.com>
> Cc: John Harrison <john.c.harrison at intel.com>
> Cc: <stable at vger.kernel.org> # v5.15+
> ---
>   drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c | 7 +++++++
>   1 file changed, 7 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> index 0d17da77e787..0d56b615bf78 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> @@ -4002,6 +4002,13 @@ static inline void guc_init_lrc_mapping(struct intel_guc *guc)
>   	/* make sure all descriptors are clean... */
>   	xa_destroy(&guc->context_lookup);
>   
> +	/*
> +	 * A reset might have occurred while we had a pending stalled request,
> +	 * so make sure we clean that up.
> +	 */
> +	guc->stalled_request = NULL;
> +	guc->submission_stall_reason = STALL_NONE;
> +
>   	/*
>   	 * Some contexts might have been pinned before we enabled GuC
>   	 * submission, so we need to add them to the GuC bookeeping.


More information about the Intel-gfx mailing list