[Intel-gfx] [PATCH] drm/i915/gt: Always reset the engine, even if inactive, on execlists failure

Mika Kuoppala mika.kuoppala at linux.intel.com
Mon Jul 13 09:34:17 UTC 2020


Chris Wilson <chris at chris-wilson.co.uk> writes:

> If something has gone awry with the CSB processing, we need to pause,
> unwind and restart the request submission and event processing. However,
> currently we skip the engine reset if we raise an error but discover no
> active context, in the mistaken belief that it was merely a glitch in
> the matrix. The glitches are real enough, and we do need to unwind even
> if the engine appears idle (as it has gone permanently idle!) The
> simplest way to unwind and recover is simply do the engine reset, which
> should be very fast and _safe_ as nothing is active.
>
> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin at intel.com>

Reviewed-by: Mika Kuoppala <mika.kuoppala at linux.intel.com>

> ---
>  drivers/gpu/drm/i915/gt/intel_lrc.c | 15 ++++++---------
>  1 file changed, 6 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
> index cd4262cc96e2..3ea05a86dc95 100644
> --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
> @@ -3029,12 +3029,12 @@ static u32 active_ccid(struct intel_engine_cs *engine)
>  	return ENGINE_READ_FW(engine, RING_EXECLIST_STATUS_HI);
>  }
>  
> -static bool execlists_capture(struct intel_engine_cs *engine)
> +static void execlists_capture(struct intel_engine_cs *engine)
>  {
>  	struct execlists_capture *cap;
>  
>  	if (!IS_ENABLED(CONFIG_DRM_I915_CAPTURE_ERROR))
> -		return true;
> +		return;
>  
>  	/*
>  	 * We need to _quickly_ capture the engine state before we reset.
> @@ -3043,7 +3043,7 @@ static bool execlists_capture(struct intel_engine_cs *engine)
>  	 */
>  	cap = capture_regs(engine);
>  	if (!cap)
> -		return true;
> +		return;
>  
>  	spin_lock_irq(&engine->active.lock);
>  	cap->rq = active_context(engine, active_ccid(engine));
> @@ -3080,14 +3080,13 @@ static bool execlists_capture(struct intel_engine_cs *engine)
>  
>  	INIT_WORK(&cap->work, execlists_capture_work);
>  	schedule_work(&cap->work);
> -	return true;
> +	return;
>  
>  err_rq:
>  	i915_request_put(cap->rq);
>  err_free:
>  	i915_gpu_coredump_put(cap->error);
>  	kfree(cap);
> -	return false;
>  }
>  
>  static void execlists_reset(struct intel_engine_cs *engine, const char *msg)
> @@ -3107,10 +3106,8 @@ static void execlists_reset(struct intel_engine_cs *engine, const char *msg)
>  	tasklet_disable_nosync(&engine->execlists.tasklet);
>  
>  	ring_set_paused(engine, 1); /* Freeze the current request in place */
> -	if (execlists_capture(engine))
> -		intel_engine_reset(engine, msg);
> -	else
> -		ring_set_paused(engine, 0);
> +	execlists_capture(engine);
> +	intel_engine_reset(engine, msg);
>  
>  	tasklet_enable(&engine->execlists.tasklet);
>  	clear_and_wake_up_bit(bit, lock);
> -- 
> 2.20.1
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx


More information about the Intel-gfx mailing list