[Intel-gfx] [PATCH] drm/i915: Try harder to reset the gen8+ engines

Mika Kuoppala mika.kuoppala at linux.intel.com
Tue Sep 6 11:11:47 UTC 2016


Resending my r-b...

Chris Wilson <chris at chris-wilson.co.uk> writes:

> If at first we don't succeed, try again.
>
> Running the reset and recovery routines in a loop ends in a "reset
> request timeout" with a mtbf of an hour on Braswell. This is eerily
> similar to the unrecoverable reset condition that first prompted us to
> use the reset-request mechanism in commit 7fd2d26921d1 ("drm/i915: Reset
> request handling for gen8+"). Repeating the reset request makes the
> failure much harder to reproduce (but there is no reason to believe that
> it is more than mere paper over a timing or other issue).
>
> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> Cc: Mika Kuoppala <mika.kuoppala at intel.com>
> Cc: Arun Siluvery <arun.siluvery at linux.intel.com>
> Cc: Daniel Vetter <daniel.vetter at ffwll.ch>
> Cc: stable at vger.kernel.org
> ---
>  drivers/gpu/drm/i915/intel_uncore.c | 24 +++++++++++++-----------
>  1 file changed, 13 insertions(+), 11 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
> index e9f68cd56e32..1be8ced03ba5 100644
> --- a/drivers/gpu/drm/i915/intel_uncore.c
> +++ b/drivers/gpu/drm/i915/intel_uncore.c
> @@ -1688,20 +1688,22 @@ int intel_wait_for_register(struct drm_i915_private *dev_priv,
>  static int gen8_request_engine_reset(struct intel_engine_cs *engine)
>  {
>  	struct drm_i915_private *dev_priv = engine->i915;
> -	int ret;
> +	int loop = 3;
>

Call this "retries"?

> -	I915_WRITE_FW(RING_RESET_CTL(engine->mmio_base),
> -		      _MASKED_BIT_ENABLE(RESET_CTL_REQUEST_RESET));
> +	do {
> +		I915_WRITE_FW(RING_RESET_CTL(engine->mmio_base),
> +			      _MASKED_BIT_ENABLE(RESET_CTL_REQUEST_RESET));
>  
> -	ret = intel_wait_for_register_fw(dev_priv,
> -					 RING_RESET_CTL(engine->mmio_base),
> -					 RESET_CTL_READY_TO_RESET,
> -					 RESET_CTL_READY_TO_RESET,
> -					 700);
> -	if (ret)
> -		DRM_ERROR("%s: reset request timeout\n", engine->name);
> +		if (!intel_wait_for_register_fw(dev_priv,
> +						RING_RESET_CTL(engine->mmio_base),
> +						RESET_CTL_READY_TO_RESET,
> +						RESET_CTL_READY_TO_RESET,
> +						700))


Did you check whether the failure was the write not sticking or the
readiness not being signalled?

With gen8, we might get away with just resetting regardless of the
ready state. That needs some experimenting/testing first, though.
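
To make the question concrete, here is a minimal user-space sketch of
how the two failure modes could be told apart inside a bounded retry
loop. It is a simulation only: the sim_* names, the bit layout and the
poll-count "timeout" are all hypothetical and do not reflect the real
RING_RESET_CTL encoding or intel_wait_for_register_fw(). Whether the
request bit can be read back like this on real hardware is an
assumption that would need checking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit layout, for illustration only. */
#define SIM_REQUEST_RESET  (1u << 0)
#define SIM_READY_TO_RESET (1u << 1)

enum sim_reset_result {
	SIM_RESET_OK = 0,
	SIM_RESET_WRITE_LOST, /* request bit never read back as set */
	SIM_RESET_NOT_READY,  /* request bit set, ready bit never seen */
};

/* Simulated engine: a fake register plus knobs to inject failures. */
struct sim_engine {
	uint32_t reset_ctl;
	bool drop_writes;      /* model "the write didn't stick" */
	int polls_until_ready; /* reads needed before ready; < 0 = never */
};

static void sim_write_request(struct sim_engine *e)
{
	if (!e->drop_writes)
		e->reset_ctl |= SIM_REQUEST_RESET;
}

/* Each read advances the simulated hardware towards "ready". */
static uint32_t sim_read(struct sim_engine *e)
{
	if ((e->reset_ctl & SIM_REQUEST_RESET) && e->polls_until_ready >= 0) {
		if (e->polls_until_ready == 0)
			e->reset_ctl |= SIM_READY_TO_RESET;
		else
			e->polls_until_ready--;
	}
	return e->reset_ctl;
}

/* Retry the request, remembering why the last attempt failed. */
static enum sim_reset_result
sim_request_reset(struct sim_engine *e, int retries, int max_polls)
{
	enum sim_reset_result last = SIM_RESET_WRITE_LOST;

	assert(retries > 0 && max_polls > 0);

	do {
		sim_write_request(e);

		/* Case 1: the request bit itself did not land. */
		if (!(sim_read(e) & SIM_REQUEST_RESET)) {
			last = SIM_RESET_WRITE_LOST;
			continue; /* jumps to the --retries check */
		}

		/* Case 2: request landed, wait for readiness. */
		last = SIM_RESET_NOT_READY;
		for (int i = 0; i < max_polls; i++)
			if (sim_read(e) & SIM_READY_TO_RESET)
				return SIM_RESET_OK;
	} while (--retries);

	return last;
}
```

Calling sim_request_reset() on an engine with drop_writes set yields
SIM_RESET_WRITE_LOST, while one with polls_until_ready < 0 yields
SIM_RESET_NOT_READY, so a final error message could report which of
the two was seen last.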

Reviewed-by: Mika Kuoppala <mika.kuoppala at intel.com>

> +			return 0;
> +	} while (--loop);
>  
> -	return ret;
> +	DRM_ERROR("%s: reset request timeout\n", engine->name);
> +	return -EIO;
>  }
>  
>  static void gen8_unrequest_engine_reset(struct intel_engine_cs *engine)
> -- 
> 2.9.3

