[Intel-gfx] [PATCH v2] drm/i915: Skip HW reinitialisation on resume if still wedged

Mika Kuoppala mika.kuoppala at linux.intel.com
Mon Oct 16 14:24:33 UTC 2017


Chris Wilson <chris at chris-wilson.co.uk> writes:

> If we fail to recover the HW state upon resume (i.e. our attempt to
> clear the wedged bit and reset during i915_gem_sanitize() fails), then
> skip the HW restart inside i915_gem_init_hw(). We will ultimately do the
> HW restart when successfully unwedging and resetting the HW later,
> but attempting to restore a wedged device upon resume is risky as the HW
> is in an unknown state.
>
> v2: Suppress the error message when detecting the already wedged HW.
>
> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> Cc: Mika Kuoppala <mika.kuoppala at linux.intel.com>
> ---
>  drivers/gpu/drm/i915/i915_gem.c | 10 ++++++++--
>  1 file changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index d9d39b309ce8..449f8c3788b1 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -4835,6 +4835,10 @@ int i915_gem_init_hw(struct drm_i915_private *dev_priv)
>  	init_unused_rings(dev_priv);
>  
>  	BUG_ON(!dev_priv->kernel_context);
> +	if (i915_terminally_wedged(&dev_priv->gpu_error)) {
> +		ret = -EIO;
> +		goto out;
> +	}
>

Some HW initialization has already been done before this point.
Is there a reason not to move this check to right before acquiring
forcewake?

-Mika


>  	ret = i915_ppgtt_init_hw(dev_priv);
>  	if (ret) {
> @@ -4933,8 +4937,10 @@ int i915_gem_init(struct drm_i915_private *dev_priv)
>  		 * wedged. But we only want to do this where the GPU is angry,
>  		 * for all other failure, such as an allocation failure, bail.
>  		 */
> -		DRM_ERROR("Failed to initialize GPU, declaring it wedged\n");
> -		i915_gem_set_wedged(dev_priv);
> +		if (!i915_terminally_wedged(&dev_priv->gpu_error)) {
> +			DRM_ERROR("Failed to initialize GPU, declaring it wedged\n");
> +			i915_gem_set_wedged(dev_priv);
> +		}
>  		ret = 0;
>  	}
>  
> -- 
> 2.15.0.rc0
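
For readers following the control flow, here is a minimal standalone sketch of what the two hunks do together. It is a userspace model, not the driver code: struct fake_gpu, fake_init_hw() and fake_init() are hypothetical stand-ins for drm_i915_private, i915_gem_init_hw() and i915_gem_init(), and the `ret == -EIO` test in fake_init() is an assumption based on the comment visible in the second hunk. The counters exist only so the behaviour can be observed.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for struct drm_i915_private; not the real layout. */
struct fake_gpu {
	bool has_kernel_context; /* models dev_priv->kernel_context != NULL */
	bool wedged;             /* models i915_terminally_wedged() */
	bool hw_init_fails;      /* forces the modelled HW restart to fail */
	int hw_restarts;         /* how many times the HW restart was attempted */
	int errors_logged;       /* how many DRM_ERROR-style messages were emitted */
};

/* Models the first hunk: bail out with -EIO before restarting wedged HW. */
static int fake_init_hw(struct fake_gpu *gpu)
{
	assert(gpu->has_kernel_context); /* stands in for the BUG_ON() */

	if (gpu->wedged)
		return -EIO; /* still wedged: skip the HW restart entirely */

	gpu->hw_restarts++;
	return gpu->hw_init_fails ? -EIO : 0;
}

/* Models the second hunk: only complain and wedge if not already wedged. */
static int fake_init(struct fake_gpu *gpu)
{
	int ret = fake_init_hw(gpu);

	if (ret == -EIO) {
		if (!gpu->wedged) {
			fprintf(stderr,
				"Failed to initialize GPU, declaring it wedged\n");
			gpu->errors_logged++;
			gpu->wedged = true;
		}
		ret = 0; /* a wedged GPU is tolerated; other errors propagate */
	}

	return ret;
}
```

With a device that is already wedged, fake_init() returns 0 without attempting a restart or logging anything; with a fresh device whose restart fails, it logs once, marks the device wedged, and still returns 0.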

