[Intel-gfx] [PATCH] drm/i915: Sleep and retry a GPU reset if at first we don't succeed

Chris Wilson chris at chris-wilson.co.uk
Fri Dec 1 12:18:17 UTC 2017


Quoting Chris Wilson (2017-12-01 12:12:40)
> As we declare the GPU wedged if the reset fails, such a failure is quite
> terminal. Before taking that drastic action, let's sleep first and try
> again, in the hope that the hardware has quietened down and is then
> able to reset. After a few such attempts, it is fair to say that the HW
> is truly wedged.
> 
> References: https://bugs.freedesktop.org/show_bug.cgi?id=104007
> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> Cc: Mika Kuoppala <mika.kuoppala at linux.intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen at linux.intel.com>
> ---
>  drivers/gpu/drm/i915/i915_drv.c | 22 ++++++++++++++++------
>  1 file changed, 16 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> index e0f053f9c186..924ebe24b282 100644
> --- a/drivers/gpu/drm/i915/i915_drv.c
> +++ b/drivers/gpu/drm/i915/i915_drv.c
> @@ -1877,7 +1877,9 @@ void i915_reset(struct drm_i915_private *i915, unsigned int flags)
>  {
>         struct i915_gpu_error *error = &i915->gpu_error;
>         int ret;
> +       int i;
>  
> +       might_sleep();
>         lockdep_assert_held(&i915->drm.struct_mutex);
>         GEM_BUG_ON(!test_bit(I915_RESET_BACKOFF, &error->flags));
>  
> @@ -1900,12 +1902,20 @@ void i915_reset(struct drm_i915_private *i915, unsigned int flags)
>                 goto error;
>         }
>  
> -       ret = intel_gpu_reset(i915, ALL_ENGINES);
> -       if (ret) {
> -               if (ret != -ENODEV)
> -                       DRM_ERROR("Failed to reset chip: %i\n", ret);
> -               else
> -                       DRM_DEBUG_DRIVER("GPU reset disabled\n");
> +       if (!intel_has_gpu_reset(i915)) {
> +               DRM_DEBUG_DRIVER("GPU reset disabled\n");
> +               goto error;
> +       }
> +
> +       for (i = 0; i < 3; i++) {
> +               ret = intel_gpu_reset(i915, ALL_ENGINES);
> +               if (ret == 0)
> +                       break;
> +
> +               msleep(100);
> +       }
> +       if (ret != -ENODEV) {

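Bah, this was meant to be if (ret) {, since ret != -ENODEV is also true
when the reset succeeds (ret == 0) and would send a good reset down the
error path.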

> +               DRM_ERROR("Failed to reset chip: %i\n", ret);
>                 goto error;
>         }
>  
> -- 
> 2.15.1
> 
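For reference, the retry loop with the corrected check can be sketched outside the kernel as below. This is only an illustrative userspace sketch, not the driver code: fake_gpu_reset(), sleep_ms() and reset_with_retry() are hypothetical stand-ins (for intel_gpu_reset(), msleep() and the loop in i915_reset() respectively), and the attempt count and backoff are passed as parameters rather than hard-coded to 3 and 100ms.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical stand-in for intel_gpu_reset(): fails while
 * fake_failures_left > 0, then succeeds. Not a real i915 function. */
static int fake_failures_left;
static int fake_reset_calls;

static int fake_gpu_reset(void)
{
	fake_reset_calls++;
	if (fake_failures_left > 0) {
		fake_failures_left--;
		return -EIO;
	}
	return 0;
}

/* Userspace substitute for the kernel's msleep(). */
static void sleep_ms(long ms)
{
	struct timespec ts = {
		.tv_sec = ms / 1000,
		.tv_nsec = (ms % 1000) * 1000000L,
	};

	nanosleep(&ts, NULL);
}

/* Try the reset up to 'attempts' times, sleeping after each failure.
 * Returns 0 on success or the last error code. */
static int reset_with_retry(int attempts, long backoff_ms)
{
	int ret = -EIO;
	int i;

	assert(attempts > 0);

	for (i = 0; i < attempts; i++) {
		ret = fake_gpu_reset();
		if (ret == 0)
			break;

		sleep_ms(backoff_ms);
	}

	/* Corrected check: any non-zero ret is a failure. Testing
	 * ret != -ENODEV here would also fire when ret == 0. */
	if (ret)
		fprintf(stderr, "Failed to reset chip: %i\n", ret);

	return ret;
}
```

Like the patch, the sketch also sleeps after the final failed attempt; that costs one extra backoff period before giving up but keeps the loop body simple.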

