[Intel-gfx] [PATCH v2 12/22] drm/i915: Replace wait-on-mutex with wait-on-bit in reset worker

Thu Sep 8 10:52:21 UTC 2016

Chris Wilson <chris at chris-wilson.co.uk> writes:

> Since we have a cooperative mode now with a direct reset, we can avoid
> the contention on struct_mutex and instead try then sleep on the
> I915_RESET_IN_PROGRESS bit. If the mutex is held and that bit is
> cleared, all is fine. Otherwise, we sleep for a bit and try again. In
> the worst case we sleep for an extra second waiting for the mutex to be
> released (no one touching the GPU is allowed the struct_mutex whilst the
> I915_RESET_IN_PROGRESS bit is set). But when we have a direct reset,
> this allows us to clean up the reset worker faster.
>
> v2: Remember to call wake_up_bit() after changing (for the faster wakeup
> as promised)
>
> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> ---
>  drivers/gpu/drm/i915/i915_drv.c |  6 ++++--
>  drivers/gpu/drm/i915/i915_irq.c | 30 ++++++++++++++++++------------
>  2 files changed, 22 insertions(+), 14 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> index ff4173e6e298..c1b890dbd6cc 100644
> --- a/drivers/gpu/drm/i915/i915_drv.c
> +++ b/drivers/gpu/drm/i915/i915_drv.c
> @@ -1798,11 +1798,13 @@ int i915_reset(struct drm_i915_private *dev_priv)
>  	intel_sanitize_gt_powersave(dev_priv);
>  	intel_autoenable_gt_powersave(dev_priv);
>  
> -	return 0;
> +out:
> +	wake_up_bit(&error->flags, I915_RESET_IN_PROGRESS);
> +	return ret;
>  
>  error:
>  	set_bit(I915_WEDGED, &error->flags);
> -	return ret;
> +	goto out;

After initial surprice, and surprices are negative,
this is short, keeps the error stuff out from bulkcode
and keeps wakeup neatly in both paths.

To reduce the surprice effect, label could be wakeup_out.

Reviewed-by: Mika Kuoppala <mika.kuoppala at intel.com>

>  }
>  
>  static int i915_pm_suspend(struct device *kdev)
> diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> index 2c7cb5041511..699ee2c7a3e4 100644
> --- a/drivers/gpu/drm/i915/i915_irq.c
> +++ b/drivers/gpu/drm/i915/i915_irq.c
> @@ -2497,7 +2497,7 @@ static void i915_reset_and_wakeup(struct drm_i915_private *dev_priv)
>  	char *error_event[] = { I915_ERROR_UEVENT "=1", NULL };
>  	char *reset_event[] = { I915_RESET_UEVENT "=1", NULL };
>  	char *reset_done_event[] = { I915_ERROR_UEVENT "=0", NULL };
> -	int ret;
> +	int ret = -EAGAIN;
>  
>  	kobject_uevent_env(kobj, KOBJ_CHANGE, error_event);
>  
> @@ -2512,21 +2512,27 @@ static void i915_reset_and_wakeup(struct drm_i915_private *dev_priv)
>  	 * simulated reset via debugs, so get an RPM reference.
>  	 */
>  	intel_runtime_pm_get(dev_priv);
> -
>  	intel_prepare_reset(dev_priv);
>  
> -	/*
> -	 * All state reset _must_ be completed before we update the
> -	 * reset counter, for otherwise waiters might miss the reset
> -	 * pending state and not properly drop locks, resulting in
> -	 * deadlocks with the reset work.
> -	 */
> -	mutex_lock(&dev_priv->drm.struct_mutex);
> -	ret = i915_reset(dev_priv);
> -	mutex_unlock(&dev_priv->drm.struct_mutex);
> +	do {
> +		/*
> +		 * All state reset _must_ be completed before we update the
> +		 * reset counter, for otherwise waiters might miss the reset
> +		 * pending state and not properly drop locks, resulting in
> +		 * deadlocks with the reset work.
> +		 */
> +		if (mutex_trylock(&dev_priv->drm.struct_mutex)) {
> +			ret = i915_reset(dev_priv);
> +			mutex_unlock(&dev_priv->drm.struct_mutex);
> +		}
>  
> -	intel_finish_reset(dev_priv);
> +		/* We need to wait for anyone holding the lock to wakeup */
> +	} while (wait_on_bit_timeout(&dev_priv->gpu_error.flags,
> +				     I915_RESET_IN_PROGRESS,
> +				     TASK_UNINTERRUPTIBLE,
> +				     HZ));
>  
> +	intel_finish_reset(dev_priv);
>  	intel_runtime_pm_put(dev_priv);
>  
>  	if (ret == 0)
> -- 
> 2.9.3