[Intel-gfx] [PATCH] drm/i915: Avoid the gpu reset vs. modeset deadlock

Tvrtko Ursulin tvrtko.ursulin at linux.intel.com
Thu Jul 27 12:13:48 UTC 2017


On 20/07/2017 21:23, Daniel Vetter wrote:
> ... using the biggest hammer we have. This is essentially a weaponized
> version of the timeout-based wedging Chris added in
> 
> commit 36703e79a982c8ce5a8e43833291f2719e92d0d1
> Author: Chris Wilson <chris at chris-wilson.co.uk>
> Date:   Thu Jun 22 11:56:25 2017 +0100
> 
>      drm/i915: Break modeset deadlocks on reset
> 
> Because defense-in-depth is good it's good to still have both. Also
> note that with the locking change we can now restrict this a lot (old
> gpus and special testing only), so this doesn't kill the TDR benefits
> on at least anything remotely modern.
> 
> And futuremore with a few tricks it should be possible to make a much
> more educated guess about whether an atomic commit is stuck waiting on
> the gpu (atomic_t counting the pending i915_sw_fence used by the
> atomic modeset code should do it), so we can improve this.
> 
> But for now just start with something that is guaranteed to recover
> faster, for much better CI througput.
> 
> This defacto reverts TDR on these platforms, but there's not really a
> single commit to specify as the sole offender.
> 
> v2: Add a debug message to explain what's going on. We can't DRM_ERROR
> because that spams CI. And the timeout based fallback still prints a
> DRM_ERROR, in case something goes wrong.
> 
> Fixes: 4680816be336 ("drm/i915: Wait first for submission, before waiting for request completion")
> Fixes: 221fe7994554 ("drm/i915: Perform a direct reset of the GPU from the waiter")
> Cc: Chris Wilson <chris at chris-wilson.co.uk>
> Cc: Mika Kuoppala <mika.kuoppala at intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen at linux.intel.com>
> Signed-off-by: Daniel Vetter <daniel.vetter at intel.com>
> ---
>   drivers/gpu/drm/i915/intel_display.c | 6 ++++++
>   1 file changed, 6 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
> index 02b1f4966049..995522e40ec1 100644
> --- a/drivers/gpu/drm/i915/intel_display.c
> +++ b/drivers/gpu/drm/i915/intel_display.c
> @@ -3471,6 +3471,12 @@ void intel_prepare_reset(struct drm_i915_private *dev_priv)
>   	    !gpu_reset_clobbers_display(dev_priv))
>   		return;
>   
> +	/* We have a modeset vs reset deadlock, defensively unbreak it.
> +	 *
> +	 * FIXME: We can do a _lot_ better, this is just a first iteration.*/
> +	i915_gem_set_wedged(dev_priv);
> +	DRM_DEBUG_DRIVER("Wedging GPU to avoid deadlocks with pending modeset updates\n");
> +
>   	/*
>   	 * Need mode_config.mutex so that we don't
>   	 * trample ongoing ->detect() and whatnot.
> 

Looks fine to me since it only affects very old platforms, and is a 
temporary measure.

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin at intel.com>

Regards,

Tvrtko


More information about the Intel-gfx mailing list