[PATCH 3/9] drm/i915: Avoid the gpu reset vs. modeset deadlock

Chris Wilson chris at chris-wilson.co.uk
Wed Jul 19 13:32:46 UTC 2017


Quoting Daniel Vetter (2017-07-19 13:54:56)
> ... using the biggest hammer we have. This is essentially a weaponized
> version of the timeout-based wedging Chris added in
> 
> commit 36703e79a982c8ce5a8e43833291f2719e92d0d1
> Author: Chris Wilson <chris at chris-wilson.co.uk>
> Date:   Thu Jun 22 11:56:25 2017 +0100
> 
>     drm/i915: Break modeset deadlocks on reset
> 
> Because defense-in-depth is good it's good to still have both. Also
> note that with the locking change we can now restrict this a lot (old
> gpus and special testing only), so this doesn't kill the TDR benefits
> on at least anything remotely modern.
> 
> And furthermore, with a few tricks it should be possible to make a much
> more educated guess about whether an atomic commit is stuck waiting on
> the gpu (atomic_t counting the pending i915_sw_fence used by the
> atomic modeset code should do it), so we can improve this.
> 
> But for now just start with something that is guaranteed to recover
> faster, for much better CI throughput.
> 
> This de facto reverts TDR on these platforms, but there's not really a
> single commit to specify as the sole offender.
> 
> Fixes: 4680816be336 ("drm/i915: Wait first for submission, before waiting for request completion")
> Fixes: 221fe7994554 ("drm/i915: Perform a direct reset of the GPU from the waiter")
> Cc: Chris Wilson <chris at chris-wilson.co.uk>
> Cc: Mika Kuoppala <mika.kuoppala at intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen at linux.intel.com>
> Signed-off-by: Daniel Vetter <daniel.vetter at intel.com>
> ---
>  drivers/gpu/drm/i915/intel_display.c | 5 +++++
>  1 file changed, 5 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
> index 97777ffa1566..010a1f3e000c 100644
> --- a/drivers/gpu/drm/i915/intel_display.c
> +++ b/drivers/gpu/drm/i915/intel_display.c
> @@ -3471,6 +3471,11 @@ void intel_prepare_reset(struct drm_i915_private *dev_priv)
>             !gpu_reset_clobbers_display(dev_priv))
>                 return;
>  
> +       /* We have a modeset vs reset deadlock, defensively unbreak it.
> +        *
> +        * FIXME: We can do a _lot_ better, this is just a first iteration.*/

You should keep the error message for wedging the device. If all goes
well, it is removed in a few patches, so it shouldn't be an eyesore for
long.
-Chris

