[Intel-gfx] [RFC PATCH] drm/i915/debugfs: Only wedge if we have reset available

Tvrtko Ursulin tvrtko.ursulin at linux.intel.com
Wed Oct 2 15:45:18 UTC 2019


On 02/10/2019 13:48, Janusz Krzysztofik wrote:
> If we process DROP_RESET_ACTIVE and cancel all outstanding requests by
> forcing a GPU reset on hardware with reset capabilities disabled or
> not supported, we certainly end up with a terminally wedged GPU,
> impossible to recover.  That's probably not what we want.

I'm afraid I have forgotten the background story here. Is the concern
the IGT exit handler calling DROP_RESET_ACTIVE? If so, with this patch the
write will fail with -EBUSY, which could be fine, but what happens from the
perspective of the next test that gets to run? It won't find a wedged GPU,
but it will encounter a possibly nondeterministic amount of GPU work
scheduled before it, no?
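
To make the question concrete, here is a minimal userspace sketch of how a
caller could tell the -EBUSY case apart from other failures. It is an
illustration under stated assumptions, not IGT's actual exit handler:
write_drop_mask() and leftover_work_possible() are hypothetical helpers, and
the bit value used for DROP_RESET_ACTIVE is assumed and should be checked
against i915_debugfs.c.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

/* Assumption: mirrors the kernel's DROP_RESET_ACTIVE bit; verify in i915_debugfs.c. */
#define HYPOTHETICAL_DROP_RESET_ACTIVE (UINT64_C(1) << 7)

/*
 * Write a drop-caches mask to the given file (hypothetical helper).
 * Returns 0 on success or a negative errno value, so the caller can
 * distinguish -EBUSY from other failures.
 */
static int write_drop_mask(const char *path, uint64_t mask)
{
	char buf[32];
	ssize_t n;
	int fd, len, ret = 0;

	assert(path != NULL);

	fd = open(path, O_WRONLY);
	if (fd < 0)
		return -errno;

	/* Format the mask as a hex string, e.g. "0x80". */
	len = snprintf(buf, sizeof(buf), "0x%" PRIx64, mask);

	n = write(fd, buf, (size_t)len);
	if (n < 0)
		ret = -errno;	/* e.g. -EBUSY with the patch applied */
	else if (n != len)
		ret = -EIO;	/* short write */

	close(fd);
	return ret;
}

/*
 * With the patch applied, -EBUSY means the engines did not idle and the
 * GPU was not wedged, so earlier work may still be queued (hypothetical
 * helper encoding that interpretation).
 */
static int leftover_work_possible(int ret)
{
	return ret == -EBUSY;
}
```

A caller would pass the debugfs file, assumed here to be
/sys/kernel/debug/dri/0/i915_gem_drop_caches, together with
HYPOTHETICAL_DROP_RESET_ACTIVE. If leftover_work_possible() reports true, the
next test cannot assume an idle GPU and has to wait or skip.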

Regards,

Tvrtko

> Before setting the GPU wedged, verify that GPU reset is available
> and fail with -EBUSY if not.
> 
> Suggested-by: Petri Latvala <petri.latvala at intel.com>
> Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik at linux.intel.com>
> Cc: Michał Wajdeczko <michal.wajdeczko at intel.com>
> Cc: Michał Winiarski <michal.winiarski at intel.com>
> Cc: Piotr Piórkowski <piotr.piorkowski at intel.com>
> Cc: Tomasz Lis <tomasz.lis at intel.com>
> Cc: Petri Latvala <petri.latvala at intel.com>
> Cc: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> Cc: Martin Peres <martin.peres at linux.intel.com>
> ---
>   drivers/gpu/drm/i915/i915_debugfs.c | 11 ++++++++++-
>   1 file changed, 10 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> index fec9fb7cc384..0774ca6e2a05 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -3627,8 +3627,17 @@ i915_drop_caches_set(void *data, u64 val)
>   
>   	if (val & DROP_RESET_ACTIVE &&
>   	    wait_for(intel_engines_are_idle(&i915->gt),
> -		     I915_IDLE_ENGINES_TIMEOUT))
> +		     I915_IDLE_ENGINES_TIMEOUT)) {
> +		/*
> +		 * Only wedge if reset is supported and not disabled, otherwise
> +		 * we certainly end up with the GPU terminally wedged.  Inform
> +		 * userspace about the problem instead.
> +		 */
> +		if (!intel_has_gpu_reset(&i915->gt))
> +			return -EBUSY;
> +
>   		intel_gt_set_wedged(&i915->gt);
> +	}
>   
>   	/* No need to check and wait for gpu resets, only libdrm auto-restarts
>   	 * on ioctls on -EAGAIN. */
> 

