[Intel-gfx] [PATCH v2] drm/i915/selftests: Include the trace as a debug aide

Mika Kuoppala mika.kuoppala at linux.intel.com
Thu Mar 22 14:26:41 UTC 2018


Chris Wilson <chris at chris-wilson.co.uk> writes:

> If we fail to reset the GPU in a timely fashion, dump the GEM trace so
> that we can see what operations were in flight when the GPU got stuck.
>
> v2: There's more than one timeout that deserves tracing!
> v3: Silence checkpatch by not even using a product at all!
>
> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> ---
>  drivers/gpu/drm/i915/selftests/intel_hangcheck.c | 23 ++++++++++++++++++++---
>  1 file changed, 20 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/selftests/intel_hangcheck.c b/drivers/gpu/drm/i915/selftests/intel_hangcheck.c
> index 4372826998aa..9b235dae8dd9 100644
> --- a/drivers/gpu/drm/i915/selftests/intel_hangcheck.c
> +++ b/drivers/gpu/drm/i915/selftests/intel_hangcheck.c
> @@ -260,8 +260,11 @@ static void wedge_me(struct work_struct *work)
>  {
>  	struct wedge_me *w = container_of(work, typeof(*w), work.work);
>  
> -	pr_err("%pS timed out, cancelling all further testing.\n",
> -	       w->symbol);
> +	pr_err("%pS timed out, cancelling all further testing.\n", w->symbol);
> +
> +	GEM_TRACE("%pS timed out.\n", w->symbol);
> +	GEM_TRACE_DUMP();
> +
>  	i915_gem_set_wedged(w->i915);
>  }
>  
> @@ -621,9 +624,19 @@ static int active_engine(void *data)
>  		mutex_unlock(&engine->i915->drm.struct_mutex);
>  
>  		if (old) {
> -			i915_request_wait(old, 0, MAX_SCHEDULE_TIMEOUT);
> +			if (i915_request_wait(old, 0, HZ) < 0) {
> +				GEM_TRACE("%s timed out.\n", engine->name);
> +				GEM_TRACE_DUMP();
> +
> +				i915_gem_set_wedged(engine->i915);
> +				i915_request_put(old);
> +				err = -EIO;
> +				break;
> +			}

Using err = i915_request_wait() could have saved one extra request_put
but I dunno if it would be any cleaner.

>  			i915_request_put(old);
>  		}
> +
> +		cond_resched();

To give more slack for other engines and main thread to proceed?

>  	}
>  
>  	for (count = 0; count < ARRAY_SIZE(rq); count++)
> @@ -1126,6 +1139,10 @@ int intel_hangcheck_live_selftests(struct drm_i915_private *i915)
>  
>  	err = i915_subtests(tests, i915);
>  
> +	mutex_lock(&i915->drm.struct_mutex);
> +	flush_test(i915, I915_WAIT_LOCKED);
> +	mutex_unlock(&i915->drm.struct_mutex);
> +

To wash out leftovers.

Reviewed-by: Mika Kuoppala <mika.kuoppala at linux.intel.com>

>  	i915_modparams.enable_hangcheck = saved_hangcheck;
>  	intel_runtime_pm_put(i915);
>  
> -- 
> 2.16.2
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx


More information about the Intel-gfx mailing list