[Intel-gfx] [PATCH 2/6] drm/i915: Decouple hang detection from hangcheck period

Chris Wilson chris at chris-wilson.co.uk
Wed Nov 16 17:05:44 UTC 2016


On Wed, Nov 16, 2016 at 05:20:30PM +0200, Mika Kuoppala wrote:
> -	ring_hung = engine->hangcheck.score >= HANGCHECK_SCORE_RING_HUNG;
> -	if (engine->hangcheck.seqno != intel_engine_get_seqno(engine))
> +	ring_hung = engine->hangcheck.stall;
> +	if (engine->hangcheck.seqno != intel_engine_get_seqno(engine)) {
> +		if (ring_hung)
> +			DRM_ERROR("%s pardoned due to progress after hangcheck %x vs %x\n",
> +				  engine->name,
> +				  engine->hangcheck.seqno,
> +				  intel_engine_get_seqno(engine));
> +

Is this worth alarming the user over? We recover gracefully either way.

DRM_DEBUG_DRIVER("pardoned, was guilty? %s\n", yesno(ring_hung));
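To illustrate the point, here is a minimal userspace sketch of the pardon decision with the message demoted to a debug-style note. The mock_* structures and mock_engine_is_guilty() are hypothetical stand-ins, not the driver's code; current_seqno stands in for intel_engine_get_seqno() and fprintf() for DRM_DEBUG_DRIVER().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-ins for the driver structures; only the fields
 * discussed in this review are modelled. */
struct mock_hangcheck {
	uint32_t seqno;	/* seqno sampled by the last hangcheck */
	bool stall;	/* hangcheck declared the engine stalled */
};

struct mock_engine {
	const char *name;
	struct mock_hangcheck hangcheck;
	uint32_t current_seqno;	/* stands in for intel_engine_get_seqno() */
};

static const char *yesno(bool v)
{
	return v ? "yes" : "no";
}

/* Returns true if the engine should still be treated as hung. */
static bool mock_engine_is_guilty(const struct mock_engine *engine)
{
	bool ring_hung;

	assert(engine != NULL);

	ring_hung = engine->hangcheck.stall;
	if (engine->hangcheck.seqno != engine->current_seqno) {
		/* The seqno advanced since the hangcheck sample, so the
		 * engine made progress: note it quietly and pardon. */
		fprintf(stderr, "%s pardoned, was guilty? %s\n",
			engine->name, yesno(ring_hung));
		ring_hung = false;
	}

	return ring_hung;
}
```

A stalled engine whose seqno has not moved stays guilty; any seqno progress clears the verdict, whether or not the engine had been marked as stalled.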

>  		ring_hung = false;
> +	}
>  
>  	i915_set_reset_status(request->ctx, ring_hung);
>  	if (!ring_hung)
> diff --git a/drivers/gpu/drm/i915/i915_gem_timeline.c b/drivers/gpu/drm/i915/i915_gem_timeline.c
> index bf8a471..348b0f2 100644
> --- a/drivers/gpu/drm/i915/i915_gem_timeline.c
> +++ b/drivers/gpu/drm/i915/i915_gem_timeline.c
> @@ -24,6 +24,15 @@
>  
>  #include "i915_drv.h"
>  
> +static void i915_gem_timeline_retire(struct i915_gem_active *active,
> +				     struct drm_i915_gem_request *request)
> +{
> +	struct intel_timeline *tl =
> +		container_of(active, typeof(*tl), last_request);
> +
> +	tl->last_retire_timestamp = jiffies;

Make this a separate patch, and name the field last_retired_timestamp to
match last_submitted_seqno. It may be worth adding
last_submitted_timestamp as well for debug purposes.
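Roughly what I have in mind, as a userspace sketch only: the mock_* types, mock_container_of() and both helpers are hypothetical, and "now" is passed in explicitly where the driver would read jiffies.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Portable stand-in for the kernel's container_of(). */
#define mock_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical stand-in for struct i915_gem_active. */
struct mock_active {
	int unused;
};

/* Hypothetical stand-in for struct intel_timeline. */
struct mock_timeline {
	uint32_t last_submitted_seqno;
	unsigned long last_submitted_timestamp;	/* suggested, for debug */
	unsigned long last_retired_timestamp;	/* renamed to match */
	struct mock_active last_request;
};

/* Record the seqno and the time at which it was submitted. */
static void mock_timeline_submit(struct mock_timeline *tl,
				 uint32_t seqno, unsigned long now)
{
	assert(tl != NULL);

	tl->last_submitted_seqno = seqno;
	tl->last_submitted_timestamp = now;
}

/* Retire callback: recover the timeline from its embedded tracker
 * and stamp the retirement time. */
static void mock_timeline_retire(struct mock_active *active,
				 unsigned long now)
{
	struct mock_timeline *tl;

	assert(active != NULL);

	tl = mock_container_of(active, struct mock_timeline, last_request);
	tl->last_retired_timestamp = now;
}
```

With both timestamps kept next to last_submitted_seqno, the gap between submit and retire can be read straight off the timeline when debugging.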

I guess we are missing a debugfs/i915_gem_timelines.

Some grumbling, but nothing stands out in this patch.

So with the couple of points above,
Reviewed-by: Chris Wilson <chris at chris-wilson.co.uk>
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

