[Intel-gfx] [PATCH 2/6] drm/i915: Decouple hang detection from hangcheck period

Chris Wilson chris at chris-wilson.co.uk
Tue Nov 15 16:38:56 UTC 2016


On Tue, Nov 15, 2016 at 04:36:32PM +0200, Mika Kuoppala wrote:
> +static bool
> +hangcheck_engine_stall(struct intel_engine_cs *engine,
> +		       struct intel_engine_hangcheck *hc)
> +{
> +	const unsigned long last_action = engine->hangcheck.action_timestamp;
>  
> -		memset(&engine->hangcheck.instdone, 0,
> -		       sizeof(engine->hangcheck.instdone));
> -		break;
> +	if (hc->action == HANGCHECK_ACTIVE_SEQNO ||
> +	    hc->action == HANGCHECK_IDLE)
> +		return false;
> +
> +	if (time_before(jiffies, last_action + HANGCHECK_HUNG_JIFFIES))
> +		return false;
> +
> +	if (time_before(jiffies, last_action + HANGCHECK_STUCK_JIFFIES))
> +		if (hc->action != HANGCHECK_HUNG)
> +			return false;

This deserves a lot more explanation. Why two values? What are their
significance?

Do we want to be using jiffies? (Values are in seconds, so jiffie
resoluation makes sense, but add that as a comment somewhere.)

> +	return true;
> +}
> +
> +static struct intel_engine_cs *find_lra_engine(struct drm_i915_private *i915,
> +					       const unsigned int mask)

What is lra?

> +{
> +	struct intel_engine_cs *engine = NULL, *c;
> +	enum intel_engine_id id;
> +
> +	for_each_engine_masked(c, i915, mask, id) {
> +		if (engine == NULL) {
> +			engine = c;
> +			continue;
> +		}
> +
> +		if (time_before(c->hangcheck.action_timestamp,
> +				engine->hangcheck.action_timestamp))
> +			engine = c;
> +		else if (c->hangcheck.action_timestamp ==
> +			 engine->hangcheck.action_timestamp &&
> +			 c->hangcheck.seqno < engine->hangcheck.seqno)
> +			engine = c;

Why is the last engine significant?
Why is just engine guilty?


Too many whys in this one.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


More information about the Intel-gfx mailing list