[Intel-gfx] [PATCH] drm/i915: Inspect subunit states on hangcheck

Mika Kuoppala mika.kuoppala at linux.intel.com
Fri Jan 8 06:54:19 PST 2016


Chris Wilson <chris at chris-wilson.co.uk> writes:

> On Tue, Dec 01, 2015 at 05:56:12PM +0200, Mika Kuoppala wrote:
>> If the head seems stuck and the engine in question is rcs,
>> inspect subunit state transitions from undone to done
>> before deciding that this really is a hang instead of limited
>> progress. Only account for each subunit's transition from
>> undone to done once, to prevent unstable subunit states
>> from keeping us falsely active.
>> 
>> As this adds one extra step to the hangcheck heuristics
>> before a hang is declared, it adds 1500ms to the time to detect a hang
>> on the render ring, for a total of 7500ms. We could sample
>> the subunit states on the first head-stuck condition, but
>> decided not to do so purely in order to mimic the old behaviour. This
>> way the promotion order of the checks, seqno > acthd > instdone,
>> stays consistent.
>> 
>> v2: Deal with unstable done states (Arun)
>>     Clear instdone progress on head and seqno movement (Chris)
>>     Report raw and accumulated instdone's in debugfs (Chris)
>>     Return HANGCHECK_ACTIVE on undone->done
>> 
>> References: https://bugs.freedesktop.org/show_bug.cgi?id=93029
>> Cc: Chris Wilson <chris at chris-wilson.co.uk>
>> Cc: Dave Gordon <david.s.gordon at intel.com>
>> Cc: Daniel Vetter <daniel at ffwll.ch>
>> Cc: Arun Siluvery <arun.siluvery at linux.intel.com>
>> Signed-off-by: Mika Kuoppala <mika.kuoppala at intel.com>
>
> I feel slightly dubious in discarding the 1->0 transitions (as it just
> means that a shared function that was previously idle is now in use
> again), but if they truly do fluctuate randomly? then accumulating
> should mean we eventually escape.
>
> Reviewed-by: Chris Wilson <chris at chris-wilson.co.uk>

Queued for -next, thanks for the review. 

-Mika
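
For readers following the thread, the accumulation described in the quoted
commit message can be sketched roughly as below. This is a minimal,
self-contained illustration and not the code from the patch:
NUM_INSTDONE_REGS, struct hangcheck_state, subunits_stuck() and
hangcheck_reset_progress() are hypothetical names, and reading the instdone
registers is replaced by a plain array supplied by the caller.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical number of instdone registers sampled per check. */
#define NUM_INSTDONE_REGS 4

/* Hypothetical per-engine state: every "done" bit seen since the last reset. */
struct hangcheck_state {
	uint32_t accumulated[NUM_INSTDONE_REGS];
};

/*
 * Return true if the sample shows no new undone -> done transition.
 * Each bit is credited as progress only the first time it is seen set,
 * so a subunit flipping between done and undone cannot keep the
 * engine looking active forever.
 */
static bool subunits_stuck(struct hangcheck_state *hc,
			   const uint32_t sample[NUM_INSTDONE_REGS])
{
	bool stuck = true;
	int i;

	for (i = 0; i < NUM_INSTDONE_REGS; i++) {
		const uint32_t merged = hc->accumulated[i] | sample[i];

		/* A bit not accumulated before counts as progress. */
		if (merged != hc->accumulated[i])
			stuck = false;

		hc->accumulated[i] = merged;
	}

	return stuck;
}

/* Forget accumulated progress, e.g. when head or seqno has moved. */
static void hangcheck_reset_progress(struct hangcheck_state *hc)
{
	memset(hc->accumulated, 0, sizeof(hc->accumulated));
}
```

In this sketch a 1->0 transition never counts as progress, which is the
behaviour questioned in the review above. Because the accumulated mask can
only gain bits until it is reset, the check eventually reports stuck even
if the subunit states keep fluctuating.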

