[Intel-gfx] [PATCH] drm/i915: Show if we consider the engine is idle in the GPU error state

Chris Wilson chris at chris-wilson.co.uk
Tue Dec 19 21:02:15 UTC 2017


Quoting Rodrigo Vivi (2017-12-19 20:49:54)
> On Tue, Dec 19, 2017 at 01:14:19PM +0000, Chris Wilson wrote:
> > When we encounter a GPU hang, it is useful for verifying our bookkeeping
> > to know whether we think the engine was idle at the time of the hang.
> > 
> > References: https://bugs.freedesktop.org/show_bug.cgi?id=104305
> 
> Here you mention the hang as "false positive"...
> If it is a false positive and we have this idle information,
> shouldn't we handle this differently instead of throwing the error
> information and resetting the GPU?

I have contemplated skipping the reset if we think the GPU is idle, but
that does rather assume that we have perfect knowledge and that skipping
the reset is a good thing. (Though we do differentiate between resets to
restore hw state and resets to fix a GPU hang already, so maybe it's not
so bad, the caveat being an explicit request to reset the GPU.) In this
case, a cursory glance said the engine should be idle (RING_MODE has the
idle bit, RING_HEAD == RING_TAIL and the last seqno was completed) and I
wanted to confirm that the driver also thought the engine should have
been idle. That would leave the question as to why hangcheck thought
differently, i.e. I'm trying to narrow the cause to a particular piece of
code.
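
As a rough illustration of the three checks above, and not the driver's
actual code, the "cursory glance" could be sketched as follows. The struct,
the function names and the idle-bit position are all assumptions made up
for this example; head and tail are assumed to be already masked down to
plain ring offsets.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical snapshot of the per-engine values found in an error
 * state. These are NOT the i915 structures, only an illustration.
 */
struct engine_snapshot {
	uint32_t ring_mode;   /* raw value of the ring mode register */
	uint32_t ring_head;   /* ring offset the hardware is reading from */
	uint32_t ring_tail;   /* ring offset the driver last wrote up to */
	uint32_t hw_seqno;    /* last seqno the hardware reported complete */
	uint32_t last_seqno;  /* last seqno the driver submitted */
};

/* Assumed bit position, for illustration only. */
#define SKETCH_MODE_IDLE (1u << 9)

/* Wraparound-safe "a is at or after b" comparison for 32-bit seqnos. */
static bool sketch_seqno_passed(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) >= 0;
}

/*
 * True only if all three observations agree: the idle bit is set,
 * nothing is left to execute between head and tail, and the last
 * submitted seqno has been completed.
 */
static bool sketch_engine_looks_idle(const struct engine_snapshot *s)
{
	if (!(s->ring_mode & SKETCH_MODE_IDLE))
		return false;

	if (s->ring_head != s->ring_tail)
		return false;

	return sketch_seqno_passed(s->hw_seqno, s->last_seqno);
}
```

If a check like this says "idle" and the driver's own bookkeeping agrees,
the remaining suspect is whatever led hangcheck to declare a hang.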
-Chris

