[Intel-gfx] [PATCH] drm/i915: Show if we consider the engine is idle in the GPU error state

Chris Wilson chris at chris-wilson.co.uk
Tue Dec 19 23:08:28 UTC 2017


Quoting Chris Wilson (2017-12-19 21:02:15)
> Quoting Rodrigo Vivi (2017-12-19 20:49:54)
> > On Tue, Dec 19, 2017 at 01:14:19PM +0000, Chris Wilson wrote:
> > > When we encounter a GPU hang, it is useful for verifying our bookkeeping
> > > to know whether we think the engine is idle at the time of the hang.
> > > 
> > > References: https://bugs.freedesktop.org/show_bug.cgi?id=104305
> > 
> > Here you mention the hang as "false positive"...
> > if it is a false positive and we have this idle information
> > shouldn't we handle this differently instead of throwing the error
> > information and resetting the GPU?
> 
> I have contemplated skipping the reset if we think the GPU is idle, but
> that does rather assume that we have perfect knowledge and that skipping
> the reset is a good thing. (Though we do differentiate between resets to
> restore hw state and resets to fix a GPU hang already, so maybe it's not
> so bad, the caveat being an explicit request to reset the GPU.) In this
> case, a cursory glance said the engine should be idle (RING_MODE has the
> idle bit, RING_HEAD == RING_TAIL and the last seqno was completed) and I
> wanted to confirm that the driver also thought the engine should have
> been idle. That would leave the question as to why hangcheck thought
> differently, i.e. I'm trying to narrow the cause to a particular piece of
> code.
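
For illustration only, the cursory check described above (idle bit set in
RING_MODE, RING_HEAD == RING_TAIL, last seqno completed) could be sketched
roughly as follows. This is not the driver's code: the struct, the function
names, and the bit position used for the idle flag are all assumptions made
for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical snapshot of the register values and seqnos captured for one
 * engine in an error state. None of these names come from i915 itself.
 */
struct engine_snapshot {
	uint32_t ring_mode;        /* RING_MODE contents */
	uint32_t ring_head;        /* RING_HEAD offset */
	uint32_t ring_tail;        /* RING_TAIL offset */
	uint32_t seqno_submitted;  /* last seqno handed to the engine */
	uint32_t seqno_completed;  /* last seqno the engine reported done */
};

/* Assumed position of the idle bit in RING_MODE; check the register spec. */
#define SNAPSHOT_MODE_IDLE (1u << 9)

/* Wrap-safe "a is at or after b" comparison for 32-bit seqnos. */
static bool seqno_passed(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) >= 0;
}

/* True only if all three of the cursory idle conditions hold. */
static bool snapshot_looks_idle(const struct engine_snapshot *s)
{
	if (!(s->ring_mode & SNAPSHOT_MODE_IDLE))
		return false;
	if (s->ring_head != s->ring_tail)
		return false;
	return seqno_passed(s->seqno_completed, s->seqno_submitted);
}
```

A snapshot that passes all three tests while hangcheck still declared a hang
would point at the bookkeeping rather than the hardware, which is the
narrowing-down described above.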

Thanks for the review. Pushed; now time to chase up the reporter to see
if he can reproduce on drm-tip.
-Chris

