[Intel-gfx] [PATCH] drm/i915: Reset request handling for gen8+

Fri Jun 19 09:30:45 PDT 2015

On Thu, Jun 18, 2015 at 04:58:06PM +0200, Daniel Vetter wrote:
> On Thu, Jun 18, 2015 at 12:42:55PM +0100, Chris Wilson wrote:
> > I understand the merit in trying the reset a few times before giving up,
> > it would just need a bit of restructuring to try the reset before
> > clearing gem state (trivial) and requeueing the hangcheck. I am just
> > wary of feature creep before we get stuck into TDR, which promises to
> > change how we think about resets entirely.
> 
> My maintainer concern here is always that we should err on the side of not
> killing the machine. If the reset failed, or if the gpu reinit failed then
> marking the gpu as wedged has historically been the safe option. The
> system will still run, display mostly works and there's a reasonable
> chance you can gather debug data.

One thing to bear in mind here is that, with this particular
don't-reset-if-not-ready logic, repeating the reset attempt after another
hangcheck is equivalent to just using a slower hangcheck (more or less;
the difference is a couple of writes to one register). So it is no more
likely to hang the machine than the original GPU hang.

We can differentiate the cases here, between say EBUSY, ENODEV, and EIO,
from the actual reset request to determine which we want to retry
(i.e. EBUSY).
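
As a rough sketch of that differentiation (plain C, not actual i915
code): every name below is hypothetical, and the meanings attached to
ENODEV and EIO in the comments are assumptions for illustration only.
The only point taken from the above is that EBUSY is the retryable
case.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical outcomes of handling one reset request. */
enum reset_action {
	RESET_DONE,	/* reset succeeded, carry on */
	RESET_RETRY,	/* requeue the hangcheck and try again later */
	RESET_WEDGE,	/* give up and mark the GPU as wedged */
};

/* Hypothetical policy: only "not ready yet" is worth another attempt. */
static bool reset_error_is_retryable(int err)
{
	switch (err) {
	case -EBUSY:	/* assumed: engine was not ready for the reset */
		return true;
	case -ENODEV:	/* assumed: no usable reset method */
	case -EIO:	/* assumed: the reset itself failed */
	default:
		return false;
	}
}

/*
 * Decide what to do after a reset request returned err.
 * attempts counts the requests made so far, including this one;
 * max_attempts is an arbitrary cap chosen by the caller.
 */
static enum reset_action next_reset_action(int err, unsigned int attempts,
					   unsigned int max_attempts)
{
	if (err == 0)
		return RESET_DONE;

	if (reset_error_is_retryable(err) && attempts < max_attempts)
		return RESET_RETRY;

	return RESET_WEDGE;
}
```

With a cap of 3, -EBUSY on the first attempt gives RESET_RETRY, -EBUSY
on the third gives RESET_WEDGE, and -EIO or -ENODEV wedge straight
away, which keeps the "err on the side of not killing the machine"
fallback intact.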
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre