[Intel-gfx] [PATCH v4] drm/i915/lrc: Scrub the GPU state of the guilty hanging request

Chris Wilson chris at chris-wilson.co.uk
Mon Apr 30 15:53:04 UTC 2018


Quoting Michel Thierry (2018-04-30 16:49:53)
> On 04/28/2018 04:15 AM, Chris Wilson wrote:
> > Previously, we just reset the ring register in the context image such
> > that we could skip over the broken batch and emit the closing
> > breadcrumb. However, on resume the context image and GPU state would be
> > reloaded, which may have been left in an inconsistent state by the
> > reset. The presumption was that at worst it would just cause another
> > reset and skip again until it recovered; however, it seems just as likely
> > to cause an unrecoverable hang. Instead of risking loading an incomplete
> > context image, restore it back to the default state.
> > 
> > v2: Fix up off-by-one from including the ppHWSP in with the register
> > state.
> > v3: Use a ring local to compact a few lines.
> > v4: Beware setting the ring local before checking for a NULL request.
> 
> Didn't you want to set the ring local after this check?
>         if (!request || request->fence.error != -EIO)

I just removed adding the ring local. Fewer changes...
-Chris
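
For readers following the v4 note, the ordering hazard can be sketched in isolation. This is only an illustration under stated assumptions, not the code from the patch: the sketch_* types and sketch_reset_request() are hypothetical stand-ins for the driver structures, and the assumption that the request pointer can be NULL is taken from the check quoted above. The one point it shows is that a local derived from the request must be assigned after the NULL check, not in its declaration.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver structures (not the i915 definitions). */
struct sketch_ring {
	unsigned int head;
	unsigned int tail;
};

struct sketch_fence {
	int error;
};

struct sketch_request {
	struct sketch_fence fence;
	struct sketch_ring *ring;
	unsigned int postfix;	/* assumed: ring offset of the closing breadcrumb */
};

/* Returns true if the ring head was moved past the guilty batch. */
bool sketch_reset_request(struct sketch_request *request)
{
	/*
	 * Writing "struct sketch_ring *ring = request->ring;" here would
	 * dereference request before the check below, which is the hazard
	 * the v4 changelog entry warns about.
	 */
	struct sketch_ring *ring;

	/* Same condition as quoted in the review: bail out on NULL or innocent. */
	if (!request || request->fence.error != -EIO)
		return false;

	/* Only now is it safe to read through the request pointer. */
	ring = request->ring;
	ring->head = request->postfix;
	return true;
}
```

Calling sketch_reset_request(NULL) simply returns false, and a request whose fence error is not -EIO leaves the ring untouched. Dropping the local altogether, as the reply above says was done, avoids the question entirely.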

