[Intel-gfx] [PATCH] drm/i915: Upgrade execbuffer fail after resume failure to EIO

Tue Mar 25 22:33:27 CET 2014

On Tue, Mar 25, 2014 at 05:52:00PM +0100, Daniel Vetter wrote:
> On Tue, Mar 25, 2014 at 5:29 PM, Chris Wilson <chris at chris-wilson.co.uk> wrote:
> >> Yeah I've seen the other patches. I think we should try to keep all the
> >> ring structures around even when the hw init failed. I've made some feeble
> >> attempts a while ago to split the structure init from the hw init stuff,
> >> but kinda never fully materialized ...
> >>
> >> Imo if our set of valid rings semi-randomly changes at runtime even,
> >> that's not good.
> >
> > Agreed, but sadly we can't trust hardware to always work, and we need
> > something to prevent explosions. I quite like the idea of marking the
> > GPU wedged if hw init fails so that we lose acceleration but keep
> > modesetting around.
> 
> Yeah, I agree that the other two patches are neat indeed, it's this
> one here where the shiny starts to come off a bit ;-) tbh I'd prefer a
> simple if (terminally_wedged) return -EIO; here before the ring
> checks, maybe with a comment stating why we need to have this order.

It's ok, it is only to prevent UXA from going off the rails after the
odd resume hang on g45...
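
For reference, the ordering suggested above (wedged check first, ring
checks second) could be sketched roughly as below. This is only an
illustration, not the actual patch: the sketch_* names and structures
are hypothetical stand-ins rather than the real i915 ones, and the
-EINVAL returned by the ring checks is an assumption about what
userspace would otherwise see.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_NUM_RINGS 4

/* Hypothetical stand-ins for illustration, not the real i915 structures. */
struct sketch_ring {
	bool initialized;	/* false if hw init failed for this ring */
};

struct sketch_device {
	bool terminally_wedged;	/* set when hw init fails, e.g. on resume */
	struct sketch_ring rings[SKETCH_NUM_RINGS];
};

/* Returns 0 if the request may proceed, or a negative errno. */
int sketch_execbuffer_check(const struct sketch_device *dev,
			    unsigned int ring_id)
{
	assert(dev != NULL);

	/*
	 * Order matters: a wedged GPU may have lost its rings, so test
	 * for it before the ring checks. Otherwise userspace would get
	 * the ring error (assumed -EINVAL here) instead of -EIO.
	 */
	if (dev->terminally_wedged)
		return -EIO;

	if (ring_id >= SKETCH_NUM_RINGS)
		return -EINVAL;

	if (!dev->rings[ring_id].initialized)
		return -EINVAL;

	return 0;
}
```

With the checks in that order, a device that failed hw init on resume
reports -EIO for every submission, whether or not the ring it asked for
still exists.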

> That, or fix the mess called ring init code ...

So if we fixed resume to avoid reallocating the ringbuffers across
resume, g45 would still fail to restart, but now we still have valid
objects (or would we tear them down because of the failure?) and so this
check passes and we later hit the EIO checks?
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre