[Intel-gfx] [PATCH] drm/i915: Only try to stop engines after a failed reset

Chris Wilson chris at chris-wilson.co.uk
Thu Feb 14 11:39:42 UTC 2019


Quoting Mika Kuoppala (2019-02-14 11:29:13)
> Chris Wilson <chris at chris-wilson.co.uk> writes:
> 
> > Currently we try to stop the engine by programming the ring registers to
> > be disabled before we perform the reset. Sometimes, we see the context
> > image also have invalid ring registers, which one presumes may be
> > actually caused by us doing so. Let's risk not programming the
> > ring to zero on the first attempt to avoid preserving that corruption
> > into the context image, leaving the w/a in place for subsequent
> > reset attempts.
> 
> Might lead to some failed attempts but the
> sentiment of using finesse instead of the biggest hammer
> in the arsenal is sane.

Yeah, the warning we have there is nasty, but fingers crossed this
balance works. So far gen9 seems happy, haven't dug out ctg yet.
Hopefully survives across the farm. I'm left again pondering if we can
increase the variety of hangs we induce.
 
> It makes me also ponder if we can fight against this
> on the other side of the fence. Doing a precursory check,
> for debug builds, on (first) submit that the lrc is sane?
> Using the lrc_regs_ok()?

But we write those during first submit and before that the entire page
is invalid :)
-Chris
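
For readers following the thread, the ordering described in the commit message (no engine stop on the first reset attempt, the stop w/a only on retries) might be sketched roughly as below. This is a self-contained toy model, not the i915 code: struct fake_gpu, stop_engines(), do_reset(), reset_with_retry() and MAX_RESET_ATTEMPTS are all hypothetical names made up for illustration, and the retry count of 3 is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_RESET_ATTEMPTS 3 /* assumed retry budget, not taken from the patch */

/* Hypothetical stand-in for device state; not an i915 structure. */
struct fake_gpu {
	int failures_left;   /* how many reset attempts fail before one succeeds */
	int engines_stopped; /* how many times the stop w/a was applied */
};

/* Models "programming the ring registers to be disabled". */
static void stop_engines(struct fake_gpu *gpu)
{
	gpu->engines_stopped++;
}

/* Models one reset attempt; fails while failures_left is positive. */
static bool do_reset(struct fake_gpu *gpu)
{
	if (gpu->failures_left > 0) {
		gpu->failures_left--;
		return false;
	}
	return true;
}

/*
 * Try the reset without touching the rings first, so nothing we wrote
 * can be saved into the context image. Only after a failed attempt do
 * we fall back to stopping the engines before retrying.
 *
 * Returns the index of the successful attempt, or -1 if all failed.
 */
static int reset_with_retry(struct fake_gpu *gpu)
{
	assert(gpu != NULL);

	for (int attempt = 0; attempt < MAX_RESET_ATTEMPTS; attempt++) {
		if (attempt > 0)
			stop_engines(gpu); /* w/a only on retries */

		if (do_reset(gpu))
			return attempt;
	}

	return -1;
}
```

In this model a reset that works first time never applies the engine stop at all, which is the point of the change; the cost, as Mika notes, is that some first attempts may fail and need the retry.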