[PATCH 1/2] drm/i915: Restore GT coarse power gating workaround

Imre Deak imre.deak at intel.com
Fri Dec 20 17:05:21 UTC 2019


On Fri, Dec 20, 2019 at 12:38:53PM +0000, Chris Wilson wrote:
> Quoting Imre Deak (2019-12-20 12:29:13)
> > On Thu, Dec 19, 2019 at 04:42:44PM +0000, Chris Wilson wrote:
> > > Quoting Imre Deak (2019-11-14 16:42:24)
> > > > The workaround to disable coarse power gating is still needed on SKL
> > > > GT3/GT4 machines and since the RC6 context corruption was discovered by
> > > > the hardware team also on all GEN9 machines. Restore applying the
> > > > workaround.
> > > 
> > > What exactly is the link between powergating and the rc6 power context
> > > corruption? Disabling powergating entirely is quite a significant
> > > regression -- and we can't partially enable powergating for idle engines
> > > as the HW refuses to cooperate.
> > > 
> > > So is it safe to enable powergating if only vcs is busy [rcs is idle]?
> > 
> > The problem with CPG wrt. possible RC6 corruption is that if a
> > corruption happens on one engine while another is already idle (having
> > saved its context) then this idle engine won't be able to resume any
> > more (since it cannot restore its context from the corrupted RC6 state).
> 
> Ah, but we always switch to a scratch kernel context on idle, and
> that context has CTX_CTRL_ENGINE_CTX_RESTORE_INHIBIT set. Or it refers
> to the power context instead, which is beyond our control?

I admit I haven't considered restore inhibit at all, and I'm also not
sure whether it alone prevents accessing the power context. It would be
great to have a testcase for this.

IIUC you want to restore power saving for all idle engines when one (or
more) is still active. If that is what you meant:

When you switch to the scratch kernel context, does that also prevent
writing to the HW context? There is also a save inhibit flag, but we
don't seem to use it, and again I'm not sure whether it prevents
accessing the power context.

If we can't prevent accessing the power context during engine idling
then I see a further problem with CPG: one engine is active and another
one is just about to go idle. Since the first engine's command stream is
completely asynchronous with the second's idling sequence, I can't see
any robust way to check in the idling sequence whether the first engine
has corrupted the RC6 context before the context image is written for
the idling engine.
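
To make the window I mean concrete, here is a toy model in plain C. It
is not driver code: every name in it (toy_gt, active_engine_step,
idle_engine_check_then_save, race_loses) is made up for illustration,
and whether the hardware really behaves like this is exactly the open
question. It only shows why a check-then-save idling sequence cannot be
robust if the active engine may run between the check and the save.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: none of this maps to real i915 or hardware state. */
struct toy_gt {
	bool power_ctx_corrupt;  /* assumed shared RC6/power context state */
	bool saved_image_valid;  /* context image saved by the idling engine */
};

/* Hypothetical: the active engine's command stream corrupts the state. */
void active_engine_step(struct toy_gt *gt)
{
	gt->power_ctx_corrupt = true;
}

/*
 * Hypothetical idling sequence: check for corruption, then save.
 * active_runs_between models the asynchronous engine getting to run
 * in the window between the check and the save.
 * Returns true if the sequence decided it was safe to go idle.
 */
bool idle_engine_check_then_save(struct toy_gt *gt, bool active_runs_between)
{
	if (gt->power_ctx_corrupt)
		return false;  /* the check caught it, do not idle */

	if (active_runs_between)
		active_engine_step(gt);

	/* The image is only usable if nothing was corrupted by now. */
	gt->saved_image_valid = !gt->power_ctx_corrupt;
	return true;
}

/* True if the engine went idle holding an image it cannot restore. */
bool race_loses(bool active_runs_between)
{
	struct toy_gt gt = { .power_ctx_corrupt = false,
			     .saved_image_valid = false };
	bool idled;

	assert(!gt.power_ctx_corrupt);  /* start from a clean state */
	idled = idle_engine_check_then_save(&gt, active_runs_between);
	return idled && !gt.saved_image_valid;
}
```

In this model race_loses(false) is false and race_loses(true) is true:
the check passes in both cases, so the idling sequence cannot tell the
two apart.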

--Imre

