[Intel-gfx] [PATCH v2 2/3] drm/i915: Setup MCR steering for RCS engine workarounds

Tue May 12 23:28:45 UTC 2020

On Tue, May 12, 2020 at 10:30:31AM -0700, Daniele Ceraolo Spurio wrote:
> 
> 
> On 5/1/20 9:57 PM, Matt Roper wrote:
> > Reads of multicast registers give the value associated with
> > slice/subslice 0 by default unless we manually steer the reads to a
> > different slice/subslice.  If slice/subslice 0 are fused off in hardware,
> > performing unsteered reads of multicast registers will return a value of
> > 0 rather than the value we wrote into the multicast register.
> > 
> > To ensure we can properly readback and verify workarounds that touch
> > registers in a multicast range, we currently setup MCR steering to a
> > known-valid slice/subslice as the very first item in the GT workaround
> > list for gen10+.  That steering will then be in place as we verify the
> > rest of the registers that show up in the GT workaround list, and at
> > initialization the steering will also still be in effect when we move on
> > to applying and verifying the workarounds in the RCS engine's workaround
> > list (which is where most of the multicast registers actually show up).
> > 
> > However we seem run into problems during resets where RCS engine
> > workarounds are applied without being preceded by application of the GT
> > workaround list and the steering isn't in place.  Let's add the same MCR
> > steering to the beginning of the RCS engine's workaround list to ensure
> > that it's always in place and we don't get erroneous messages about RCS
> > engine workarounds failing to apply.
> > 
> 
> Did you check with the HW team about this behavior? AFAIU from reading the
> specs, the 0xfdc register should be nowhere close the render domain and
> therefore it should not be impacted by RCS reset. If it really is impacted,
> it might indicate that something funny is happening on the HW and there
> might be other register in the area that are also incorrectly losing their
> value, so IMO worth investigating.

The theory is that it might not be the RCS reset itself that causes us
to lose 0xFDC but rather something else that happened between the last
time we applied the steering (e.g., at init) and when we do the reset.

We've asked the hardware guys about this register not sticking, but
really the only thing we've heard back so far is "try checking the
steering."  Presumably steering is good at startup since the workaround
sticks at that point, but then gets changed or cleared later by the time
we do the reset.  Unfortunately none of our pre-merge CI machines or
local machines exhibit this problem, only the CI "re-run" machine, so we
don't have a good way to investigate this in a more direct manner.

Would re-applying the steering in the engine workaround list cause any
problems?  It seems like in the worst case it would do nothing at all
and the problem would persist, but at least it would protect us in case
some other unknown event causes 0xfdc to be reset, and the eventual
results we get from CI re-runs would let us know if we're on the right
track or not.

Matt

> 
> Daniele
> 
> > References: https://gitlab.freedesktop.org/drm/intel/issues/1222
> > Cc: Tvrtko Ursulin <tvrtko.ursulin at linux.intel.com>
> > Cc: chris at chris-wilson.co.uk
> > Signed-off-by: Matt Roper <matthew.d.roper at intel.com>
> > ---
> >   drivers/gpu/drm/i915/gt/intel_workarounds.c | 3 +++
> >   1 file changed, 3 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c
> > index 4a255de13394..b11b83546696 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
> > @@ -1345,6 +1345,9 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal)
> >   {
> >   	struct drm_i915_private *i915 = engine->i915;
> > +	if (INTEL_GEN(i915) >= 10)
> > +		wa_init_mcr(i915, wal);
> > +
> >   	if (IS_TGL_REVID(i915, TGL_REVID_A0, TGL_REVID_A0)) {
> >   		/*
> >   		 * Wa_1607138336:tgl
> > 

-- 
Matt Roper
Graphics Software Engineer
VTT-OSGC Platform Enablement
Intel Corporation
(916) 356-2795