[Intel-gfx] [PATCH v5] drm/i915: Enable WaProgramMgsrForCorrectSliceSpecificMmioReads for Gen9

Joonas Lahtinen joonas.lahtinen at linux.intel.com
Thu Mar 11 11:11:48 UTC 2021


Quoting Tvrtko Ursulin (2021-03-11 12:45:54)
> 
> On 05/03/2021 12:58, Cooper Chiou wrote:
> > WaProgramMgsrForCorrectSliceSpecificMmioReads applies to Gen9 and
> > resolves a system hang during VP8 hardware encoding on GT1 SKUs for
> > ChromiumOS projects.
> > 
> > Slice-specific MMIO reads are inaccurate, so MGSR needs to be programmed
> > appropriately to get correct reads from these slice-related MMIOs.
> > 
> > It dictates that before any MMIO read into Slice/Subslice specific
> > registers, MCR packet control register (0xFDC) needs to be programmed
> > to point to any enabled slice/subslice pair, especially GT1 fused sku
> > since this issue can be reproduced on VP8 hardware encoding via ffmpeg
> > on ChromiumOS devices.
> > On exit from PC7, MGSR is reset, so we have to skip the fused subslice ID.
> > 
> > Reference: HSD#1508045018,1405586840, BSID#0575
> > 
> > Cc: Ville Syrjälä <ville.syrjala at linux.intel.com>
> > Cc: Rodrigo Vivi <rodrigo.vivi at intel.com>
> > Cc: Jani Nikula <jani.nikula at intel.com>
> > Cc: Chris Wilson <chris at chris-wilson.co.uk>
> > Cc: Tvrtko Ursulin <tvrtko.ursulin at linux.intel.com>
> > Cc: William Tseng <william.tseng at intel.com>
> > Cc: Lee Shawn C <shawn.c.lee at intel.com>
> > 
> > Signed-off-by: Cooper Chiou <cooper.chiou at intel.com>
> > ---
> >   drivers/gpu/drm/i915/gt/intel_workarounds.c | 37 +++++++++++++++++++++
> >   1 file changed, 37 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c
> > index 3b4a7da60f0b..eb2a587b06b8 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
> > @@ -878,9 +878,46 @@ hsw_gt_workarounds_init(struct drm_i915_private *i915, struct i915_wa_list *wal)
> >       wa_write_clr(wal, GEN7_FF_THREAD_MODE, GEN7_FF_VS_REF_CNT_FFME);
> >   }
> >   
> > +static void
> > +gen9_wa_init_mcr(struct drm_i915_private *i915, struct i915_wa_list *wal)
> > +{
> > +     const struct sseu_dev_info *sseu = &i915->gt.info.sseu;
> > +     unsigned int slice, subslice;
> > +     u32 mcr, mcr_mask;
> > +
> > +     GEM_BUG_ON(INTEL_GEN(i915) < 9);
> > +
> > +     /*
> > +      * WaProgramMgsrForCorrectSliceSpecificMmioReads:glk,kbl,cml
> > +      * Before any MMIO read into slice/subslice specific registers, MCR
> > +      * packet control register needs to be programmed to point to any
> > +      * enabled s/ss pair. Otherwise, incorrect values will be returned.
> > +      * This means each subsequent MMIO read will be forwarded to a
> > +      * specific s/ss combination, but this is OK since these registers
> > +      * are consistent across s/ss in almost all cases. In the rare
> > +      * occasions, such as INSTDONE, where this value is dependent
> > +      * on s/ss combo, the read should be done with read_subslice_reg.
> > +      */
> > +     slice = ffs(sseu->slice_mask) - 1;
> > +     GEM_BUG_ON(slice >= ARRAY_SIZE(sseu->subslice_mask));
> > +     subslice = ffs(intel_sseu_get_subslices(sseu, slice));
> > +     GEM_BUG_ON(!subslice);
> > +     subslice--;
> > +
> > +     mcr = GEN8_MCR_SLICE(slice) | GEN8_MCR_SUBSLICE(subslice);
> > +     mcr_mask = GEN8_MCR_SLICE_MASK | GEN8_MCR_SUBSLICE_MASK;
> > +
> > +     drm_dbg(&i915->drm, "MCR slice:%d/subslice:%d = %x\n", slice, subslice, mcr);
> > +
> > +     wa_write_clr_set(wal, GEN8_MCR_SELECTOR, mcr_mask, mcr);
> > +}
> > +
> >   static void
> >   gen9_gt_workarounds_init(struct drm_i915_private *i915, struct i915_wa_list *wal)
> >   {
> > +     /* WaProgramMgsrForCorrectSliceSpecificMmioReads:glk,kbl,cml,gen9 */
> > +     gen9_wa_init_mcr(i915, wal);
> > +
> >       /* WaDisableKillLogic:bxt,skl,kbl */
> >       if (!IS_COFFEELAKE(i915) && !IS_COMETLAKE(i915))
> >               wa_write_or(wal,
> > 
> 
> 1)
> Patch mechanics are fine.
> 
> 2)
> We have confirmation from the HW folks this actually needs doing on Gen9 
> even if docs fail to mention it.
> 
> So even if the immediate fix is for VP8 encode, which is not fully open, 
> this is the right thing to do in general and would have been done if the 
> WA was properly documented from the start.
> 
> 3)
> 3d performance regression cannot be reproduced on the machine where it 
> was originally reported. (Or on other machines.)
> 
> So:
> 
> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> 
> + Joonas for ack to merge due to the second point above.

If this does not affect any fully Open Source userspace, this needs to be
carried downstream in the Chrome OS kernel tree.

Gen9 has been out there without this W/A for a long time. There is always
potential for changing existing deployments' behaviour for the worse when
adding W/As. If it had been implemented from the very beginning, then it
would have undergone all the testing not to interfere with existing
workloads. Merging it after the fact makes the risk much higher.

Merging a W/A that can cause regressions but gains nothing from the
upstream driver's perspective is an unnecessary risk.

So in short, unless the VP8 encoding can be Open Sourced or the lack of the
W/A otherwise impacts the fully open stack, there is no path forward to merge
this due to the DRM subsystem userspace requirement rules:

https://www.kernel.org/doc/html/latest/gpu/drm-uapi.html#open-source-userspace-requirements

Regards, Joonas

> 
> Regards,
> 
> Tvrtko
> 
> P.S. Many thanks for patiently dealing with requests to test on many 
> platforms.
> 
> P.P.S. Sadly we are still not able to explain the whole details around 
> 0xfdc behaviour on Gen9 vs Gen11.
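
For readers following the thread, the slice/subslice selection in the quoted
patch can be sketched as standalone userspace C. This is only an illustration,
not the driver code: pick_mcr() and apply_clr_set() are hypothetical helpers,
and the MCR_* bit positions are assumed to mirror the GEN8_MCR_* macros (check
i915_reg.h for the real definitions). The idea is to take the lowest enabled
slice, then the lowest enabled subslice within it, so that a fused-off subslice
is never selected.

```c
#include <stddef.h>
#include <stdint.h>
#include <strings.h>	/* ffs() */

/* Assumed field layout, for illustration only. */
#define MCR_SLICE(s)		(((uint32_t)(s) & 3u) << 26)
#define MCR_SLICE_MASK		MCR_SLICE(3)
#define MCR_SUBSLICE(ss)	(((uint32_t)(ss) & 3u) << 24)
#define MCR_SUBSLICE_MASK	MCR_SUBSLICE(3)

/*
 * Hypothetical helper: compute the steering value for the first enabled
 * slice/subslice pair. Returns 0 on success, -1 if nothing is enabled.
 */
static int pick_mcr(uint8_t slice_mask, const uint8_t *subslice_masks,
		    size_t n_slices, uint32_t *mcr)
{
	int slice, subslice;

	/* ffs() is 1-based and returns 0 for an empty mask. */
	slice = ffs(slice_mask) - 1;
	if (slice < 0 || (size_t)slice >= n_slices)
		return -1;

	subslice = ffs(subslice_masks[slice]) - 1;
	if (subslice < 0)
		return -1;

	*mcr = MCR_SLICE(slice) | MCR_SUBSLICE(subslice);
	return 0;
}

/* Hypothetical read-modify-write: clear the masked bits, then set new ones. */
static uint32_t apply_clr_set(uint32_t old, uint32_t clr, uint32_t set)
{
	return (old & ~clr) | set;
}
```

With a slice mask of 0x1 and a subslice mask of 0x6 (subslice 0 fused off),
pick_mcr() selects slice 0, subslice 1, which is the fused GT1 case the commit
message is concerned with.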

