[Intel-gfx] [PATCH v3] drm/i915: Enable WaProgramMgsrForCorrectSliceSpecificMmioReads for Gen9

Chris Wilson chris at chris-wilson.co.uk
Wed Mar 10 10:54:55 UTC 2021


Quoting Tvrtko Ursulin (2021-03-10 10:19:12)
> 
> Hi,
> 
> On 08/03/2021 17:32, Chiou, Cooper wrote:
> > I've tested on GLK, KBL, CFL Intel NUC devices and got the following performance results, there is no performance regression per my testing.
> > 
> > Patch: [v5] drm/i915: Enable WaProgramMgsrForCorrectSliceSpecificMmioReads for Gen9
> > Test suite: phoronix-test-suite.supertuxkart.1024x768.Fullscreen.Ultimate.1.GranParadisoIsland.frames_per_second
> > Kernel version: 5.12.0-rc1 (drm-tip)
> > 
> > a. Device: Intel NUC kit NUC7JY Gemini Lake Celeron J4005 @2.7GHz (2 Cores)
> >      Without patch, fps=57.45
> >      With patch, fps=57.49
> > b. Device: Intel NUC kit NUC8BEH Coffee Lake Core i3-8109U @3.6GHz(4 Cores)
> >      Without patch, fps=117.23
> >      With patch, fps=117.27
> > c. Device: Intel NUC kit NUC7i3BNH Kaby Lake Core i3-7100U @2.4GHz(4 Cores)
> >      Without patch, fps=114.05
> >      With patch, fps=114.34
> > 
> > Meanwhile, Intel lkp team has validated performance on lkp-kbl-nuc1 and no regression.
> > f69d02e37a85645a  d912096c40cdc3bc9364966971 testcase/testparams/testbox
> > ----------------  -------------------------- ---------------------------
> >            %stddev      change         %stddev
> >                \          |                \
> >        29.79                       29.67
> > phoronix-test-suite/performance-true-Fullscreen-Ultimate-1-Gran_Paradiso_Island__Approxima-supertuxkart-1.5.2-ucode=0xde/lkp-kbl-nuc1
> >        29.79                       29.67        GEO-MEAN phoronix-test-suite.supertuxkart.1280x1024.Fullscreen.Ultimate.1.GranParadisoIsland.frames_per_second
> > 
> 
> CI results are green so that is good.
> 
> Do the machines used for performance testing include unusual fusing?
> The worrying thing is that we were never able to reproduce the reported
> regression in house due to the lack of an identical machine, right?
> Although I guess avoiding hangs trumps performance.

The issue is that if the regression is reproducible it means that the
broadcast mask is no longer correct (or never was, one or the other ;)
and that another w/a is going astray because it depends on the previous
undefined value of the mcr.

Which raises the question as to whether the hang prevention seen here is
also because some other w/a (or other mmio) is not being applied to the
relevant units. Or vice versa.

Either way there remains an underlying issue in that some register
writes for gen9 require the mcr to be set, and we are not handling that
correctly. That changing the mask here changes results elsewhere
indicates that the issues are not fully addressed, and the fear is that
undoing some other mmio is going to introduce other subtle hangs. And we
are all blindly poking at the issue as no one has access to the affected
skus.

What would be useful is if we print the value before changing it so that
we can see if we have any machines in CI where we are making significant
changes to the broadcast mask.
-Chris
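
The suggestion above (log the old value before overwriting it) could be
sketched roughly as follows. This is a standalone illustration, not code
from the patch: mcr_steer_logged() is a hypothetical helper, the bit
positions for the slice/subslice fields are assumptions, and a plain
variable stands in for the real MMIO register and its accessors.

```c
#include <stdint.h>
#include <stdio.h>

/* Assumed layout of the steering fields; illustrative only. */
#define MCR_SLICE_SHIFT     26
#define MCR_SUBSLICE_SHIFT  24
#define MCR_SLICE_MASK      (0x3u << MCR_SLICE_SHIFT)
#define MCR_SUBSLICE_MASK   (0x3u << MCR_SUBSLICE_SHIFT)
#define MCR_STEER_MASK      (MCR_SLICE_MASK | MCR_SUBSLICE_MASK)

/*
 * Hypothetical helper: program new slice/subslice steering into *reg,
 * printing the previous value first so that any change to the steering
 * field is visible in the log.
 *
 * Returns 1 if the steering field changed, 0 if it was already set.
 */
static int mcr_steer_logged(uint32_t *reg, unsigned int slice,
                            unsigned int subslice)
{
        uint32_t old = *reg;
        /* Preserve all non-steering bits, replace only the steering field. */
        uint32_t val = (old & ~MCR_STEER_MASK) |
                       ((slice << MCR_SLICE_SHIFT) & MCR_SLICE_MASK) |
                       ((subslice << MCR_SUBSLICE_SHIFT) & MCR_SUBSLICE_MASK);
        int changed = (old & MCR_STEER_MASK) != (val & MCR_STEER_MASK);

        printf("mcr: old=0x%08x new=0x%08x%s\n",
               (unsigned int)old, (unsigned int)val,
               changed ? " (steering changed)" : "");

        *reg = val;
        return changed;
}
```

In a driver the printf() would be a debug-level log call, and the return
value could be used to flag machines in CI where the steering field is
actually being changed.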

