[Intel-gfx] [PATCH 2/2] drm/i915/bxt: Fix inadvertent CPU snooping due to incorrect MOCS config

Imre Deak imre.deak at intel.com
Mon Apr 25 13:01:13 UTC 2016


On ma, 2016-04-25 at 13:49 +0100, Chris Wilson wrote:
> On Mon, Apr 25, 2016 at 03:39:54PM +0300, Imre Deak wrote:
> > On ma, 2016-04-25 at 13:37 +0100, Chris Wilson wrote:
> > > On Mon, Apr 25, 2016 at 03:23:21PM +0300, Imre Deak wrote:
> > > > Setting a write-back cache policy in the MOCS entry definition
> > > > also
> > > > implies snooping, which has a considerable overhead. This is
> > > > unexpected for a few reasons:
> > > > - From user-space's point of view since it didn't want a
> > > > coherent
> > > >   surface (it didn't set the buffer as such via the set caching
> > > > IOCTL).
> > > > - There is a separate MOCS entry field for snooping (which we
> > > > never
> > > >   set).
> > > > - This MOCS table is about caching in (e)LLC and there is no
> > > > (e)LLC
> > > > on
> > > >   BXT. There is a separate table for L3 cache control.
> > > > 
> > > > Considering the above the current behavior of snooping looks
> > > > like
> > > > an
> > > > unintentional side-effect of the WB setting. Changing it to be
> > > > PTE
> > > > based cacheability gets rid of the snooping without any ill-
> > > > effects.
> > > > For a coherent surface the application would use a separate
> > > > MOCS
> > > > entry
> > > > (at index 1) and call the set caching IOCTL to setup the PTE
> > > > entries
> > > > for the corresponding buffer to be snooped.
> > > > 
> > > > This resulted in 70% improvement in synthetic texturing
> > > > benchmarks.
> > > > 
> > > > Kudos to Valtteri Rantala, Eero Tamminen and Michael T
> > > > Frederick
> > > > and
> > > > Ville who helped to narrow the source of problem to the kernel
> > > > and
> > > > to
> > > > the snooping behaviour in particular.
> > > > 
> > > > CC: Valtteri Rantala <valtteri.rantala at intel.com>
> > > > CC: Eero Tamminen <eero.t.tamminen at intel.com>
> > > > CC: Michael T Frederick <michael.t.frederick at intel.com>
> > > > CC: Ville Syrjälä <ville.syrjala at linux.intel.com>
> > > > Signed-off-by: Imre Deak <imre.deak at intel.com>
> > > > ---
> > > >  drivers/gpu/drm/i915/intel_mocs.c | 2 +-
> > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > > 
> > > > diff --git a/drivers/gpu/drm/i915/intel_mocs.c
> > > > b/drivers/gpu/drm/i915/intel_mocs.c
> > > > index 5006a92..23c7dd1 100644
> > > > --- a/drivers/gpu/drm/i915/intel_mocs.c
> > > > +++ b/drivers/gpu/drm/i915/intel_mocs.c
> > > > @@ -142,7 +142,7 @@ static const struct drm_i915_mocs_entry
> > > > broxton_mocs_table[] = {
> > > >  	  .l3cc_value =    L3_ESC(0) | L3_SCC(0) |
> > > > L3_CACHEABILITY(L3_WB),
> > > >  	},
> > > >  	{
> > > > -	  .control_value = LE_CACHEABILITY(LE_WB) |
> > > > +	  .control_value = LE_CACHEABILITY(LE_PAGETABLE) |
> > > >  			   LE_TGT_CACHE(LE_TC_LLC_ELLC) |
> > > >  			   LE_LRUM(3) | LE_AOM(0) | LE_RSC(0)
> > > > |
> > > > LE_SCC(0) |
> > > >  			   LE_PFM(0) | LE_SCF(0),
> > > 
> > > This is index 2? This should *be* the snooping entry?
> > > 
> > > Index 0: uncached
> > > Index 1: follow pte
> > > Index 2: snoop
> > > 
> > > Aim I missing something? Why isn't this a userspce bug for
> > > requesting
> > > a
> > > mocs setting it didn't wnat? In my kernel this makes
> > > mocs_table[1] ==
> > > mocs_table[2].
> > 
> > I don't think there is or can be any snooping entry. On CHV for
> > example
> > we can only setup snooping via the PTE, so there we necessarily
> > have to
> > use entry 1 + PTE setup (set caching IOCTL). This is also what both
> > Windows and Mesa assumes apparently.
> 
> Then the issue is not we've enabled snooping via mocs, but that the
> mocs
> entry is completely bogus. Please update your commit message :)

Yes it is bogus, that is what I meant by "incorrect MOCS config" in the
subject. Tbh, the specification is really unclear what these bits
control, it doesn't mention at all for example that the WB setting
would affect snooping behavior. It doesn't either have much sense to me
to set any of these MOCS entry fields other than their default on BXT,
since they all (should) control only e(LLC) cacheability which is
irrelevant on BXT. I will ask for a clarification for this in BSpec.

--Imre


More information about the Intel-gfx mailing list