[Intel-gfx] [PATCH v2 2/2] drm/i915/bxt: Fix inadvertent CPU snooping due to incorrect MOCS config

Ville Syrjälä ville.syrjala at linux.intel.com
Thu Apr 28 17:21:28 UTC 2016


On Thu, Apr 28, 2016 at 04:44:00PM +0200, Daniel Vetter wrote:
> On Thu, Apr 28, 2016 at 01:48:27PM +0300, Ville Syrjälä wrote:
> > On Thu, Apr 28, 2016 at 10:13:37AM +0200, Daniel Vetter wrote:
> > > On Tue, Apr 26, 2016 at 08:57:51PM +0300, Ville Syrjälä wrote:
> > > > On Tue, Apr 26, 2016 at 04:30:05PM +0200, Daniel Vetter wrote:
> > > > > On Tue, Apr 26, 2016 at 05:26:43PM +0300, Eero Tamminen wrote:
> > > > > > Hi,
> > > > > > 
> > > > > > On 26.04.2016 16:23, Chris Wilson wrote:
> > > > > > >On Tue, Apr 26, 2016 at 04:17:55PM +0300, Imre Deak wrote:
> > > > > > >>On ti, 2016-04-26 at 13:57 +0100, Chris Wilson wrote:
> > > > > > >>>On Tue, Apr 26, 2016 at 03:44:22PM +0300, Imre Deak wrote:
> > > > > > >>>>Setting a write-back cache policy in the MOCS entry definition also
> > > > > > >>>>implies snooping, which has a considerable overhead. This is
> > > > > > >>>>unexpected for a few reasons:
> > > > > > >>>
> > > > > > >>>If it is snooping, then I don't see why it is undesirable to have it
> > > > > > >>>available in a mocs setting. If it is bogus and the bit is undefined,
> > > > > > >>>then by all means remove it.
> > > > > > >>
> > > > > > >>None of these entries are used alone for coherent surfaces. For that
> > > > > > >>the application would have to use entry index#1 or #2 _and_ call the
> > > > > > >>set caching IOCTL to set the corresponding buffer to be cached.
> > > > > > >
> > > > > > >No, the application doesn't. There are sufficent interfaces exposed that
> > > > > > >userspace can bypass the kernel if it so desired.
> > > > > > >
> > > > > > >>The
> > > > > > >>problem is that without setting the buffer to be cacheable the
> > > > > > >>expectation is that we won't be snooping and incur the corresponding
> > > > > > >>overhead. This is what this patch addresses.
> > > > > > >
> > > > > > >Not true.
> > > > > > >
> > > > > > >>The bit is also bogus, if we wanted snooping via MOCS we'd use the
> > > > > > >>dedicated HW flag for that.
> > > > > > >
> > > > > > >But you keep saying this bit *enables* snooping. So either it does or it
> > > > > > >doesn't.
> > > > > > >
> > > > > > >>If we wanted to have a snooping MOCS entry we should add that
> > > > > > >>separately (as a forth entry), but we'd need this change as a fix for
> > > > > > >>current users.
> > > > > > >
> > > > > > >The current users who are getting what they request but don't know what
> > > > > > >they were requesting?
> > > > > > 
> > > > > > What this kernel ABI (index entry #2) has been agreed & documented to
> > > > > > provide?
> > > > > > 
> > > > > > I thought this entry is supposed to replace the writeback LLC/eLLC cache
> > > > > > MOCS setting Mesa is using on (e.g. BDW) to speed up accesses to a memory
> > > > > > area which it knows always to be accessed so that it can be cached.
> > > > > > 
> > > > > > If app runs on HW where LLC/eLLC is missing, giving the app extra slowdown
> > > > > > instead of potential speedup sounds like failed HW abstraction. :-)
> > > > > 
> > > > > Well mesa needs to know llc vs. !llc anyway to not totally suck, and
> > > > > defining entry #2 as "coherent, always" makes sense. I thought entry 0 was
> > > > > the reaonable default aka pte passthrough and hence managed by kernel?
> > > > 
> > > > Nope, we fscked that up somewhat, and entry 1 is the PTE one :( So if
> > > > userspace forgets to set MOCS on gen9 it won't get the behaviour it
> > > > would have gotten on previous gens.
> > > 
> > > How do we manage to pass the various set_caching vs. rendercpy test then
> > > on skl? Or do we just not, and everyone is still lalala?
> > 
> > Why wouldn't it pass? It's all coherent anyway. Although apparently the
> > docs have some confusing comments that things wouldn't be coherent unless
> > you enable LLC caching, but maybe that's just about coherent L3 or
> > something?
> 
> We have at least one that uses cache control by the kernel and makes sure
> (using crcs) that the stuff actually lands. If we accidentally do coherent
> access instead of wc cache dirt might end up in cpu caches. At least it
> seems pretty fishy.

I don't recall a test dealing with rendercopy + set_caching + display.

> 
> > > Sounds like something to fix either way.
> > > 
> > > > > If mesa asks for nonsense, the kernel is happy to oblige.
> > > > 
> > > > We never really defined what entry 2 actually means: coherent or sane
> > > > performance. Mesa has, rightfully IMO, made the assumption that it means
> > > > the latter since we never set out to define any MOCS entries with
> > > > coherency in mind. And seeing that it's already out in the wild, I think
> > > > it's better to respect it. If we change it now we would just make it
> > > > more painful for people when they get their hands on the hardware.
> > > > 
> > > > I think what we should do is define what the MOCS indexes mean in some
> > > > uapi header. Then there would be no ambiguity.
> > > 
> > > Problem with encoding this is that sooner or later we're playing a game of
> > > whack-a-mole where kernel second-guesses mesa to support old versions, and
> > > mesa tries to work around kernel assumptions that no longer fit reworked
> > > code. "It's the easiest solution for existing binaries" is imo very poor
> > > justification for ABI, especially when it's just about performance.
> > > Fundamentally mesa still works correctly, just a bit slow.
> > > 
> > > I don't want to merge a random hack just because.
> > 
> > I don't see it as any kind of hack. That's the ABI we agreed on pretty
> > much in my opinion. No one had any concerns about coherency when the
> > current MOCS entries were allocated IIRC.
> 
> Hm, not being concerned about coherency when discussing MOCS sounds like a
> pretty big review oversight.

Perhaps. Actually I can't really remeber if such arguments were raised or
not. But at least I don't remember anything of the sort. My impressions is
that the whole thing dragged on for so long that everyone was just happy
to get something usable in.

Anyways, I think it's silly to keep arguing about this since we have no
actual user for any coherent MOCS entries.

-- 
Ville Syrjälä
Intel OTC


More information about the Intel-gfx mailing list