[Intel-gfx] [PATCH] drm/i915: Disable caches for Global GTT.

Ville Syrjälä ville.syrjala at linux.intel.com
Thu Nov 6 21:53:00 CET 2014


On Thu, Nov 06, 2014 at 09:09:52PM +0200, Ville Syrjälä wrote:
> On Thu, Nov 06, 2014 at 10:55:12AM +0000, Chris Wilson wrote:
> > On Wed, Nov 05, 2014 at 12:11:41PM +0100, Daniel Vetter wrote:
> > > On Thu, Oct 30, 2014 at 5:18 PM, Rodrigo Vivi <rodrigo.vivi at intel.com> wrote:
> > > > Global GTT doesn't have pat_sel[2:0] so it always point to pat_sel = 000;
> > > > So the only way to avoid screen corruptions is setting PAT 0 to Uncached.
> > > >
> > > > MOCS can still be used though. But if userspace is trusting PTE for
> > > > cache selection the safest thing to do is to let caches disabled.
> > > >
> > > > BSpec: "For GGTT, there is NO pat_sel[2:0] from the entry,
> > > > so RTL will always use the value corresponding to pat_sel = 000"
> > > >
> > > > Reference: https://bugs.freedesktop.org/show_bug.cgi?id=85576
> > > > Cc: James Ausmus <james.ausmus at intel.com>
> > > > Signed-off-by: Rodrigo Vivi <rodrigo.vivi at intel.com>
> > > 
> > > Ok, I think this is -fixes + cc: stable material, but we need to be a
> > > bit more elaborate on the commit message:
> > > 
> > > - System agent ggtt writes (i.e. cpu gtt mmaps) already work before
> > > this patch, i.e. the same uncached + snooping access like on gen6/7
> > > seems to be in effect.
> > > - So this just fixes blitter/render access. Again it looks like it's
> > > not just uncached access, but uncached + snooping. So we can still
> > > hold onto all our assumptions wrt cpu clflushing on LLC machines.
> > > 
> > > I think this should be both in the commit message and code.
> > > 
> > > Chris, please correct if I've summarized this wrongly.
> > 
> > Me, I just get confused by the statement that writes through the System
> > Agent to main memory are snooped.
> > 
> > But the statement that GTT + WC is coherent with the display as it was
> > on SNB+ is certainly true.
> 
> I have a supicious mind so I simply had to go and verify this on my
> IVB. Writes to a LLC cacheable scanout buffer through a GTT mmap did
> not result in any cache dirt on the screen. So I must admit defeat.
> 
> I also did another experiment where I wrote the data through a cpu
> mmap, which obviously resulted in plenty of cache dirt. But I then
> followed it with a read through the GTT mmap (obj still mapped as
> cacheable in GTT), and the data matched what I wrote, and it also
> cleared the cache dirt from the screen. So any CPU GTT access will
> snoop the caches and also causes a writeback of the cacheline. I
> suspect it also invalidates the cacheline, but I didn't verify that
> in any way.
> 
> So I guess I can rest easier now having witnessed it all with mine
> own eyes :)

I also repeated the same tests on CHV, and it behaves in a manner
consistent with its non-LLC heritage ie. CPU GTT access never snoops
no matter what the PTE says.

I also frobbed with the PAT entries a bit, and that at least confirms
that status page writes go through PAT[0] even though we set up the
PTEs to use PAT[4]. When PTE[0] doesn't have the snoop bit, all we get
is a GPU "hang", and with the snoop bit set things work fine.

So I think that for CHV the ppgtt=0 case is perfectly fine with the
current code because:
1) CPU GTT access is fine as it never snoops and we expect it not to,
   in fact we hand a SIGBUS to anyone who tries this
2) GPU access to snooped bo will work since PAT[0] says to snoop
3) GPU access to non-snooped bo will work even if PAT[0] says to snoop,
   but will perhaps be a bit slow due to the extra snoops. Should be
   able to override with MOCS though and get the speed back.
4) Status page must be snooped, so things pretty much fall apart when
   PAT[0] doesn't have the snoop bit. Well, unless we decide to add
   explicit cache flushes to status page reads

I'm not entirely sure if I should be coserned about the color_adjust
stuff, especially in the case where MOCS is used. We might end up
mixing snooped and non-snooped accesses in a way that the color
adjustment wasn't able to anticipate. I guess I could just try to
see if I can anger the hardware doing that. But that's a job or
another day.

-- 
Ville Syrjälä
Intel OTC



More information about the Intel-gfx mailing list