[Intel-gfx] [PATCH] drm/i915: Disable caches for Global GTT.

Daniel Vetter daniel at ffwll.ch
Fri Nov 7 10:33:10 CET 2014


On Thu, Nov 06, 2014 at 10:53:00PM +0200, Ville Syrjälä wrote:
> On Thu, Nov 06, 2014 at 09:09:52PM +0200, Ville Syrjälä wrote:
> > On Thu, Nov 06, 2014 at 10:55:12AM +0000, Chris Wilson wrote:
> > > On Wed, Nov 05, 2014 at 12:11:41PM +0100, Daniel Vetter wrote:
> > > > On Thu, Oct 30, 2014 at 5:18 PM, Rodrigo Vivi <rodrigo.vivi at intel.com> wrote:
> > > > > Global GTT doesn't have pat_sel[2:0] so it always point to pat_sel = 000;
> > > > > So the only way to avoid screen corruptions is setting PAT 0 to Uncached.
> > > > >
> > > > > MOCS can still be used though. But if userspace is trusting PTE for
> > > > > cache selection the safest thing to do is to let caches disabled.
> > > > >
> > > > > BSpec: "For GGTT, there is NO pat_sel[2:0] from the entry,
> > > > > so RTL will always use the value corresponding to pat_sel = 000"
> > > > >
> > > > > Reference: https://bugs.freedesktop.org/show_bug.cgi?id=85576
> > > > > Cc: James Ausmus <james.ausmus at intel.com>
> > > > > Signed-off-by: Rodrigo Vivi <rodrigo.vivi at intel.com>
> > > > 
> > > > Ok, I think this is -fixes + cc: stable material, but we need to be a
> > > > bit more elaborate on the commit message:
> > > > 
> > > > - System agent ggtt writes (i.e. cpu gtt mmaps) already work before
> > > > this patch, i.e. the same uncached + snooping access like on gen6/7
> > > > seems to be in effect.
> > > > - So this just fixes blitter/render access. Again it looks like it's
> > > > not just uncached access, but uncached + snooping. So we can still
> > > > hold onto all our assumptions wrt cpu clflushing on LLC machines.
> > > > 
> > > > I think this should be both in the commit message and code.
> > > > 
> > > > Chris, please correct if I've summarized this wrongly.
> > > 
> > > Me, I just get confused by the statement that writes through the System
> > > Agent to main memory are snooped.
> > > 
> > > But the statement that GTT + WC is coherent with the display as it was
> > > on SNB+ is certainly true.
> > 
> > I have a supicious mind so I simply had to go and verify this on my
> > IVB. Writes to a LLC cacheable scanout buffer through a GTT mmap did
> > not result in any cache dirt on the screen. So I must admit defeat.
> > 
> > I also did another experiment where I wrote the data through a cpu
> > mmap, which obviously resulted in plenty of cache dirt. But I then
> > followed it with a read through the GTT mmap (obj still mapped as
> > cacheable in GTT), and the data matched what I wrote, and it also
> > cleared the cache dirt from the screen. So any CPU GTT access will
> > snoop the caches and also causes a writeback of the cacheline. I
> > suspect it also invalidates the cacheline, but I didn't verify that
> > in any way.
> > 
> > So I guess I can rest easier now having witnessed it all with mine
> > own eyes :)
> 
> I also repeated the same tests on CHV, and it behaves in a manner
> consistent with its non-LLC heritage ie. CPU GTT access never snoops
> no matter what the PTE says.
> 
> I also frobbed with the PAT entries a bit, and that at least confirms
> that status page writes go through PAT[0] even though we set up the
> PTEs to use PAT[4]. When PTE[0] doesn't have the snoop bit, all we get
> is a GPU "hang", and with the snoop bit set things work fine.
> 
> So I think that for CHV the ppgtt=0 case is perfectly fine with the
> current code because:
> 1) CPU GTT access is fine as it never snoops and we expect it not to,
>    in fact we hand a SIGBUS to anyone who tries this
> 2) GPU access to snooped bo will work since PAT[0] says to snoop
> 3) GPU access to non-snooped bo will work even if PAT[0] says to snoop,
>    but will perhaps be a bit slow due to the extra snoops. Should be
>    able to override with MOCS though and get the speed back.
> 4) Status page must be snooped, so things pretty much fall apart when
>    PAT[0] doesn't have the snoop bit. Well, unless we decide to add
>    explicit cache flushes to status page reads

I think a patch which adds a comment to the chv pat code stating that
pat[0] must be cached would be great. Otherwise someone will go and
"clean" it up I fear ...

> I'm not entirely sure if I should be coserned about the color_adjust
> stuff, especially in the case where MOCS is used. We might end up
> mixing snooped and non-snooped accesses in a way that the color
> adjustment wasn't able to anticipate. I guess I could just try to
> see if I can anger the hardware doing that. But that's a job or
> another day.

tbh I'm not sure any more whether we still need this on vlv/chv - they did
change the snooping logic a lot. Iirc pre-gen5 only could snoop with the
blitter, vlv/chv can also snoop with the render engine. So maybe this has
improved enough that we don't need the color stuff any more.

I guess a good test would be to disable the color stuff on vlv/chv and
see what happens ...
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch



More information about the Intel-gfx mailing list