[Intel-gfx] [PATCH] drm/i915: GFDT support for SNB/IVB

Ville Syrjälä ville.syrjala at linux.intel.com
Thu Apr 4 13:27:12 CEST 2013


On Thu, Apr 04, 2013 at 11:01:02AM +0200, Daniel Vetter wrote:
> On Wed, Mar 06, 2013 at 04:28:09PM +0000, Chris Wilson wrote:
> > From: Ville Syrjälä <ville.syrjala at linux.intel.com>
> > 
> > Currently all scanout buffers must be uncached because the
> > display controller doesn't snoop the LLC. SNB introduced another
> > method to guarantee coherency for the display controller. It's
> > called the GFDT or graphics data type.
> > 
> > Pages that have the GFDT bit enabled in their PTEs get flushed
> > all the way to memory when a MI_FLUSH_DW or PIPE_CONTROL is
> > issued with the "synchronize GFDT" bit set.
> > 
> > So rather than making all scanout buffers uncached, set the GFDT
> > bit in their PTEs, and modify the ring flush functions to enable
> > the "synchronize GFDT" bit.
> > 
> > On HSW the GFDT bit was removed from the PTE, and it's only present in
> > surface state, so we can't really set it from the kernel. Also the docs
> > state that the hardware isn't actually guaranteed to respect the GFDT
> > bit. So it looks like GFDT isn't all that useful on HSW.
> > 
> > So far I've tried this very quickly on an IVB machine, and
> > it seems to be working as advertised. No idea if it does any
> > good though.
> > 
> > On an i5-2520m (laptop) running gnome-shell at 1366x768:
> >   padman 		140.78 -> 145.98 fps
> >   openarena 		183.72 -> 186.87 fps
> >   gtkperf ComboBoxEntry	20.27 -> 22.14s
> >   gtkperf pixbuf	 1.12 ->  1.47s
> >   x11perf -aa10text	13.40 -> 13.20 Mglyphs
> > which are well within the throttling noise.
> > 
> > v2 [ickle]: adapt to comply with existing userspace guarantees
> > 
> > Signed-off-by: Ville Syrjälä <ville.syrjala at linux.intel.com>
> > Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> 
> Oops, this one here somehow fell a bit through the cracks.
> 
> Two bikesheds and one real issue:
> - s/bool gfdt/unsigned caching_flags/ for the gtt pte punchers. I expect
>   more fun to come here ;-)
> - ring->flush has a pile of arguments, I guess we should coalesce
>   invalidate/flush and the new internal into a simple flags array. I don't
>   expect that we'll every switch invalidate/flush back to domain arrays
>   from the current "it's just a bool, really" usage.
> 
> Also, the above should be done in separate prep patches imo.
> 
> The real deal is flushing cpu access, since that probably does not set the
> gfdt flag. So cpu reads are fine and don't require any flushing, but cpu
> writes must be clflushed I think. In a way gfdt works as if the gpu is
> doing write-through caching (safe that we have to manually initiate the
> write-through with the gfdt flush). But from the pov of cpu access it's
> the same, and hence requires the same amount of clflushing.
> 
> Hence I'm leaning towards adding a new I915_CACHING_WT cache_level. Ofc
> for gfdt we still need to keep track of the gfdt_dirty state, but all the
> cpu side flushing business should be much clearer.
> 
> Stumbled over this while checking whether sw_finish_ioctl would still do
> the right thing, and noticed that the clflush is a noop when gfdt is
> treated like the current CACHE_LLC.

My original patch had a change to clflushing where pinned objects w/ gfdt
were also flushed. I didn't really read v2 well enough to figure out why
that got dropped.

-- 
Ville Syrjälä
Intel OTC



More information about the Intel-gfx mailing list