[Intel-gfx] [PATCH] drm/i915: GFDT support for SNB/IVB
Daniel Vetter
daniel at ffwll.ch
Thu Apr 4 11:01:02 CEST 2013
On Wed, Mar 06, 2013 at 04:28:09PM +0000, Chris Wilson wrote:
> From: Ville Syrjälä <ville.syrjala at linux.intel.com>
>
> Currently all scanout buffers must be uncached because the
> display controller doesn't snoop the LLC. SNB introduced another
> method to guarantee coherency for the display controller. It's
> called the GFDT or graphics data type.
>
> Pages that have the GFDT bit enabled in their PTEs get flushed
> all the way to memory when a MI_FLUSH_DW or PIPE_CONTROL is
> issued with the "synchronize GFDT" bit set.
>
> So rather than making all scanout buffers uncached, set the GFDT
> bit in their PTEs, and modify the ring flush functions to enable
> the "synchronize GFDT" bit.
>
> On HSW the GFDT bit was removed from the PTE, and it's only present in
> surface state, so we can't really set it from the kernel. Also the docs
> state that the hardware isn't actually guaranteed to respect the GFDT
> bit. So it looks like GFDT isn't all that useful on HSW.
>
> So far I've tried this very quickly on an IVB machine, and
> it seems to be working as advertised. No idea if it does any
> good though.
>
> On an i5-2520m (laptop) running gnome-shell at 1366x768:
> padman 140.78 -> 145.98 fps
> openarena 183.72 -> 186.87 fps
> gtkperf ComboBoxEntry 20.27 -> 22.14s
> gtkperf pixbuf 1.12 -> 1.47s
> x11perf -aa10text 13.40 -> 13.20 Mglyphs
> which are well within the throttling noise.
>
> v2 [ickle]: adapt to comply with existing userspace guarantees
>
> Signed-off-by: Ville Syrjälä <ville.syrjala at linux.intel.com>
> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
Oops, this one here somehow fell a bit through the cracks.
Two bikesheds and one real issue:
- s/bool gfdt/unsigned caching_flags/ for the gtt pte punchers. I expect
more fun to come here ;-)
- ring->flush has a pile of arguments, I guess we should coalesce
invalidate/flush and the new internal into a simple flags array. I don't
expect that we'll every switch invalidate/flush back to domain arrays
from the current "it's just a bool, really" usage.
Also, the above should be done in separate prep patches imo.
The real deal is flushing cpu access, since that probably does not set the
gfdt flag. So cpu reads are fine and don't require any flushing, but cpu
writes must be clflushed I think. In a way gfdt works as if the gpu is
doing write-through caching (safe that we have to manually initiate the
write-through with the gfdt flush). But from the pov of cpu access it's
the same, and hence requires the same amount of clflushing.
Hence I'm leaning towards adding a new I915_CACHING_WT cache_level. Ofc
for gfdt we still need to keep track of the gfdt_dirty state, but all the
cpu side flushing business should be much clearer.
Stumbled over this while checking whether sw_finish_ioctl would still do
the right thing, and noticed that the clflush is a noop when gfdt is
treated like the current CACHE_LLC.
Cheers, Daniel
PS: Ben Widawsky noticed that on ivb we set a stupid ppgtt pte caching
override, which results in pte cachelines being tagged with gfdt. Hsw has
an equivalent mode to allow the gpu to write through. I have no idea
what's the point of this and we never write ptes from the gpu. So can you
please throw the relevant patch on top to disable gfdt for ppgtt ptes in
ECOCHK?
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
More information about the Intel-gfx
mailing list