[Intel-gfx] [PATCH v2 2/2] drm/i915/bxt: Fix inadvertent CPU snooping due to incorrect MOCS config

Eero Tamminen eero.t.tamminen at intel.com
Fri Apr 29 08:01:41 UTC 2016


Hi,

On 27.04.2016 17:53, Chris Wilson wrote:
> On Wed, Apr 27, 2016 at 04:25:09PM +0300, Eero Tamminen wrote:
[...]
>> Daniel, Chris, did you have some concrete example in mind where 3D
>> driver would require CPU to snoop GPU?
>
> Not mesa, but X can do concurrent rendering to a Pixmap whilst also
> rendering from other parts of that Pixmap into a GPU side buffer and
> presentation/compositing thereof. X uses snooping both ways (from client
> memory to GPU and from GPU to client memory) as well as mixed rendering.

Is that something your "sna/gen9: Quick and dirty implementation" commit
for the X DDX does, and does it expect index #2 to be coherent?
https://cgit.freedesktop.org/xorg/driver/xf86-video-intel/commit/?id=4e172a38e1707465c189c56bdb7ee4bdaf54c9d4

<aside>
While on SKL that commit improves the trivial GpuTest Triangle case by
50% and some more realistic cases by up to ~20%, it regresses many other
cases, by up to 25%.

Martin bisected that a while ago, but I'm not sure whether he's mailed
you about it yet.  We don't know what the difference was on BXT, as we
didn't have HW for testing it.
</aside>


> Mesa should be using snooping for both SubTexImage and GetTexImage. On
> the SubTexImage path you can use the sampler to do format conversions
> that even including the sync overhead for correctness when using client
> memory avoid the awful format conversion code in mesa. Using the GPU to
> write into client memory and avoiding WC reads is approximately an
> order of magnitude (8x) faster than the current code mesa uses.

How did you arrive at the 8x speedup?  Did you calculate it (how?) or do 
you have a test that shows this speedup?

Disabling snooping on BXT increased the GPU read memory bandwidth by 
*>70%* in Imre's tests.


	- Eero


