[Intel-gfx] performance movnti is better than clflush ?

Eric Anholt eric at anholt.net
Fri Jul 24 23:43:36 CEST 2009


On Fri, 2009-07-24 at 19:20 +0800, Ma, Ling wrote:
> Hi All
> 
> I find that movnti + mfence is faster than clflush, as the report below
> shows (on a Core 2 platform):
> 
> Size (bytes)   movnti (us)   clflush (us)   speedup
> 4k                    3.01           3.56     1.182
> 16k                  12.01          14.23     1.184
> 32k                  23.93          28.45     1.188
> 64k                  47.92          56.89     1.187
> 
> The code for the two cases (caring only about alignment):
>
> movnti + mfence:
>
>   for (i = 0; i < size; i = i + 64) {
>       __asm__("movq (addr + i), %rax");
>       __asm__("movntiq %rax, (addr + i)");
>   }
>   __asm__("mfence");
>
> clflush:
>
>   for (i = 0; i < size; i = i + 64)
>       clflush(addr + i);
> 
> Movnti invalidates the cache line before writing the data into the
> write-combining buffer; a final mfence then drains whatever is left in
> the write-combining buffer, so the behavior looks like clflush.
>
> The approach only suits small sizes: once the size exceeds about 128k
> (on my platform), movnti + mfence gets worse because of the read
> instruction.
>
> If the theory is right, many flush operations in GEM could benefit.
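
For reference, the two quoted loops can be sketched as compilable C
using SSE2 intrinsics instead of raw inline assembly. This is only a
sketch under assumptions that are not in the original mail: an x86-64
compiler that provides <emmintrin.h>, a 64-byte cache line, a
64-byte-aligned buffer, and the hypothetical function names
flush_range_movnti and flush_range_clflush. Whether the non-temporal
store really evicts the line the way clflush does is the claim under
discussion, not something this code demonstrates.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <emmintrin.h>  /* SSE2: _mm_stream_si64, _mm_clflush, _mm_mfence */

#define CACHE_LINE 64   /* assumed line size, matching the quoted stride */

/* Quoted "movnti + mfence" variant: rewrite the first 8 bytes of every
 * cache line with a non-temporal store, then fence once at the end.
 * The stored value is the one just loaded, so the data is unchanged. */
void flush_range_movnti(void *addr, size_t size)
{
    char *p = addr;

    assert((uintptr_t)addr % CACHE_LINE == 0);  /* assumption: aligned */
    for (size_t i = 0; i < size; i += CACHE_LINE) {
        long long *q = (long long *)(p + i);
        long long v = *q;           /* movq    (addr + i), %rax */
        _mm_stream_si64(q, v);      /* movntiq %rax, (addr + i) */
    }
    _mm_mfence();                   /* drain the write-combining buffers */
}

/* Quoted clflush variant: one clflush per cache line. The quoted loop
 * shows no fence after it, so none is added here. */
void flush_range_clflush(void *addr, size_t size)
{
    char *p = addr;

    assert((uintptr_t)addr % CACHE_LINE == 0);  /* assumption: aligned */
    for (size_t i = 0; i < size; i += CACHE_LINE)
        _mm_clflush(p + i);
}
```

Either function would be called as flush_range_movnti(buf, len) on a
buffer whose contents must reach memory; both leave the data itself
unchanged.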

It's been a while since I've seen a profile where clflush is
significant, and even in the places where it's taking 5% of the CPU we
should just be using GTT maps instead.  I'd recommend starting any
performance effort from an application you want to optimize, then
drilling down to where the bottlenecks are.  And use an actual
application for measurement of performance impact -- particularly with
these sorts of cache issues, measuring whole-application performance is
critical for producing meaningful numbers.

-- 
Eric Anholt
eric at anholt.net                         eric.anholt at intel.com

