[Mesa-dev] [PATCH 2/2] i965: add runtime check for SSSE3 rgba8_copy

Timothy Arceri t_arceri at yahoo.com.au
Fri Nov 7 04:03:30 PST 2014


On Thu, 2014-11-06 at 19:30 -0500, Frank Henigman wrote:
> I tested your patch with the "teximage" program in mesa demos, the
> same thing I used to benchmark when I developed this code.
> As Matt and Chad point out, the odd-looking _faster functions are
> there for a reason.  Your change causes a huge slowdown.

Yes, I should have known better than to assume it was leftover code. I
didn't know that gcc could inline memcpy like that; very nice. In fact, I
was reading a blog just last week that said msvc was better than gcc for
memcpy because gcc relied on a library implementation. A good reminder
not to believe everything you read on the internet.
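
For anyone else who hadn't seen the trick, here is a minimal sketch of the
general idea, not the actual i965 code. The names span_copy and
span_copy_faster are made up for illustration. The wrapper calls the same
copy routine with a literal size, so that after inlining the compiler sees
a memcpy of constant length and can typically expand it in place rather
than calling the library routine (whether it does depends on the compiler
and optimization flags).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Generic copy: the size is only known at run time. */
static inline void *span_copy(void *dst, const void *src, size_t bytes)
{
    assert(dst != NULL && src != NULL);
    return memcpy(dst, src, bytes);
}

/*
 * Hypothetical "_faster" wrapper. It looks redundant, but passing the
 * literal 64 lets the compiler specialize the inlined memcpy for a
 * known size. Other sizes fall through to the generic path.
 */
static inline void *span_copy_faster(void *dst, const void *src, size_t bytes)
{
    if (bytes == 64)
        return span_copy(dst, src, 64);

    return span_copy(dst, src, bytes);
}
```

Both branches produce the same result; only the code the compiler can
generate for the constant-size branch differs.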

Anyway, I've had another go at it and the performance regression should
be fixed. In my testing I couldn't spot any real difference. The main
downside is that the ssse3 code can't be inlined, so there will be a
small trade-off compared to the current way of building with ssse3
enabled.
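
To illustrate why the runtime check rules out inlining, here is a rough
sketch of runtime dispatch. It is not the patch itself: the function names
are hypothetical, and it assumes the GCC/Clang builtins
__builtin_cpu_init() and __builtin_cpu_supports() plus the target
attribute, rather than whatever CPU detection Mesa uses. The SSSE3 variant
is compiled separately and reached through a function pointer, so the
caller cannot inline it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#include <tmmintrin.h>
#define HAVE_X86_DISPATCH 1
#endif

typedef void (*swizzle_fn)(uint8_t *dst, const uint8_t *src, size_t bytes);

/* Portable fallback: swap bytes 0 and 2 of every 4-byte pixel. */
static void swizzle_c(uint8_t *dst, const uint8_t *src, size_t bytes)
{
    assert(bytes % 4 == 0);
    for (size_t i = 0; i < bytes; i += 4) {
        dst[i + 0] = src[i + 2];
        dst[i + 1] = src[i + 1];
        dst[i + 2] = src[i + 0];
        dst[i + 3] = src[i + 3];
    }
}

#ifdef HAVE_X86_DISPATCH
/* Only this function is built with SSSE3 enabled. */
__attribute__((target("ssse3")))
static void swizzle_ssse3(uint8_t *dst, const uint8_t *src, size_t bytes)
{
    /* Shuffle mask: output byte i takes input byte mask[i]. */
    const __m128i mask = _mm_set_epi8(15, 12, 13, 14, 11, 8, 9, 10,
                                      7, 4, 5, 6, 3, 0, 1, 2);
    size_t i = 0;

    assert(bytes % 4 == 0);
    for (; i + 16 <= bytes; i += 16) {
        __m128i px = _mm_loadu_si128((const __m128i *)(src + i));
        _mm_storeu_si128((__m128i *)(dst + i), _mm_shuffle_epi8(px, mask));
    }
    /* Leftover pixels (fewer than four) go through the C path. */
    swizzle_c(dst + i, src + i, bytes - i);
}
#endif

/* Pick an implementation once, based on what the running CPU supports. */
swizzle_fn pick_swizzle(void)
{
#ifdef HAVE_X86_DISPATCH
    __builtin_cpu_init();
    if (__builtin_cpu_supports("ssse3"))
        return swizzle_ssse3;
#endif
    return swizzle_c;
}
```

A caller would store the result of pick_swizzle() once and call through
it, which is where the small per-call cost comes from.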

Also, thanks for pointing out "teximage"; I didn't know the mesa demos
contained perf tools.

> I tested on a sandybridge system with a "Intel(R) Celeron(R) CPU 857 @
> 1.20GHz."  Mesa compiled with -O2.
> 
> original code:
>   TexSubImage(RGBA/ubyte 256 x 256): 9660.4 images/sec, 2415.1 MB/sec
>   TexSubImage(RGBA/ubyte 1024 x 1024): 821.2 images/sec, 3284.7 MB/sec
>   TexSubImage(RGBA/ubyte 4096 x 4096): 76.3 images/sec, 4884.9 MB/sec
> 
>   TexSubImage(BGRA/ubyte 256 x 256): 11307.1 images/sec, 2826.8 MB/sec
>   TexSubImage(BGRA/ubyte 1024 x 1024): 944.6 images/sec, 3778.6 MB/sec
>   TexSubImage(BGRA/ubyte 4096 x 4096): 76.7 images/sec, 4908.3 MB/sec
> 
>   TexSubImage(L/ubyte 256 x 256): 17847.5 images/sec, 1115.5 MB/sec
>   TexSubImage(L/ubyte 1024 x 1024): 3068.2 images/sec, 3068.2 MB/sec
>   TexSubImage(L/ubyte 4096 x 4096): 224.6 images/sec, 3593.0 MB/sec
> 
> your code:
>   TexSubImage(RGBA/ubyte 256 x 256): 3271.6 images/sec, 817.9 MB/sec
>   TexSubImage(RGBA/ubyte 1024 x 1024): 232.3 images/sec, 929.2 MB/sec
>   TexSubImage(RGBA/ubyte 4096 x 4096): 47.5 images/sec, 3038.6 MB/sec
> 
>   TexSubImage(BGRA/ubyte 256 x 256): 2426.5 images/sec, 606.6 MB/sec
>   TexSubImage(BGRA/ubyte 1024 x 1024): 164.1 images/sec, 656.4 MB/sec
>   TexSubImage(BGRA/ubyte 4096 x 4096): 13.4 images/sec, 854.8 MB/sec
> 
>   TexSubImage(L/ubyte 256 x 256): 9514.5 images/sec, 594.7 MB/sec
>   TexSubImage(L/ubyte 1024 x 1024): 864.1 images/sec, 864.1 MB/sec
>   TexSubImage(L/ubyte 4096 x 4096): 59.7 images/sec, 955.2 MB/sec
> 
> This is just one run, not an average, but you can see it's slower
> across the board up to a factor of around 6.
> Also I couldn't configure the build after your patch.  I think you
> left out a change to configure.ac to define SSSE3_SUPPORTED.
> 
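
As a sanity check on the quoted figures, the MB/sec column and the
slowdown factor can be recomputed from the images/sec column. This is a
small sketch with made-up helper names; it assumes 4 bytes per pixel for
RGBA/BGRA, 1 byte per pixel for L, and 1 MB = 1024 * 1024 bytes, which is
what the quoted numbers are consistent with.

```c
#include <assert.h>

/* Throughput in MB/sec for a given upload rate and image size. */
double mb_per_sec(double images_per_sec, unsigned width, unsigned height,
                  unsigned bytes_per_pixel)
{
    double bytes = (double)width * (double)height * (double)bytes_per_pixel;

    assert(images_per_sec >= 0.0);
    return images_per_sec * bytes / (1024.0 * 1024.0);
}

/* How many times slower "after" is than "before" (both in images/sec). */
double slowdown(double before, double after)
{
    assert(after > 0.0);
    return before / after;
}
```

For example, mb_per_sec(9660.4, 256, 256, 4) comes out at about 2415.1,
matching the first RGBA row, and slowdown(944.6, 164.1) for the BGRA
1024 x 1024 rows is roughly 5.8, in line with "a factor of around 6".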



