[Pixman] [ssse3]Optimization for fetch_scanline_x8r8g8b8

Xu, Samuel samuel.xu at intel.com
Fri Sep 3 01:53:47 PDT 2010

Thanks Soren!
Following is my comment. 
Appreciate your explicitly code guide on Sun Studio check, I never has experience on Sun Studio.
After confirming of Sun Studio related, we can send next patch to address rest issues.


> * The 64 bit CPU detection is broken. It doesn't use SSE2 and MMX when
>  SSSE3 is not detected.
Confirmed and will fix in next patch as following :
#endif /* end dt_cpu64() */

#ifdef USE_MMX
#if (!defined(__amd64__) && !defined(__x86_64__) && !defined(_M_AMD64)) 
//32 bit MMX
static pixman_bool_t
pixman_have_mmx (void)
    return mmx_present;
//64 bit MMX
#define pixman_have_mmx() TRUE

#ifdef USE_SSE2
#if (!defined(__amd64__) && !defined(__x86_64__) && !defined(_M_AMD64)) 
//32 bit SSE2
static pixman_bool_t
pixman_have_sse2 (void)
    return sse2_present;
//64 bit SSE2
#define pixman_have_sse2() TRUE

>* Siarhei asked whether it would be possible to unify the 32 and 64
>  bit assembly sources. I don't think you commented on that.
I think it is very difficult to unify 32 and 64 bit assemble src. 

>* We need a more detailed description in the git commit log
Will add more detail in next patch

>* The check for the Sun Studio compiler is wrong. If SSSE3 is only
>  supported in the latest version of Sun Studio, then configure should
>  check for that version.
Could you show us the error information from Sun Studio? I never have experience on Sun Studio. Appreciate your specific code guide on Sun Studio check

>* Will Sun Studio actually accept GNU assembly files?
http://developers.sun.com/sunstudio/support/Ccompare.html mentioned that " Support for GNU-style inline assembly ... "

> * Why check for SSSE3 intrinsics at all? You don't use them for
>   anything. All use of SSSE3 is in assembly files. The thing that
>   needs to be checked for is whether the assembly files will pass
>   through the compiler.
Yes, it is SSSE3 asm check. Confirmed and will fix in next patch

>* Store forwarding

>  - We need some comments in the assembly about the store forwarding
>    that Ma Ling described.
How about this comments added to asm code:
"CPU doesn't check each address bit for src and dest, so read could fail to recognize different address even they are in different page, read operation have to force previous write operation to commit data from store buffer to cache, the process impact performance seriously. This is work around to avoid this cpu limitation."

>  - Siarhei's questions about it should be answered, preferably in
>    those comments:

>     So it is basically a store forwarding aliasing problem which is
>     described in " Store Forwarding" section from "Intel(r) 64
>     and IA-32 Architectures Optimization Reference Manual", right?
The behavior doesn't belong to store forward which is described in " Store Forwarding" section from "Intel(r) 64 and IA-32 Architectures Optimization Reference Manual". That is another HW optimization.

>     Wouldn't just the use of MOVD/MOVSS instructions here also solve
>     this problem?  Store forwarding does not seem to be used for SIMD
>     according to the manual. I haven't benchmarked anything yet
>     though.

>     http://lists.cairographics.org/archives/pixman/2010-August/000425.html

>* About this line in the 64 bit assembly and the corresponding one in
>  the 32 bit assembly:

>   #if (defined(__amd64__) || defined(__x86_64__) ||defined(_M_AMD64))

>   Makoto Kato asked:

>        Why _M_AMD64?  _M_AMD64 macro is for MSVC++ x64 version.  This
>        assembler is for GNU as.

>   You didn't comment on this.
Will remove _M_AMD64 inside asm file in next patch

>* We might as well be consistent with the naming used in the ARM NEON
>  assembly files:

>  - The assembly file should be called "pixman-ssse3-asm.S", or
>    "pixman-ssse3-x86-32-asm.S and pixman-ssse3-x86-64-asm.S" if there
>    is a good reason they can't be unified.
Will change file name in next patch
>  - The function in the assembly should be called something like

>        composite_line_src_x888_8888_ssse3
Will change fun name in next patch
>    "line" to indicate the fact that it doesn't do full 2D, and the
>    "fast_path" in the current name is confusing because it suggest a
>    connection to the plain C 'fast path' implementation (there is no
>    such connection).

>* Cut-and-paste issues in pixman-ssse3:

>  - #include "pixman-combine32.h"
Will remove it in next patch
>  This is not used as far as I can tell

>  - /* PIXMAN_OP_OVER */
>  The two fast paths are both SRC fast paths, not OVER.

>  - /*---------------------------------------------------------------------
>    * composite_over_8888_n_8888
>    */
>    The function is src_x888_8888(), not over_8888_n_8888(). But the
>   comment in unneccessary since there is only one function.

Will remove it in next patch

More information about the Pixman mailing list