[Pixman] [PATCH] mmx: Use MMX2 intrinsics from xmmintrin.h directly.

Matt Turner mattst88 at gmail.com
Sun Oct 11 20:51:40 PDT 2015


On Sun, Oct 11, 2015 at 8:41 PM, Siarhei Siamashka
<siarhei.siamashka at gmail.com> wrote:
> On Sun, 11 Oct 2015 14:55:28 -0700
> Matt Turner <mattst88 at gmail.com> wrote:
>
> Hello,
>
> Thanks. The patch looks good. In fact, it also allows the MMX code to
> be compiled with the Intel Compiler now (previously it was disabled by
> the configure check). A few minor things need to be fixed though. See
> the comments below.
>
>> We had lots of hacks to handle the inability to include xmmintrin.h
>> without compiling with -msse (lest SSE instructions be used in
>
> "lest" -> "lets" ?

Nope, I mean "lest" (means "otherwise something bad would happen")

>> pixman-mmx.c). Some recent version of gcc relaxed this restriction.
>>
>> Change configure.ac to test that xmmintrin.h can be included and that we
>> can use some intrinsics from it, and remove the work-around code from
>> pixman-mmx.c.
>>
>> Evidently allows gcc to optimize better as well:
>>
>>    text          data     bss     dec     hex filename
>>  657078         30848     680  688606   a81de libpixman-1.so.0.33.3 before
>>  656710         30848     680  688238   a806e libpixman-1.so.0.33.3 after
>
> It is always a good idea to mention the exact version of gcc in the
> commit message. For example, it could help if somebody happens to be
> reading this commit message a few years in the future.

Sure, will do.

> As for being able to optimize better. Yes, the "asm" blocks are
> treated by the compiler as opaque boxes (with just the input/output
> interface specified by constraints). The optimizer has difficulties
> generating efficient code if it has to deal with these bubbles. So
> it is a good idea to use intrinsics instead of single-instruction
> "asm" statements.
>
> Also I'm not completely sure, but now we probably prefer (require?) the
> "Signed-off-by" tags in commit messages.

Will do.

>> ---
>>  configure.ac        | 15 ++++----------
>>  pixman/pixman-mmx.c | 60 +----------------------------------------------------
>>  2 files changed, 5 insertions(+), 70 deletions(-)
>
> Nice stats :-)
>
>>
>> diff --git a/configure.ac b/configure.ac
>> index 424bfd3..b04cc69 100644
>> --- a/configure.ac
>> +++ b/configure.ac
>> @@ -347,21 +347,14 @@ AC_COMPILE_IFELSE([AC_LANG_SOURCE([[
>>  #error "Need GCC >= 3.4 for MMX intrinsics"
>>  #endif
>>  #include <mmintrin.h>
>> +#include <xmmintrin.h>
>
> We still would want to have this under the USE_X86_MMX ifdef check.
> Otherwise crosscompiling for ARM fails:
>
> $ ./configure --host=arm-linux-gnueabihf --disable-libpng --disable-gtk
> $ make
>
> pixman-mmx.c:42:23: fatal error: xmmintrin.h: No such file or directory
>  #include <xmmintrin.h>
>                        ^

Heh, can't believe I forgot about that since I added the iwMMXt support. :)

>>  int main () {
>>      __m64 v = _mm_cvtsi32_si64 (1);
>>      __m64 w;
>>
>> -    /* Some versions of clang will choke on K */
>> -    asm ("pshufw %2, %1, %0\n\t"
>> -        : "=y" (w)
>> -        : "y" (v), "K" (5)
>> -    );
>> -
>> -    /* Some versions of clang will choke on this */
>> -    asm ("pmulhuw %1, %0\n\t"
>> -     : "+y" (w)
>> -     : "y" (v)
>> -    );
>> +    /* Test some intrinsics from xmmintrin.h */
>> +    w = _mm_shuffle_pi16(v, 5);
>> +    w = _mm_mulhi_pu16(w, w);
>>
>>      return _mm_cvtsi64_si32 (v);
>>  }]])], have_mmx_intrinsics=yes)
>> diff --git a/pixman/pixman-mmx.c b/pixman/pixman-mmx.c
>> index 05c48a4..6bcdee2 100644
>> --- a/pixman/pixman-mmx.c
>> +++ b/pixman/pixman-mmx.c
>> @@ -39,6 +39,7 @@
>>  #include <loongson-mmintrin.h>
>>  #else
>>  #include <mmintrin.h>
>> +#include <xmmintrin.h>
>>  #endif
>>  #include "pixman-private.h"
>>  #include "pixman-combine32.h"
>> @@ -59,65 +60,6 @@ _mm_empty (void)
>>  }
>>  #endif
>>
>> -#ifdef USE_X86_MMX
>> -# if (defined(__SUNPRO_C) || defined(_MSC_VER) || defined(_WIN64))
>> -#  include <xmmintrin.h>
>> -# else
>> -/* We have to compile with -msse to use xmmintrin.h, but that causes SSE
>> - * instructions to be generated that we don't want. Just duplicate the
>> - * functions we want to use.  */
>> -extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__))
>> -_mm_movemask_pi8 (__m64 __A)
>> -{
>> -    int ret;
>> -
>> -    asm ("pmovmskb %1, %0\n\t"
>> -     : "=r" (ret)
>> -     : "y" (__A)
>> -    );
>> -
>> -    return ret;
>> -}
>> -
>> -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
>> -_mm_mulhi_pu16 (__m64 __A, __m64 __B)
>> -{
>> -    asm ("pmulhuw %1, %0\n\t"
>> -     : "+y" (__A)
>> -     : "y" (__B)
>> -    );
>> -    return __A;
>> -}
>> -
>> -#  ifdef __OPTIMIZE__
>> -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
>> -_mm_shuffle_pi16 (__m64 __A, int8_t const __N)
>> -{
>> -    __m64 ret;
>> -
>> -    asm ("pshufw %2, %1, %0\n\t"
>> -     : "=y" (ret)
>> -     : "y" (__A), "K" (__N)
>> -    );
>> -
>> -    return ret;
>> -}
>> -#  else
>> -#   define _mm_shuffle_pi16(A, N)                                    \
>> -    ({                                                                       \
>> -     __m64 ret;                                                      \
>> -                                                                     \
>> -     asm ("pshufw %2, %1, %0\n\t"                                    \
>> -          : "=y" (ret)                                               \
>> -          : "y" (A), "K" ((const int8_t)N)                           \
>> -     );                                                              \
>> -                                                                     \
>> -     ret;                                                            \
>> -    })
>> -#  endif
>> -# endif
>> -#endif
>> -
>>  #ifndef _MSC_VER
>>  #define _MM_SHUFFLE(fp3,fp2,fp1,fp0) \
>>   (((fp3) << 6) | ((fp2) << 4) | ((fp1) << 2) | (fp0))
>
> The _MM_SHUFFLE define can be removed because it triggers some warnings
> if <xmmintrin.h> is included:
>
> pixman-mmx.c(64): warning #47: incompatible redefinition of macro "_MM_SHUFFLE" (declared at line 96 of "/opt/intel/composerxe-2013.0.080/compiler/include/xmmintrin.h")
>   #define _MM_SHUFFLE(fp3,fp2,fp1,fp0) \
>           ^

Oh, nice. Will do.


More information about the Pixman mailing list