[Pixman] [PATCH] mmx: Use MMX2 intrinsics from xmmintrin.h directly.

Siarhei Siamashka siarhei.siamashka at gmail.com
Sun Oct 11 20:41:22 PDT 2015


On Sun, 11 Oct 2015 14:55:28 -0700
Matt Turner <mattst88 at gmail.com> wrote:

Hello,

Thanks. The patch looks good. In fact, it also allows the MMX code to
be compiled with the Intel Compiler now (previously it was disabled by
the configure check). A few minor things need to be fixed though. See
the comments below.

> We had lots of hacks to handle the inability to include xmmintrin.h
> without compiling with -msse (lest SSE instructions be used in

"lest" -> "lets" ?

> pixman-mmx.c). Some recent version of gcc relaxed this restriction.
> 
> Change configure.ac to test that xmmintrin.h can be included and that we
> can use some intrinsics from it, and remove the work-around code from
> pixman-mmx.c.
> 
> Evidently allows gcc to optimize better as well:
> 
>    text	   data	    bss	    dec	    hex	filename
>  657078	  30848	    680	 688606	  a81de	libpixman-1.so.0.33.3 before
>  656710	  30848	    680	 688238	  a806e	libpixman-1.so.0.33.3 after

It is always a good idea to mention the exact version of gcc in the
commit message. For example, it could help if somebody happens to be
reading this commit message a few years in the future.

As for being able to optimize better. Yes, the "asm" blocks are
treated by the compiler as opaque boxes (with just the input/output
interface specified by constraints). The optimizer has difficulties
generating efficient code if it has to deal with these bubbles. So
it is a good idea to use intrinsics instead of single-instruction
"asm" statements.

Also I'm not completely sure, but now we probably prefer (require?) the
"Signed-off-by" tags in commit messages.

> ---
>  configure.ac        | 15 ++++----------
>  pixman/pixman-mmx.c | 60 +----------------------------------------------------
>  2 files changed, 5 insertions(+), 70 deletions(-)

Nice stats :-)

> 
> diff --git a/configure.ac b/configure.ac
> index 424bfd3..b04cc69 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -347,21 +347,14 @@ AC_COMPILE_IFELSE([AC_LANG_SOURCE([[
>  #error "Need GCC >= 3.4 for MMX intrinsics"
>  #endif
>  #include <mmintrin.h>
> +#include <xmmintrin.h>

We still would want to have this under the USE_X86_MMX ifdef check.
Otherwise crosscompiling for ARM fails:

$ ./configure --host=arm-linux-gnueabihf --disable-libpng --disable-gtk
$ make

pixman-mmx.c:42:23: fatal error: xmmintrin.h: No such file or directory
 #include <xmmintrin.h>
                       ^

>  int main () {
>      __m64 v = _mm_cvtsi32_si64 (1);
>      __m64 w;
>  
> -    /* Some versions of clang will choke on K */
> -    asm ("pshufw %2, %1, %0\n\t"
> -        : "=y" (w)
> -        : "y" (v), "K" (5)
> -    );
> -
> -    /* Some versions of clang will choke on this */
> -    asm ("pmulhuw %1, %0\n\t"
> -	: "+y" (w)
> -	: "y" (v)
> -    );
> +    /* Test some intrinsics from xmmintrin.h */
> +    w = _mm_shuffle_pi16(v, 5);
> +    w = _mm_mulhi_pu16(w, w);
>  
>      return _mm_cvtsi64_si32 (v);
>  }]])], have_mmx_intrinsics=yes)
> diff --git a/pixman/pixman-mmx.c b/pixman/pixman-mmx.c
> index 05c48a4..6bcdee2 100644
> --- a/pixman/pixman-mmx.c
> +++ b/pixman/pixman-mmx.c
> @@ -39,6 +39,7 @@
>  #include <loongson-mmintrin.h>
>  #else
>  #include <mmintrin.h>
> +#include <xmmintrin.h>
>  #endif
>  #include "pixman-private.h"
>  #include "pixman-combine32.h"
> @@ -59,65 +60,6 @@ _mm_empty (void)
>  }
>  #endif
>  
> -#ifdef USE_X86_MMX
> -# if (defined(__SUNPRO_C) || defined(_MSC_VER) || defined(_WIN64))
> -#  include <xmmintrin.h>
> -# else
> -/* We have to compile with -msse to use xmmintrin.h, but that causes SSE
> - * instructions to be generated that we don't want. Just duplicate the
> - * functions we want to use.  */
> -extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__))
> -_mm_movemask_pi8 (__m64 __A)
> -{
> -    int ret;
> -
> -    asm ("pmovmskb %1, %0\n\t"
> -	: "=r" (ret)
> -	: "y" (__A)
> -    );
> -
> -    return ret;
> -}
> -
> -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
> -_mm_mulhi_pu16 (__m64 __A, __m64 __B)
> -{
> -    asm ("pmulhuw %1, %0\n\t"
> -	: "+y" (__A)
> -	: "y" (__B)
> -    );
> -    return __A;
> -}
> -
> -#  ifdef __OPTIMIZE__
> -extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
> -_mm_shuffle_pi16 (__m64 __A, int8_t const __N)
> -{
> -    __m64 ret;
> -
> -    asm ("pshufw %2, %1, %0\n\t"
> -	: "=y" (ret)
> -	: "y" (__A), "K" (__N)
> -    );
> -
> -    return ret;
> -}
> -#  else
> -#   define _mm_shuffle_pi16(A, N)					\
> -    ({									\
> -	__m64 ret;							\
> -									\
> -	asm ("pshufw %2, %1, %0\n\t"					\
> -	     : "=y" (ret)						\
> -	     : "y" (A), "K" ((const int8_t)N)				\
> -	);								\
> -									\
> -	ret;								\
> -    })
> -#  endif
> -# endif
> -#endif
> -
>  #ifndef _MSC_VER
>  #define _MM_SHUFFLE(fp3,fp2,fp1,fp0) \
>   (((fp3) << 6) | ((fp2) << 4) | ((fp1) << 2) | (fp0))

The _MM_SHUFFLE define can be removed because it triggers some warnings
if <xmmintrin.h> is included:

pixman-mmx.c(64): warning #47: incompatible redefinition of macro "_MM_SHUFFLE" (declared at line 96 of "/opt/intel/composerxe-2013.0.080/compiler/include/xmmintrin.h")
  #define _MM_SHUFFLE(fp3,fp2,fp1,fp0) \
          ^

-- 
Best regards,
Siarhei Siamashka


More information about the Pixman mailing list