[Pixman] [PATCH] configure.ac: fix test for SSE2/SSSE3 assembler support

Oded Gabbay oded.gabbay at gmail.com
Wed Dec 9 05:35:55 PST 2015


On Wed, Dec 9, 2015 at 3:31 PM, Oded Gabbay <oded.gabbay at gmail.com> wrote:
> From: Jonathan Gray <jsg at jsg.id.au>
>
> This patch was originally sent to mesa but it applies to pixman as well.
>
> Change the __m128i variables to be volatile so gcc 4.9 won't optimize
> all of them out with -O1 or greater.  The _mm_set1_epi32/pinsrd calls
> still get optimized out but now there is at least one SSE2/SSSE3
> instruction generated via _mm_max_epu32/pmaxud.  When all of the
> SSE2/SSSE3 instructions got optimized out the configure test would
> incorrectly pass when the compiler supported the intrinsics and the
> assembler didn't support the instructions.
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91806
> Signed-off-by: Jonathan Gray <jsg at jsg.id.au>
> Reviewed-by: Emil Velikov <emil.velikov at collabora.com>
> Signed-off-by: Oded Gabbay <oded.gabbay at gmail.com>
> ---
>  configure.ac | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/configure.ac b/configure.ac
> index 3a66909..6323b98 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -430,7 +430,7 @@ AC_COMPILE_IFELSE([AC_LANG_SOURCE([[
>  #include <xmmintrin.h>
>  #include <emmintrin.h>
>  int main () {
> -    __m128i a = _mm_set1_epi32 (0), b = _mm_set1_epi32 (0), c;
> +    volatile __m128i a = _mm_set1_epi32 (0), b = _mm_set1_epi32 (0), c;
>         c = _mm_xor_si128 (a, b);
>      return 0;
>  }]])], have_sse2_intrinsics=yes)
> @@ -474,7 +474,7 @@ AC_COMPILE_IFELSE([AC_LANG_SOURCE([[
>  #include <emmintrin.h>
>  #include <tmmintrin.h>
>  int main () {
> -    __m128i a = _mm_set1_epi32 (0), b = _mm_set1_epi32 (0), c;
> +    volatile __m128i a = _mm_set1_epi32 (0), b = _mm_set1_epi32 (0), c;
>      c = _mm_maddubs_epi16 (a, b);
>      return 0;
>  }]])], have_ssse3_intrinsics=yes)
> --
> 2.5.0
>

I just noticed that the commit message doesn't quite match pixman, so I
will fix it as follows:

Instead of:

The _mm_set1_epi32/pinsrd calls still get optimized out but now there
is at least one SSE2/SSSE3 instruction generated via
_mm_max_epu32/pmaxud.

I should write:

The _mm_set1_epi32/pinsrd calls still get optimized out but now there
is at least one SSE2/SSSE3 instruction generated via
_mm_xor_si128/_mm_maddubs_epi16 respectively.
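
For anyone who wants to try the SSE2 check outside of configure, below is
a minimal sketch of the same idea. The helper name probe_sse2_xor, the
__SSE2__ guard and the non-SSE2 fallback are assumptions added for
illustration only; they are not part of the patch. The volatile
declaration is the one from the patched test, and it should keep the
compiler from folding the xor away at -O1 and above.

```c
#include <string.h>

#if defined(__SSE2__)
#include <emmintrin.h>   /* SSE2 intrinsics */
#endif

/* Hypothetical helper (not in the patch): returns 0 when a ^ b has no
 * bits set, which is always the case for two zeroed vectors. */
int probe_sse2_xor (void)
{
#if defined(__SSE2__)
    /* Same declaration as in the patched configure test.  Because a, b
     * and c are volatile, their loads and stores cannot be dropped, so
     * the xor between them should survive optimization. */
    volatile __m128i a = _mm_set1_epi32 (0), b = _mm_set1_epi32 (0), c;
    __m128i r;
    int out[4];

    c = _mm_xor_si128 (a, b);

    /* Copy the result out so it can be inspected lane by lane. */
    r = c;
    memcpy (out, &r, sizeof (out));
    return out[0] | out[1] | out[2] | out[3];
#else
    /* Assumed fallback so the sketch still builds without SSE2. */
    volatile int a = 0, b = 0;
    return a ^ b;
#endif
}
```

Compiling this with -O2 -S and looking for an xor on xmm registers in the
output is one way to confirm that the instruction is really handed to
the assembler.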

Oded

