[Mesa-dev] [Mesa-stable] [PATCH] configure.ac: fix test for SSE4.1 assembler support
Oded Gabbay
oded.gabbay at gmail.com
Sun Dec 13 05:23:36 PST 2015
On Sun, Dec 13, 2015 at 11:56 AM, Jonathan Gray <jsg at jsg.id.au> wrote:
> On Sat, Dec 12, 2015 at 06:41:56PM +0000, Emil Velikov wrote:
>> On 10 December 2015 at 08:42, Oded Gabbay <oded.gabbay at gmail.com> wrote:
>> > On Wed, Dec 9, 2015 at 8:30 PM, Matt Turner <mattst88 at gmail.com> wrote:
>> >> On Tue, Dec 8, 2015 at 9:37 PM, Jonathan Gray <jsg at jsg.id.au> wrote:
>> >>> Change the __m128i variables to be volatile so gcc 4.9 won't optimise
>> >>> all of them out with -O1 or greater. The _mm_set1_epi32/pinsrd calls
>> >>> still get optimised out but now there is at least one SSE4.1 instruction
>> >>> generated via _mm_max_epu32/pmaxud. When all of the sse4.1 instructions
>> >>> got optimised out the configure test would incorrectly pass when the
>> >>> compiler supported the intrinsics and the assembler didn't support the
>> >>> instructions.
>> >>>
>> >>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91806
>> >>> Signed-off-by: Jonathan Gray <jsg at jsg.id.au>
>> >>> Cc: "11.0 11.1" <mesa-stable at lists.freedesktop.org>
>> >>> ---
>> >>> configure.ac | 2 +-
>> >>> 1 file changed, 1 insertion(+), 1 deletion(-)
>> >>>
>> >>> diff --git a/configure.ac b/configure.ac
>> >>> index 260934d..1d82e47 100644
>> >>> --- a/configure.ac
>> >>> +++ b/configure.ac
>> >>> @@ -384,7 +384,7 @@ CFLAGS="$SSE41_CFLAGS $CFLAGS"
>> >>> AC_COMPILE_IFELSE([AC_LANG_SOURCE([[
>> >>> #include <smmintrin.h>
>> >>> int main () {
>> >>> - __m128i a = _mm_set1_epi32 (0), b = _mm_set1_epi32 (0), c;
>> >>> + volatile __m128i a = _mm_set1_epi32 (0), b = _mm_set1_epi32 (0), c;
>> >>> c = _mm_max_epu32(a, b);
>> >>> return 0;
>> >>
>> >> I would have extracted an int from the result of _mm_max_epu32 and
>> >> returned that instead of 0.
>> >
>> > Instead of the volatile I assume ?
>> >
>> Precisely. If anyone wants to follow on Matt's suggestion we can pick
>> that one as well. I'd like to get a patch for the next stable releases
>> (next Friday for 11.0.x and just after new year for 11.1.1) so I'll
>> take whatever's around :-)
>>
>> -Emil
>
> I avoided that as I wasn't sure if there was a case where autoconf
> cared about the return code. If someone wants to create a new diff
> feel free, I have limited connectivity till the middle of next week.
I'm not a huge SSE expert, but I tried this (drop the volatile and
return _mm_cvtsi128_si32 of c), using an SSE2 xor as the test operation:
------------------------
#include <mmintrin.h>
#include <xmmintrin.h>
#include <emmintrin.h>
int main () {
    __m128i a = _mm_set1_epi32 (0), b = _mm_set1_epi32 (0), c;
    c = _mm_xor_si128 (a, b);
    return _mm_cvtsi128_si32(c);
}
------------------------
Compiling with "gcc -O1 -msse2" (gcc 4.8.5 from RHEL 7.2), I got:
------------------------
main:
.LFB521:
    .cfi_startproc
    movl $0, %eax
    ret
    .cfi_endproc
------------------------
So unless I misunderstood Matt's suggestion, I think we *have* to use
the volatile: without it gcc folds the whole computation into a
constant, whereas the volatile forces the compiler to emit the pxor
and movdqa instructions.
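
For comparison, here is a minimal sketch of the volatile variant of the
same SSE2 test. The function name sse2_probe is hypothetical (in a
configure test this body would sit in main), and I have not checked it
against every gcc version; the expectation is only that the volatile
accesses stop the compiler from folding the xor of two constants away.

```c
#include <emmintrin.h>  /* SSE2: __m128i, _mm_set1_epi32, _mm_xor_si128, _mm_cvtsi128_si32 */

/* Hypothetical helper name; a configure test would put this body in main(). */
int sse2_probe(void)
{
    /* volatile makes every read and write of a, b and c observable,
     * so the compiler has to load the operands and perform the xor
     * instead of computing the result at compile time. */
    volatile __m128i a = _mm_set1_epi32(0), b = _mm_set1_epi32(0), c;

    c = _mm_xor_si128(a, b);

    /* Return the low 32-bit lane of the result, as Matt suggested. */
    return _mm_cvtsi128_si32(c);
}
```

Building this with "gcc -O1 -msse2 -S" and looking for pxor in the
output should show whether the instruction survives.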
Oded