[Mesa-dev] [PATCH 5/7] util: Use SSE intrinsics in _mesa_lroundeven{f, }.
Matt Turner
mattst88 at gmail.com
Fri Jul 31 18:02:08 PDT 2015
On Fri, Jul 31, 2015 at 5:50 PM, Roland Scheidegger <sroland at vmware.com> wrote:
> Am 01.08.2015 um 01:26 schrieb Matt Turner:
>> gcc actually generates this for us now that we use -fno-math-errno
>> (which is weird, since lrintf()/lrint() don't set errno) but clang still
>> does not. Presumably helps MSVC as well.
>>
>> Reduced .text size by 8.5k with gcc before -fno-math-errno.
>>
>>    text    data     bss     dec     hex filename
>> 4935850  195136   26192 5157178  4eb13a i965_dri.so before
>> 4927225  195128   26192 5148545  4e8f81 i965_dri.so after
>> ---
>> src/util/rounding.h | 13 +++++++++++++
>> 1 file changed, 13 insertions(+)
>>
>> diff --git a/src/util/rounding.h b/src/util/rounding.h
>> index 2d00760..e546c9f 100644
>> --- a/src/util/rounding.h
>> +++ b/src/util/rounding.h
>> @@ -26,6 +26,11 @@
>>
>> #include <math.h>
>>
>> +#ifdef __x86_64__
>> +#include <xmmintrin.h>
>> +#include <emmintrin.h>
>> +#endif
>> +
>> #ifdef __SSE4_1__
>> #include <smmintrin.h>
>> #endif
>> @@ -87,7 +92,11 @@ _mesa_roundeven(double x)
>> static inline long
>> _mesa_lroundevenf(float x)
>> {
>> +#ifdef __x86_64__
>> + return _mm_cvtss_si64(_mm_load_ss(&x));
> I think you really want _mm_cvtss_si32, not 64. Longs tend to be 32bit.
> _mm_cvtss_si64 would be the equivalent of llrintf.
long is 64 bits on Linux/amd64. It looks like it's 32 bits on x32 and
Windows, though.
I guess I need to do
#ifdef __x86_64__
#if LONG_BIT == 64
   return _mm_cvtss_si64(_mm_load_ss(&x));
#elif LONG_BIT == 32
   return _mm_cvtss_si32(_mm_load_ss(&x));
#endif
#endif
I'll change it to that.
Thanks!
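For reference, a minimal self-contained sketch of the width-dependent selection discussed above. It is not the committed Mesa code: the name example_lroundevenf is hypothetical, and the sketch assumes that comparing LONG_MAX with INT_MAX (both ISO C, from <limits.h>) is an acceptable stand-in for LONG_BIT, which is a POSIX macro and may not be defined everywhere. It also assumes the default MXCSR rounding mode (round to nearest even) and falls back to lrintf() when __x86_64__ is not defined.

```c
#include <limits.h>
#include <math.h>

#ifdef __x86_64__
#include <xmmintrin.h>
#endif

/* Hypothetical stand-in for _mesa_lroundevenf(); illustration only. */
static inline long
example_lroundevenf(float x)
{
#if defined(__x86_64__) && LONG_MAX > INT_MAX
   /* long is 64 bits (e.g. Linux/amd64): convert to a 64-bit integer. */
   return _mm_cvtss_si64(_mm_load_ss(&x));
#elif defined(__x86_64__)
   /* long is 32 bits (e.g. x32): convert to a 32-bit integer. */
   return _mm_cvtss_si32(_mm_load_ss(&x));
#else
   /* Assumed fallback: lrintf() rounds using the current rounding mode. */
   return lrintf(x);
#endif
}
```

With the default rounding mode, halfway cases go to the nearest even integer, so 2.5f becomes 2 and 3.5f becomes 4.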