[Mesa-dev] [PATCH 5/7] util: Use SSE intrinsics in _mesa_lroundeven{f, }.

Matt Turner mattst88 at gmail.com
Fri Jul 31 18:02:08 PDT 2015


On Fri, Jul 31, 2015 at 5:50 PM, Roland Scheidegger <sroland at vmware.com> wrote:
> On 01.08.2015 at 01:26, Matt Turner wrote:
>> gcc actually generates this for us now that we use -fno-math-errno
>> (which is weird, since lrintf()/lrint() don't set errno) but clang still
>> does not. Presumably helps MSVC as well.
>>
>> Reduced .text size by 8.5k with gcc before -fno-math-errno.
>>
>>    text     data      bss      dec      hex  filename
>> 4935850   195136    26192  5157178   4eb13a  i965_dri.so before
>> 4927225   195128    26192  5148545   4e8f81  i965_dri.so after
>> ---
>>  src/util/rounding.h | 13 +++++++++++++
>>  1 file changed, 13 insertions(+)
>>
>> diff --git a/src/util/rounding.h b/src/util/rounding.h
>> index 2d00760..e546c9f 100644
>> --- a/src/util/rounding.h
>> +++ b/src/util/rounding.h
>> @@ -26,6 +26,11 @@
>>
>>  #include <math.h>
>>
>> +#ifdef __x86_64__
>> +#include <xmmintrin.h>
>> +#include <emmintrin.h>
>> +#endif
>> +
>>  #ifdef __SSE4_1__
>>  #include <smmintrin.h>
>>  #endif
>> @@ -87,7 +92,11 @@ _mesa_roundeven(double x)
>>  static inline long
>>  _mesa_lroundevenf(float x)
>>  {
>> +#ifdef __x86_64__
>> +   return _mm_cvtss_si64(_mm_load_ss(&x));
> I think you really want _mm_cvtss_si32, not 64. Longs tend to be 32bit.
> _mm_cvtss_si64 would be the equivalent of llrintf.

long is 64 bits on Linux/amd64. It looks like it's 32 bits on x32 and
Windows, though.

I guess I need to do

#ifdef __x86_64__
#if LONG_BIT == 64
   return _mm_cvtss_si64(_mm_load_ss(&x));
#elif LONG_BIT == 32
   return _mm_cvtss_si32(_mm_load_ss(&x));
#endif
#endif

I'll change it to that.
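
For reference, here is a minimal self-contained sketch of that dispatch,
not the final patch. Two assumptions to flag: the function name
sketch_lroundevenf is hypothetical, and the sketch tests LONG_MAX against
INT_MAX from <limits.h> instead of LONG_BIT, since LONG_BIT is a
POSIX/XSI extension rather than ISO C and may not be defined everywhere.
It also falls back to lrintf() when __x86_64__ is not defined.

```c
#include <limits.h>
#include <math.h>

#if defined(__x86_64__)
#include <xmmintrin.h>
#endif

/* Hypothetical name for illustration; not the function from the patch. */
static inline long
sketch_lroundevenf(float x)
{
#if defined(__x86_64__) && LONG_MAX > INT_MAX
   /* long is wider than int (e.g. Linux/amd64): 64-bit conversion. */
   return (long)_mm_cvtss_si64(_mm_load_ss(&x));
#elif defined(__x86_64__)
   /* long is 32 bits (e.g. x32): 32-bit conversion. */
   return _mm_cvtss_si32(_mm_load_ss(&x));
#else
   /* Portable fallback: rounds using the current rounding mode. */
   return lrintf(x);
#endif
}
```

All three paths round according to the current rounding mode, so under
the default round-to-nearest-even mode, ties such as 2.5f go to the even
neighbor.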

Thanks!

