[Mesa-dev] [PATCH 5/7] util: Use SSE intrinsics in _mesa_lroundeven{f, }.

Roland Scheidegger sroland at vmware.com
Fri Jul 31 17:50:42 PDT 2015


On 01.08.2015 at 01:26, Matt Turner wrote:
> gcc actually generates this for us now that we use -fno-math-errno
> (which is weird, since lrintf()/lrint() don't set errno) but clang still
> does not. Presumably helps MSVC as well.
> 
> Reduced .text size by 8.5k with gcc before -fno-math-errno.
> 
>    text     data      bss      dec      hex  filename
> 4935850   195136    26192  5157178   4eb13a  i965_dri.so before
> 4927225   195128    26192  5148545   4e8f81  i965_dri.so after
> ---
>  src/util/rounding.h | 13 +++++++++++++
>  1 file changed, 13 insertions(+)
> 
> diff --git a/src/util/rounding.h b/src/util/rounding.h
> index 2d00760..e546c9f 100644
> --- a/src/util/rounding.h
> +++ b/src/util/rounding.h
> @@ -26,6 +26,11 @@
>  
>  #include <math.h>
>  
> +#ifdef __x86_64__
> +#include <xmmintrin.h>
> +#include <emmintrin.h>
> +#endif
> +
>  #ifdef __SSE4_1__
>  #include <smmintrin.h>
>  #endif
> @@ -87,7 +92,11 @@ _mesa_roundeven(double x)
>  static inline long
>  _mesa_lroundevenf(float x)
>  {
> +#ifdef __x86_64__
> +   return _mm_cvtss_si64(_mm_load_ss(&x));
I think you really want _mm_cvtss_si32 here, not _mm_cvtss_si64. Longs tend
to be 32-bit. _mm_cvtss_si64 would be the equivalent of llrintf.

> +#else
>     return lrintf(x);
> +#endif
>  }
>  
>  /**
> @@ -97,7 +106,11 @@ _mesa_lroundevenf(float x)
>  static inline long
>  _mesa_lroundeven(double x)
>  {
> +#ifdef __x86_64__
> +   return _mm_cvtsd_si64(_mm_load_sd(&x));
Same here: _mm_cvtsd_si32 rather than _mm_cvtsd_si64.

> +#else
>     return lrint(x);
> +#endif
>  }
>  
>  #endif
> 
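
To make the width question above concrete, here is a rough sketch, not a
proposed replacement for the patch. It assumes the default MXCSR rounding
mode (round to nearest even). The helper names round_even_f32_to_i32,
round_even_f32_to_i64 and lroundevenf_sketch are hypothetical and do not
exist in Mesa. The sketch relies on the fact that long is 64 bits on LP64
targets (e.g. 64-bit Linux) and 32 bits on LLP64 targets (64-bit Windows),
so it picks the conversion by sizeof(long) instead of hardcoding one width.

```c
#include <math.h>
#include <stdint.h>

#if defined(__SSE2__) || defined(__x86_64__) || defined(_M_X64)
#include <xmmintrin.h>   /* _mm_load_ss, _mm_cvtss_si32, _mm_cvtss_si64 */
#include <emmintrin.h>
#define SKETCH_HAVE_SSE 1
#endif

/* Hypothetical helper: float -> 32-bit int, same result width as a
 * 32-bit lrintf. Uses the current rounding mode (nearest-even by default). */
static inline int32_t
round_even_f32_to_i32(float x)
{
#ifdef SKETCH_HAVE_SSE
   return _mm_cvtss_si32(_mm_load_ss(&x));
#else
   return (int32_t)lrintf(x);
#endif
}

/* Hypothetical helper: float -> 64-bit int, the llrintf equivalent.
 * _mm_cvtss_si64 is only available when targeting 64-bit x86. */
static inline int64_t
round_even_f32_to_i64(float x)
{
#if defined(SKETCH_HAVE_SSE) && (defined(__x86_64__) || defined(_M_X64))
   return (int64_t)_mm_cvtss_si64(_mm_load_ss(&x));
#else
   return (int64_t)llrintf(x);
#endif
}

/* Hypothetical wrapper returning long: the branch is a compile-time
 * constant, so the compiler keeps only the conversion matching long. */
static inline long
lroundevenf_sketch(float x)
{
   if (sizeof(long) == 8)
      return (long)round_even_f32_to_i64(x);
   return (long)round_even_f32_to_i32(x);
}
```

With this shape, lroundevenf_sketch(2.5f) yields 2 and
lroundevenf_sketch(3.5f) yields 4 regardless of whether long is 32 or 64
bits on the target; the double variant would follow the same pattern with
_mm_load_sd and _mm_cvtsd_si32/_mm_cvtsd_si64.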


