[Mesa-dev] [PATCH 5/7] util: Use SSE intrinsics in _mesa_lroundeven{f, }.
Roland Scheidegger
sroland at vmware.com
Fri Jul 31 17:50:42 PDT 2015
On 01.08.2015 at 01:26, Matt Turner wrote:
> gcc actually generates this for us now that we use -fno-math-errno
> (which is weird, since lrintf()/lrint() don't set errno) but clang still
> does not. Presumably helps MSVC as well.
>
> Reduced .text size by 8.5k with gcc before -fno-math-errno.
>
>    text    data   bss     dec    hex filename
> 4935850  195136 26192 5157178 4eb13a i965_dri.so before
> 4927225  195128 26192 5148545 4e8f81 i965_dri.so after
> ---
> src/util/rounding.h | 13 +++++++++++++
> 1 file changed, 13 insertions(+)
>
> diff --git a/src/util/rounding.h b/src/util/rounding.h
> index 2d00760..e546c9f 100644
> --- a/src/util/rounding.h
> +++ b/src/util/rounding.h
> @@ -26,6 +26,11 @@
>
> #include <math.h>
>
> +#ifdef __x86_64__
> +#include <xmmintrin.h>
> +#include <emmintrin.h>
> +#endif
> +
> #ifdef __SSE4_1__
> #include <smmintrin.h>
> #endif
> @@ -87,7 +92,11 @@ _mesa_roundeven(double x)
> static inline long
> _mesa_lroundevenf(float x)
> {
> +#ifdef __x86_64__
> + return _mm_cvtss_si64(_mm_load_ss(&x));
I think you really want _mm_cvtss_si32, not 64. Longs tend to be 32-bit.
_mm_cvtss_si64 would be the equivalent of llrintf.
> +#else
> return lrintf(x);
> +#endif
> }
>
> /**
> @@ -97,7 +106,11 @@ _mesa_lroundevenf(float x)
> static inline long
> _mesa_lroundeven(double x)
> {
> +#ifdef __x86_64__
> + return _mm_cvtsd_si64(_mm_load_sd(&x));
Same here.
> +#else
> return lrint(x);
> +#endif
> }
>
> #endif
>
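To make the width question in the comments above concrete, here is a minimal sketch, not the patch as committed, of one way to choose the conversion intrinsic from the actual width of long. The width of long depends on the platform ABI (64-bit under LP64, 32-bit under LLP64), so the sketch tests LONG_MAX rather than assuming either. The sketch_* function names are hypothetical and exist only for illustration. Both intrinsics round according to the current MXCSR rounding mode, which is round-to-nearest-even by default.

```c
#include <limits.h>
#include <math.h>

#if defined(__x86_64__)
#include <xmmintrin.h>   /* _mm_load_ss, _mm_cvtss_si32, _mm_cvtss_si64 */
#include <emmintrin.h>   /* _mm_load_sd, _mm_cvtsd_si32, _mm_cvtsd_si64 */
#endif

/* Hypothetical name: float -> long, rounding per the current rounding mode. */
static inline long
sketch_lroundevenf(float x)
{
#if defined(__x86_64__) && LONG_MAX == INT_MAX
   /* long is 32 bits wide here: the 32-bit conversion matches lrintf(). */
   return _mm_cvtss_si32(_mm_load_ss(&x));
#elif defined(__x86_64__)
   /* long is 64 bits wide here: the 64-bit conversion matches lrintf(). */
   return _mm_cvtss_si64(_mm_load_ss(&x));
#else
   /* Any other target: leave it to the C library. */
   return lrintf(x);
#endif
}

/* Hypothetical name: double -> long, same selection logic. */
static inline long
sketch_lroundeven(double x)
{
#if defined(__x86_64__) && LONG_MAX == INT_MAX
   return _mm_cvtsd_si32(_mm_load_sd(&x));
#elif defined(__x86_64__)
   return _mm_cvtsd_si64(_mm_load_sd(&x));
#else
   return lrint(x);
#endif
}
```

Under the default rounding mode, ties go to the even neighbour, so sketch_lroundevenf(2.5f) gives 2 and sketch_lroundeven(3.5) gives 4 on every branch.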