[Mesa-dev] [PATCH] gallivm: use llvm.nearbyint instead of llvm.round

Wed Apr 13 05:51:11 UTC 2016

On 13/04/16 04:00, sroland at vmware.com wrote:
> From: Roland Scheidegger <sroland at vmware.com>
>
> We used to use sse roundps intrinsic directly, but switched to use the llvm
> intrinsics for rounding with e4f01da15d8c6ce3e8c77ff3ff3d2ce2574a3f7b.
> However, llvm semantics follows standard math lib round function which is
> specced to do roundNearestAwayFromZero but we really want roundNearestEven
> (moreoever, using round generates atrocious code since the cpu can't do it
> directly and it results in scalar calls to libm __roundf).
> So, use llvm.nearbyint instead, which does exactly the right thing, and even
> has the advantage of being available with llvm 3.3 too. (I've verified it
> actually generates a roundps instruction with llvm 3.3.)
>
> This fixes https://bugs.freedesktop.org/show_bug.cgi?id=94909

Thanks Roland.

I should've cought that.  I think I forget to actually look at 
llvm.round's assembly as I only found that intrisic much later.  I 
presumed it was roundps.

> ---
>   src/gallium/auxiliary/gallivm/lp_bld_arit.c | 6 +-----
>   1 file changed, 1 insertion(+), 5 deletions(-)
>
> diff --git a/src/gallium/auxiliary/gallivm/lp_bld_arit.c b/src/gallium/auxiliary/gallivm/lp_bld_arit.c
> index 0c43617..74f1683 100644
> --- a/src/gallium/auxiliary/gallivm/lp_bld_arit.c
> +++ b/src/gallium/auxiliary/gallivm/lp_bld_arit.c
> @@ -1863,11 +1863,7 @@ lp_build_round_arch(struct lp_build_context *bld,
>
>         switch (mode) {
>         case LP_BUILD_ROUND_NEAREST:
> -         if (HAVE_LLVM >= 0x0304) {
> -            intrinsic_root = "llvm.round";
> -         } else {
> -            return lp_build_nearest_sse41(bld, a);

You can remove lp_build_nearest_sse41 too while you're at it.

> -         }
> +         intrinsic_root = "llvm.nearbyint";
>            break;
>         case LP_BUILD_ROUND_FLOOR:
>            intrinsic_root = "llvm.floor";
>

We should add a test case to lp_test_arit too.  Like this:

diff --git a/src/gallium/drivers/llvmpipe/lp_test_arit.c 
b/src/gallium/drivers/llvmpipe/lp_test_arit.c
index a0f2db7..ba831f3 100644
--- a/src/gallium/drivers/llvmpipe/lp_test_arit.c
+++ b/src/gallium/drivers/llvmpipe/lp_test_arit.c
@@ -218,6 +218,7 @@ const float round_values[] = {
        -10.0, -1, 0.0, 12.0,
        -1.49, -0.25, 1.25, 2.51,
        -0.99, -0.01, 0.01, 0.99,
+      -1.5, -0.5, 0.5, 1.5,
        1.401298464324817e-45f, // smallest denormal
        -1.401298464324817e-45f,
        1.62981451e-08f,
@@ -283,7 +284,7 @@ unary_tests[] = {
     {"sin", &lp_build_sin, &sinf, sincos_values, 
Elements(sincos_values), 20.0 },
     {"cos", &lp_build_cos, &cosf, sincos_values, 
Elements(sincos_values), 20.0 },
     {"sgn", &lp_build_sgn, &sgnf, exp2_values, Elements(exp2_values), 
20.0 },
-   {"round", &lp_build_round, &roundf, round_values, 
Elements(round_values), 24.0 },
+   {"round", &lp_build_round, &nearbyintf, round_values, 
Elements(round_values), 24.0 },
     {"trunc", &lp_build_trunc, &truncf, round_values, 
Elements(round_values), 24.0 },
     {"floor", &lp_build_floor, &floorf, round_values, 
Elements(round_values), 24.0 },
     {"ceil", &lp_build_ceil, &ceilf, round_values, 
Elements(round_values), 24.0 },


It seems nearbyintf is available on MSVC too.

It's just not clear whether MSVC implementation follows round-to-even: 
the nearbyint doc page is missing from MSDN, and rintf page explicitly 
states that rint(2.5) is 3 per, 
https://msdn.microsoft.com/en-us/library/dn465165.aspx

Jose