[Mesa-dev] [PATCH] softpipe: Do round-to-even, not round-up.

Brian Paul brianp at vmware.com
Fri May 18 07:55:39 PDT 2012


On 05/18/2012 03:43 AM, Olivier Galibert wrote:
> Fixes the piglit roundEven tests.
>
> Signed-off-by: Olivier Galibert <galibert at pobox.com>
> ---
>   src/gallium/auxiliary/tgsi/tgsi_exec.c |   76 ++++++++++++++++++++++++++++++--
>   1 file changed, 72 insertions(+), 4 deletions(-)
>
> diff --git a/src/gallium/auxiliary/tgsi/tgsi_exec.c b/src/gallium/auxiliary/tgsi/tgsi_exec.c
> index 5e23f5d..7311f5f 100644
> --- a/src/gallium/auxiliary/tgsi/tgsi_exec.c
> +++ b/src/gallium/auxiliary/tgsi/tgsi_exec.c
> @@ -316,14 +316,82 @@ micro_rcp(union tgsi_exec_channel *dst,
>      dst->f[3] = 1.0f / src->f[3];
>   }
>

Doesn't the rint() function already do round-to-even, at least in the 
default rounding mode?  I'm not sure rint() is available on Windows, 
though, so we might need your ieee754_fp32_round_half_to_even() 
function anyway.  I'll try to look into that.

In any case, I think this function could be moved into u_math.c so it 
could be used elsewhere.


> +static float
> +ieee754_fp32_round_half_to_even(float v)
> +{
> +   unsigned int iv;
> +   int exponent;
> +   unsigned int one_half_bit_mask, fractional_bits_mask;
> +   assert(sizeof(v) == 4);
> +   assert(sizeof(iv) == 4);
> +   memcpy(&iv, &v, 4);
> +
> +   exponent = ((iv >> 23) & 0xff) - 0x7f;
> +
> +   // Integer, inf or nan, return unscathed

We pretty much just use /* */ comments in the gallium C sources.  I'm 
fine with // but if some non-gcc compiler chokes on it we'll have to 
change these.


> +   if (exponent >= 23)
> +      return v;
> +
> +   // abs(v) < 0.5, zero, or denormal, return 0
> +   if (exponent < -1)
> +      return 0;
> +
> +   // abs(v) >= 0.5 and < 1
> +   if (exponent == -1) {
> +      // Not exactly 0.5: round to the appropriately signed 1, otherwise to 0
> +      if (iv & 0x7fffff) {
> +         iv = (iv & 0x80000000) | 0x3f800000;
> +         memcpy(&v, &iv, 4);
> +         return v;
> +      } else
> +         return 0;
> +   }
> +
> +   one_half_bit_mask    = 0x800000 >> (exponent+1);
> +   fractional_bits_mask = 0xffffff >> (exponent+1);
> +
> +   // Fractional part under 0.5, cut off the fractional part
> +   if (!(iv & one_half_bit_mask)) {
> +      iv = iv & ~fractional_bits_mask;
> +      memcpy(&v, &iv, 4);
> +      return v;
> +   }
> +
> +   // Fractional part over 0.5, round up
> +   if ((iv & fractional_bits_mask) != one_half_bit_mask) {
> +      // Round up by setting the fractional bits to 1 and incrementing.
> +      // The exponent will be automagically incremented when needed
> +      // through carry propagation.
> +      iv = (iv | fractional_bits_mask) + 1;
> +      memcpy(&v, &iv, 4);
> +      return v;
> +   }
> +
> +   // Fractional part exactly 0.5, round towards even.
> +
> +   // Test the bit just above the binary point to test for oddness.  It
> +   // works for the borderline case 1.5 because the biased exponent is
> +   // 127 in that case, i.e. the tested bit (the exponent's lowest bit)
> +   // is the expected 1.
> +
> +   // Odd, round up
> +   if (iv & (one_half_bit_mask << 1))
> +      iv = (iv | fractional_bits_mask) + 1;
> +   // Even, round down
> +   else
> +      iv = iv & ~fractional_bits_mask;
> +
> +   memcpy(&v, &iv, 4);
> +   return v;
> +}
> +
>   static void
>   micro_rnd(union tgsi_exec_channel *dst,
>             const union tgsi_exec_channel *src)
>   {
> -   dst->f[0] = floorf(src->f[0] + 0.5f);
> -   dst->f[1] = floorf(src->f[1] + 0.5f);
> -   dst->f[2] = floorf(src->f[2] + 0.5f);
> -   dst->f[3] = floorf(src->f[3] + 0.5f);
> +   dst->f[0] = ieee754_fp32_round_half_to_even(src->f[0]);
> +   dst->f[1] = ieee754_fp32_round_half_to_even(src->f[1]);
> +   dst->f[2] = ieee754_fp32_round_half_to_even(src->f[2]);
> +   dst->f[3] = ieee754_fp32_round_half_to_even(src->f[3]);
>   }
>
>   static void

I was looking at the GLSL round() and roundEven() functions.  The GLSL 
spec says round() can use whatever method is fastest.  But in 
builtin_functions.cpp the round() function is implemented in terms of 
the round_even builtin.  It seems to me that we should have a generic 
'round' builtin function and separate TGSI_ROUND and TGSI_ROUND_EVEN 
opcodes so that drivers can really have the option of using a 
faster/looser round function.

That also reminds me, there's a problem with Mesa's IROUND() function...

-Brian

