[Mesa-dev] [PATCH] util: Fix fallback iround handling of integral inputs

Tue Jul 22 03:26:17 PDT 2014

Roland Scheidegger <sroland at vmware.com> writes:
> Am 21.07.2014 17:53, schrieb Richard Sandiford:
>> lp_build_iround has a fallback case that tries to emulate a round-to-nearest
>> float->int conversion by adding 0.5 and using a truncating fptosi.  For odd
>> numbers in the range [2^23, 2^24], which are already integral, this has
>> the effect of adding 1 to the integer, since the result of adding 0.5 is
>> exactly half way between two representable values and the tie goes to even.
>> This includes the important special case of (float)0xffffff, which is the
>> maximum depth in a z24s8 format.  I.e. calling iround on (float)0xffffff
>> gives 0x1000000 rather than 0xffffff when the fallback is used.
>> 
>> This patch only uses the x+0.5 trick if the ulp of the input is fractional.
>> This still doesn't give ties-to-even semantics, but that doesn't seem to
>> matter much in practice.
>> 
>> Signed-off-by: Richard Sandiford <rsandifo at linux.vnet.ibm.com>
>> ---
>>  src/gallium/auxiliary/gallivm/lp_bld_arit.c | 43 ++++++++++++++++++-----------
>>  1 file changed, 27 insertions(+), 16 deletions(-)
>> 
>> diff --git a/src/gallium/auxiliary/gallivm/lp_bld_arit.c b/src/gallium/auxiliary/gallivm/lp_bld_arit.c
>> index 9ef8628..82ddb5a 100644
>> --- a/src/gallium/auxiliary/gallivm/lp_bld_arit.c
>> +++ b/src/gallium/auxiliary/gallivm/lp_bld_arit.c
>> @@ -2181,25 +2181,36 @@ lp_build_iround(struct lp_build_context *bld,
>>        res = lp_build_round_arch(bld, a, LP_BUILD_ROUND_NEAREST);
>>     }
>>     else {
>> -      LLVMValueRef half;
>> +      struct lp_type int_type = lp_int_type(type);
>> +      LLVMTypeRef vec_type = bld->vec_type;
>> +      unsigned mantissa = lp_mantissa(type);
>> +      unsigned expbits = type.width - mantissa - 1;
>> +      unsigned bias = (1 << (expbits - 1)) - 1;
>> +      LLVMValueRef mask = lp_build_const_int_vec(bld->gallivm, type,
>> +                               (unsigned long long)1 << (type.width - 1));
>> +      /* Smallest value with an integral ulp */
>> +      LLVMValueRef abslimit = lp_build_const_int_vec(bld->gallivm, type,
>> +                               (bias + mantissa) << mantissa);
>> +      LLVMValueRef half, sign, absa, fractulp;
>>  
>>        half = lp_build_const_vec(bld->gallivm, type, 0.5);
>>  
>> -      if (type.sign) {
>> -         LLVMTypeRef vec_type = bld->vec_type;
>> -         LLVMValueRef mask = lp_build_const_int_vec(bld->gallivm, type,
>> -                                    (unsigned long long)1 << (type.width - 1));
>> -         LLVMValueRef sign;
>> -
>> -         /* get sign bit */
>> -         sign = LLVMBuildBitCast(builder, a, int_vec_type, "");
>> -         sign = LLVMBuildAnd(builder, sign, mask, "");
>> -
>> -         /* sign * 0.5 */
>> -         half = LLVMBuildBitCast(builder, half, int_vec_type, "");
>> -         half = LLVMBuildOr(builder, sign, half, "");
>> -         half = LLVMBuildBitCast(builder, half, vec_type, "");
>> -      }
>> +      assert(type.sign);
> You can't assert here. It is quite legal to have floats without sign
> type - used to skip things like here when we know the input is positive.
> There might not be all that many callers using such build contexts,
> though some quick glance identified at least two
> (lp_build_coord_repeat_npot_linear_int,
> lp_build_clamped_float_to_unsigned_norm). Though I would have thought
> the latter is where the bug you're seeing is coming from...

Ah, sorry, I hadn't realised that sort of modification could happen.
I just looked at the types that lp_bld_type.h created.

> Also, I'm not a fan of making this more complex in general.
> lp_build_itrunc is most often used for converting texture coords, or

(lp_build_iround rather than lp_build_itrunc?)

> other stuff like srgb conversion, and none of these callers care about
> this, so maybe should distinguish the cases.

As an extra boolean argument?  Yeah, I can try that, although I'll probably
need help deciding what the right argument is for some cases.  Or were you
thinking of something else?

>> +
>> +      /* get sign bit */
>> +      sign = LLVMBuildBitCast(builder, a, int_vec_type, "");
>> +      sign = LLVMBuildAnd(builder, sign, mask, "");
>> +
>> +      /* get "ulp is fractional" */
>> +      absa = lp_build_abs(bld, a);
>> +      absa = LLVMBuildBitCast(builder, absa, int_vec_type, "");
>> +      fractulp = lp_build_compare(bld->gallivm, int_type, PIPE_FUNC_LESS, absa, abslimit);
>> +
>> +      /* (sign * 0.5) & intulp */
>> +      half = LLVMBuildBitCast(builder, half, int_vec_type, "");
>> +      half = LLVMBuildOr(builder, sign, half, "");
>> +      half = LLVMBuildAnd(builder, half, fractulp, "");
>> +      half = LLVMBuildBitCast(builder, half, vec_type, "");
>>  
>>        res = LLVMBuildFAdd(builder, a, half, "");
>>     }
>> 
>
> Though if the caller is lp_build_round, it actually seems not too
> complicated - the abs() from both will be the same, and the comparison
> could be made (nearly) the same too (lp_build_round and friends are
> using 2^24, though my comments are wrong there and it could just use
> 2^23 instead).
> Or would something like adding "largest float smaller than 0.5" first,
> then add the remainder to 0.5 later work too? I guess though if you have
> type.sign that's not just another "fadd" but also another "or".

Hmm, I suppose if we don't care whether x+0.5 goes to x or x+1 (because
what we have now isn't ties-to-even anyway) then we could just add
"largest float smaller than 0.5" instead of 0.5.  I think adding the
remainder would only make a difference for an input of 0.5 or -0.5.

> Oh and this is not "util" code as the commit message mentions.

Doh.

Thanks,
Richard