[Mesa-dev] [PATCH] util: Fix fallback iround handling of integral inputs
Richard Sandiford
rsandifo at linux.vnet.ibm.com
Tue Jul 22 03:26:17 PDT 2014
Roland Scheidegger <sroland at vmware.com> writes:
> Am 21.07.2014 17:53, schrieb Richard Sandiford:
>> lp_build_iround has a fallback case that tries to emulate a round-to-nearest
>> float->int conversion by adding 0.5 and using a truncating fptosi. For odd
>> numbers in the range [2^23, 2^24], which are already integral, this has
>> the effect of adding 1 to the integer, since the result of adding 0.5 is
>> exactly half way between two representable values and the tie goes to even.
>> This includes the important special case of (float)0xffffff, which is the
>> maximum depth in a z24s8 format. I.e. calling iround on (float)0xffffff
>> gives 0x1000000 rather than 0xffffff when the fallback is used.
>>
>> This patch only uses the x+0.5 trick if the ulp of the input is fractional.
>> This still doesn't give ties-to-even semantics, but that doesn't seem to
>> matter much in practice.
>>
>> Signed-off-by: Richard Sandiford <rsandifo at linux.vnet.ibm.com>
>> ---
>> src/gallium/auxiliary/gallivm/lp_bld_arit.c | 43 ++++++++++++++++++-----------
>> 1 file changed, 27 insertions(+), 16 deletions(-)
>>
>> diff --git a/src/gallium/auxiliary/gallivm/lp_bld_arit.c b/src/gallium/auxiliary/gallivm/lp_bld_arit.c
>> index 9ef8628..82ddb5a 100644
>> --- a/src/gallium/auxiliary/gallivm/lp_bld_arit.c
>> +++ b/src/gallium/auxiliary/gallivm/lp_bld_arit.c
>> @@ -2181,25 +2181,36 @@ lp_build_iround(struct lp_build_context *bld,
>> res = lp_build_round_arch(bld, a, LP_BUILD_ROUND_NEAREST);
>> }
>> else {
>> - LLVMValueRef half;
>> + struct lp_type int_type = lp_int_type(type);
>> + LLVMTypeRef vec_type = bld->vec_type;
>> + unsigned mantissa = lp_mantissa(type);
>> + unsigned expbits = type.width - mantissa - 1;
>> + unsigned bias = (1 << (expbits - 1)) - 1;
>> + LLVMValueRef mask = lp_build_const_int_vec(bld->gallivm, type,
>> + (unsigned long long)1 << (type.width - 1));
>> + /* Smallest value with an integral ulp */
>> + LLVMValueRef abslimit = lp_build_const_int_vec(bld->gallivm, type,
>> + (bias + mantissa) << mantissa);
>> + LLVMValueRef half, sign, absa, fractulp;
>>
>> half = lp_build_const_vec(bld->gallivm, type, 0.5);
>>
>> - if (type.sign) {
>> - LLVMTypeRef vec_type = bld->vec_type;
>> - LLVMValueRef mask = lp_build_const_int_vec(bld->gallivm, type,
>> - (unsigned long long)1 << (type.width - 1));
>> - LLVMValueRef sign;
>> -
>> - /* get sign bit */
>> - sign = LLVMBuildBitCast(builder, a, int_vec_type, "");
>> - sign = LLVMBuildAnd(builder, sign, mask, "");
>> -
>> - /* sign * 0.5 */
>> - half = LLVMBuildBitCast(builder, half, int_vec_type, "");
>> - half = LLVMBuildOr(builder, sign, half, "");
>> - half = LLVMBuildBitCast(builder, half, vec_type, "");
>> - }
>> + assert(type.sign);
> You can't assert here. It is quite legal to have floats without sign
> type - used to skip things like here when we know the input is positive.
> There might not be all that many callers using such build contexts,
> though some quick glance identified at least two
> (lp_build_coord_repeat_npot_linear_int,
> lp_build_clamped_float_to_unsigned_norm). Though I would have thought
> the latter is where the bug you're seeing is coming from...
Ah, sorry, I hadn't realised that sort of modification could happen.
I just looked at the types that lp_bld_type.h created.
> Also, I'm not a fan of making this more complex in general.
> lp_build_itrunc is most often used for converting texture coords, or
(lp_build_iround rather than lp_build_itrunc?)
> other stuff like srgb conversion, and none of these callers care about
> this, so maybe should distinguish the cases.
As an extra boolean argument? Yeah, I can try that, although I'll probably
need help deciding what the right argument is for some cases. Or were you
thinking of something else?
>> +
>> + /* get sign bit */
>> + sign = LLVMBuildBitCast(builder, a, int_vec_type, "");
>> + sign = LLVMBuildAnd(builder, sign, mask, "");
>> +
>> + /* get "ulp is fractional" */
>> + absa = lp_build_abs(bld, a);
>> + absa = LLVMBuildBitCast(builder, absa, int_vec_type, "");
>> + fractulp = lp_build_compare(bld->gallivm, int_type, PIPE_FUNC_LESS, absa, abslimit);
>> +
>> + /* (sign * 0.5) & intulp */
>> + half = LLVMBuildBitCast(builder, half, int_vec_type, "");
>> + half = LLVMBuildOr(builder, sign, half, "");
>> + half = LLVMBuildAnd(builder, half, fractulp, "");
>> + half = LLVMBuildBitCast(builder, half, vec_type, "");
>>
>> res = LLVMBuildFAdd(builder, a, half, "");
>> }
>>
>
> Though if the caller is lp_build_round, it actually seems not too
> complicated - the abs() from both will be the same, and the comparison
> could be made (nearly) the same too (lp_build_round and friends are
> using 2^24, though my comments are wrong there and it could just use
> 2^23 instead).
> Or would something like adding "largest float smaller than 0.5" first,
> then add the remainder to 0.5 later work too? I guess though if you have
> type.sign that's not just another "fadd" but also another "or".
Hmm, I suppose if we don't care whether x+0.5 goes to x or x+1 (because
what we have now isn't ties-to-even anyway) then we could just add
"largest float smaller than 0.5" instead of 0.5. I think adding the
remainder would only make a difference for an input of 0.5 or -0.5.
> Oh and this is not "util" code as the commit message mentions.
Doh.
Thanks,
Richard
More information about the mesa-dev
mailing list