[Mesa-dev] [PATCH] util: Fix fallback iround handling of integral inputs

Tue Jul 22 08:31:12 PDT 2014

Am 22.07.2014 12:26, schrieb Richard Sandiford:
> Roland Scheidegger <sroland at vmware.com> writes:
>> Am 21.07.2014 17:53, schrieb Richard Sandiford:
>>> lp_build_iround has a fallback case that tries to emulate a round-to-nearest
>>> float->int conversion by adding 0.5 and using a truncating fptosi.  For odd
>>> numbers in the range [2^23, 2^24], which are already integral, this has
>>> the effect of adding 1 to the integer, since the result of adding 0.5 is
>>> exactly half way between two representable values and the tie goes to even.
>>> This includes the important special case of (float)0xffffff, which is the
>>> maximum depth in a z24s8 format.  I.e. calling iround on (float)0xffffff
>>> gives 0x1000000 rather than 0xffffff when the fallback is used.
>>>
>>> This patch only uses the x+0.5 trick if the ulp of the input is fractional.
>>> This still doesn't give ties-to-even semantics, but that doesn't seem to
>>> matter much in practice.
>>>
>>> Signed-off-by: Richard Sandiford <rsandifo at linux.vnet.ibm.com>
>>> ---
>>>  src/gallium/auxiliary/gallivm/lp_bld_arit.c | 43 ++++++++++++++++++-----------
>>>  1 file changed, 27 insertions(+), 16 deletions(-)
>>>
>>> diff --git a/src/gallium/auxiliary/gallivm/lp_bld_arit.c b/src/gallium/auxiliary/gallivm/lp_bld_arit.c
>>> index 9ef8628..82ddb5a 100644
>>> --- a/src/gallium/auxiliary/gallivm/lp_bld_arit.c
>>> +++ b/src/gallium/auxiliary/gallivm/lp_bld_arit.c
>>> @@ -2181,25 +2181,36 @@ lp_build_iround(struct lp_build_context *bld,
>>>        res = lp_build_round_arch(bld, a, LP_BUILD_ROUND_NEAREST);
>>>     }
>>>     else {
>>> -      LLVMValueRef half;
>>> +      struct lp_type int_type = lp_int_type(type);
>>> +      LLVMTypeRef vec_type = bld->vec_type;
>>> +      unsigned mantissa = lp_mantissa(type);
>>> +      unsigned expbits = type.width - mantissa - 1;
>>> +      unsigned bias = (1 << (expbits - 1)) - 1;
>>> +      LLVMValueRef mask = lp_build_const_int_vec(bld->gallivm, type,
>>> +                               (unsigned long long)1 << (type.width - 1));
>>> +      /* Smallest value with an integral ulp */
>>> +      LLVMValueRef abslimit = lp_build_const_int_vec(bld->gallivm, type,
>>> +                               (bias + mantissa) << mantissa);
>>> +      LLVMValueRef half, sign, absa, fractulp;
>>>  
>>>        half = lp_build_const_vec(bld->gallivm, type, 0.5);
>>>  
>>> -      if (type.sign) {
>>> -         LLVMTypeRef vec_type = bld->vec_type;
>>> -         LLVMValueRef mask = lp_build_const_int_vec(bld->gallivm, type,
>>> -                                    (unsigned long long)1 << (type.width - 1));
>>> -         LLVMValueRef sign;
>>> -
>>> -         /* get sign bit */
>>> -         sign = LLVMBuildBitCast(builder, a, int_vec_type, "");
>>> -         sign = LLVMBuildAnd(builder, sign, mask, "");
>>> -
>>> -         /* sign * 0.5 */
>>> -         half = LLVMBuildBitCast(builder, half, int_vec_type, "");
>>> -         half = LLVMBuildOr(builder, sign, half, "");
>>> -         half = LLVMBuildBitCast(builder, half, vec_type, "");
>>> -      }
>>> +      assert(type.sign);
>> You can't assert here. It is quite legal to have floats without sign
>> type - used to skip things like here when we know the input is positive.
>> There might not be all that many callers using such build contexts,
>> though some quick glance identified at least two
>> (lp_build_coord_repeat_npot_linear_int,
>> lp_build_clamped_float_to_unsigned_norm). Though I would have thought
>> the latter is where the bug you're seeing is coming from...
> 
> Ah, sorry, I hadn't realised that sort of modification could happen.
> I just looked at the types that lp_bld_type.h created.
> 
>> Also, I'm not a fan of making this more complex in general.
>> lp_build_itrunc is most often used for converting texture coords, or
> 
> (lp_build_iround rather than lp_build_itrunc?)
Yes.

> 
>> other stuff like srgb conversion, and none of these callers care about
>> this, so maybe should distinguish the cases.
> 
> As an extra boolean argument?  Yeah, I can try that, although I'll probably
> need help deciding what the right argument is for some cases.  Or were you
> thinking of something else?
Yes that's what I was thinking. Or otherwise use a different function.
Both solutions are kind of ugly though admittedly.

> 
>>> +
>>> +      /* get sign bit */
>>> +      sign = LLVMBuildBitCast(builder, a, int_vec_type, "");
>>> +      sign = LLVMBuildAnd(builder, sign, mask, "");
>>> +
>>> +      /* get "ulp is fractional" */
>>> +      absa = lp_build_abs(bld, a);
>>> +      absa = LLVMBuildBitCast(builder, absa, int_vec_type, "");
>>> +      fractulp = lp_build_compare(bld->gallivm, int_type, PIPE_FUNC_LESS, absa, abslimit);
>>> +
>>> +      /* (sign * 0.5) & intulp */
>>> +      half = LLVMBuildBitCast(builder, half, int_vec_type, "");
>>> +      half = LLVMBuildOr(builder, sign, half, "");
>>> +      half = LLVMBuildAnd(builder, half, fractulp, "");
>>> +      half = LLVMBuildBitCast(builder, half, vec_type, "");
>>>  
>>>        res = LLVMBuildFAdd(builder, a, half, "");
>>>     }
>>>
>>
>> Though if the caller is lp_build_round, it actually seems not too
>> complicated - the abs() from both will be the same, and the comparison
>> could be made (nearly) the same too (lp_build_round and friends are
>> using 2^24, though my comments are wrong there and it could just use
>> 2^23 instead).
>> Or would something like adding "largest float smaller than 0.5" first,
>> then add the remainder to 0.5 later work too? I guess though if you have
>> type.sign that's not just another "fadd" but also another "or".
> 
> Hmm, I suppose if we don't care whether x+0.5 goes to x or x+1 (because
> what we have now isn't ties-to-even anyway) then we could just add
> "largest float smaller than 0.5" instead of 0.5.  I think adding the
> remainder would only make a difference for an input of 0.5 or -0.5.
I am not entirely sure if it would work ok, maybe. We've previously
discovered some of that stuff used when doing float->int tex coord
conversion
was very sensitive concerning the values in-between two integral values
using
some internal tests (though unsure why exactly as full float precision
should not be required). Might have been for nearest filtering though
(which now uses ifloor).

Roland

>> Oh and this is not "util" code as the commit message mentions.
> 
> Doh.
> 
> Thanks,
> Richard
>