[Mesa-dev] [PATCH] util: Fix fallback iround handling of integral inputs

Wed Jul 23 03:21:00 PDT 2014

Roland Scheidegger <sroland at vmware.com> writes:
> Am 22.07.2014 12:26, schrieb Richard Sandiford:
>> Roland Scheidegger <sroland at vmware.com> writes:
>>> Am 21.07.2014 17:53, schrieb Richard Sandiford:
>>>> lp_build_iround has a fallback case that tries to emulate a round-to-nearest
>>>> float->int conversion by adding 0.5 and using a truncating fptosi.  For odd
>>>> numbers in the range [2^23, 2^24], which are already integral, this has
>>>> the effect of adding 1 to the integer, since the result of adding 0.5 is
>>>> exactly half way between two representable values and the tie goes to even.
>>>> This includes the important special case of (float)0xffffff, which is the
>>>> maximum depth in a z24s8 format.  I.e. calling iround on (float)0xffffff
>>>> gives 0x1000000 rather than 0xffffff when the fallback is used.
>>>>
>>>> This patch only uses the x+0.5 trick if the ulp of the input is fractional.
>>>> This still doesn't give ties-to-even semantics, but that doesn't seem to
>>>> matter much in practice.
>>>>
>>>> Signed-off-by: Richard Sandiford <rsandifo at linux.vnet.ibm.com>
>>>> ---
>>>>  src/gallium/auxiliary/gallivm/lp_bld_arit.c | 43 ++++++++++++++++++-----------
>>>>  1 file changed, 27 insertions(+), 16 deletions(-)
>>>>
>>>> diff --git a/src/gallium/auxiliary/gallivm/lp_bld_arit.c b/src/gallium/auxiliary/gallivm/lp_bld_arit.c
>>>> index 9ef8628..82ddb5a 100644
>>>> --- a/src/gallium/auxiliary/gallivm/lp_bld_arit.c
>>>> +++ b/src/gallium/auxiliary/gallivm/lp_bld_arit.c
>>>> @@ -2181,25 +2181,36 @@ lp_build_iround(struct lp_build_context *bld,
>>>>        res = lp_build_round_arch(bld, a, LP_BUILD_ROUND_NEAREST);
>>>>     }
>>>>     else {
>>>> -      LLVMValueRef half;
>>>> +      struct lp_type int_type = lp_int_type(type);
>>>> +      LLVMTypeRef vec_type = bld->vec_type;
>>>> +      unsigned mantissa = lp_mantissa(type);
>>>> +      unsigned expbits = type.width - mantissa - 1;
>>>> +      unsigned bias = (1 << (expbits - 1)) - 1;
>>>> +      LLVMValueRef mask = lp_build_const_int_vec(bld->gallivm, type,
>>>> +                               (unsigned long long)1 << (type.width - 1));
>>>> +      /* Smallest value with an integral ulp */
>>>> +      LLVMValueRef abslimit = lp_build_const_int_vec(bld->gallivm, type,
>>>> +                               (bias + mantissa) << mantissa);
>>>> +      LLVMValueRef half, sign, absa, fractulp;
>>>>  
>>>>        half = lp_build_const_vec(bld->gallivm, type, 0.5);
>>>>  
>>>> -      if (type.sign) {
>>>> -         LLVMTypeRef vec_type = bld->vec_type;
>>>> -         LLVMValueRef mask = lp_build_const_int_vec(bld->gallivm, type,
>>>> -                                    (unsigned long long)1 << (type.width - 1));
>>>> -         LLVMValueRef sign;
>>>> -
>>>> -         /* get sign bit */
>>>> -         sign = LLVMBuildBitCast(builder, a, int_vec_type, "");
>>>> -         sign = LLVMBuildAnd(builder, sign, mask, "");
>>>> -
>>>> -         /* sign * 0.5 */
>>>> -         half = LLVMBuildBitCast(builder, half, int_vec_type, "");
>>>> -         half = LLVMBuildOr(builder, sign, half, "");
>>>> -         half = LLVMBuildBitCast(builder, half, vec_type, "");
>>>> -      }
>>>> +      assert(type.sign);
>>> You can't assert here. It is quite legal to have floats without sign
>>> type - used to skip things like here when we know the input is positive.
>>> There might not be all that many callers using such build contexts,
>>> though some quick glance identified at least two
>>> (lp_build_coord_repeat_npot_linear_int,
>>> lp_build_clamped_float_to_unsigned_norm). Though I would have thought
>>> the latter is where the bug you're seeing is coming from...
>> 
>> Ah, sorry, I hadn't realised that sort of modification could happen.
>> I just looked at the types that lp_bld_type.h created.
>> 
>>> Also, I'm not a fan of making this more complex in general.
>>> lp_build_itrunc is most often used for converting texture coords, or
>> 
>> (lp_build_iround rather than lp_build_itrunc?)
> Yes.
>
>> 
>>> other stuff like srgb conversion, and none of these callers care about
>>> this, so maybe should distinguish the cases.
>> 
>> As an extra boolean argument?  Yeah, I can try that, although I'll probably
>> need help deciding what the right argument is for some cases.  Or were you
>> thinking of something else?
> Yes that's what I was thinking. Or otherwise use a different function.
> Both solutions are kind of ugly though admittedly.

Yeah, would prefer to try the other alternative first.

>>>> +
>>>> +      /* get sign bit */
>>>> +      sign = LLVMBuildBitCast(builder, a, int_vec_type, "");
>>>> +      sign = LLVMBuildAnd(builder, sign, mask, "");
>>>> +
>>>> +      /* get "ulp is fractional" */
>>>> +      absa = lp_build_abs(bld, a);
>>>> +      absa = LLVMBuildBitCast(builder, absa, int_vec_type, "");
>>>> +      fractulp = lp_build_compare(bld->gallivm, int_type, PIPE_FUNC_LESS, absa, abslimit);
>>>> +
>>>> +      /* (sign * 0.5) & intulp */
>>>> +      half = LLVMBuildBitCast(builder, half, int_vec_type, "");
>>>> +      half = LLVMBuildOr(builder, sign, half, "");
>>>> +      half = LLVMBuildAnd(builder, half, fractulp, "");
>>>> +      half = LLVMBuildBitCast(builder, half, vec_type, "");
>>>>  
>>>>        res = LLVMBuildFAdd(builder, a, half, "");
>>>>     }
>>>>
>>>
>>> Though if the caller is lp_build_round, it actually seems not too
>>> complicated - the abs() from both will be the same, and the comparison
>>> could be made (nearly) the same too (lp_build_round and friends are
>>> using 2^24, though my comments are wrong there and it could just use
>>> 2^23 instead).
>>> Or would something like adding "largest float smaller than 0.5" first,
>>> then add the remainder to 0.5 later work too? I guess though if you have
>>> type.sign that's not just another "fadd" but also another "or".
>> 
>> Hmm, I suppose if we don't care whether x+0.5 goes to x or x+1 (because
>> what we have now isn't ties-to-even anyway) then we could just add
>> "largest float smaller than 0.5" instead of 0.5.  I think adding the
>> remainder would only make a difference for an input of 0.5 or -0.5.
> I am not entirely sure if it would work ok, maybe. We've previously
> discovered some of that stuff used when doing float->int tex coord
> conversion
> was very sensitive concerning the values in-between two integral values
> using
> some internal tests (though unsure why exactly as full float precision
> should not be required). Might have been for nearest filtering though
> (which now uses ifloor).

FWIW the attached passed testing on piglit but I realise your concern
was more about other tests.

Thanks,
Richard

-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-gallivm-Fix-fallback-iround-handling-of-integral-inp.patch
Type: text/x-patch
Size: 3380 bytes
Desc: not available
URL: <http://lists.freedesktop.org/archives/mesa-dev/attachments/20140723/6d38af8d/attachment-0001.bin>