[Mesa-dev] [PATCH] util: Fix fallback iround handling of integral inputs
Richard Sandiford
rsandifo at linux.vnet.ibm.com
Wed Jul 23 03:21:00 PDT 2014
Roland Scheidegger <sroland at vmware.com> writes:
> Am 22.07.2014 12:26, schrieb Richard Sandiford:
>> Roland Scheidegger <sroland at vmware.com> writes:
>>> Am 21.07.2014 17:53, schrieb Richard Sandiford:
>>>> lp_build_iround has a fallback case that tries to emulate a round-to-nearest
>>>> float->int conversion by adding 0.5 and using a truncating fptosi. For odd
>>>> numbers in the range [2^23, 2^24], which are already integral, this has
>>>> the effect of adding 1 to the integer, since the result of adding 0.5 is
>>>> exactly half way between two representable values and the tie goes to even.
>>>> This includes the important special case of (float)0xffffff, which is the
>>>> maximum depth in a z24s8 format. I.e. calling iround on (float)0xffffff
>>>> gives 0x1000000 rather than 0xffffff when the fallback is used.
>>>>
>>>> This patch only uses the x+0.5 trick if the ulp of the input is fractional.
>>>> This still doesn't give ties-to-even semantics, but that doesn't seem to
>>>> matter much in practice.
>>>>
>>>> Signed-off-by: Richard Sandiford <rsandifo at linux.vnet.ibm.com>
>>>> ---
>>>> src/gallium/auxiliary/gallivm/lp_bld_arit.c | 43 ++++++++++++++++++-----------
>>>> 1 file changed, 27 insertions(+), 16 deletions(-)
>>>>
>>>> diff --git a/src/gallium/auxiliary/gallivm/lp_bld_arit.c b/src/gallium/auxiliary/gallivm/lp_bld_arit.c
>>>> index 9ef8628..82ddb5a 100644
>>>> --- a/src/gallium/auxiliary/gallivm/lp_bld_arit.c
>>>> +++ b/src/gallium/auxiliary/gallivm/lp_bld_arit.c
>>>> @@ -2181,25 +2181,36 @@ lp_build_iround(struct lp_build_context *bld,
>>>> res = lp_build_round_arch(bld, a, LP_BUILD_ROUND_NEAREST);
>>>> }
>>>> else {
>>>> - LLVMValueRef half;
>>>> + struct lp_type int_type = lp_int_type(type);
>>>> + LLVMTypeRef vec_type = bld->vec_type;
>>>> + unsigned mantissa = lp_mantissa(type);
>>>> + unsigned expbits = type.width - mantissa - 1;
>>>> + unsigned bias = (1 << (expbits - 1)) - 1;
>>>> + LLVMValueRef mask = lp_build_const_int_vec(bld->gallivm, type,
>>>> + (unsigned long long)1 << (type.width - 1));
>>>> + /* Smallest value with an integral ulp */
>>>> + LLVMValueRef abslimit = lp_build_const_int_vec(bld->gallivm, type,
>>>> + (bias + mantissa) << mantissa);
>>>> + LLVMValueRef half, sign, absa, fractulp;
>>>>
>>>> half = lp_build_const_vec(bld->gallivm, type, 0.5);
>>>>
>>>> - if (type.sign) {
>>>> - LLVMTypeRef vec_type = bld->vec_type;
>>>> - LLVMValueRef mask = lp_build_const_int_vec(bld->gallivm, type,
>>>> - (unsigned long long)1 << (type.width - 1));
>>>> - LLVMValueRef sign;
>>>> -
>>>> - /* get sign bit */
>>>> - sign = LLVMBuildBitCast(builder, a, int_vec_type, "");
>>>> - sign = LLVMBuildAnd(builder, sign, mask, "");
>>>> -
>>>> - /* sign * 0.5 */
>>>> - half = LLVMBuildBitCast(builder, half, int_vec_type, "");
>>>> - half = LLVMBuildOr(builder, sign, half, "");
>>>> - half = LLVMBuildBitCast(builder, half, vec_type, "");
>>>> - }
>>>> + assert(type.sign);
>>> You can't assert here. It is quite legal to have floats without sign
>>> type - used to skip things like here when we know the input is positive.
>>> There might not be all that many callers using such build contexts,
>>> though some quick glance identified at least two
>>> (lp_build_coord_repeat_npot_linear_int,
>>> lp_build_clamped_float_to_unsigned_norm). Though I would have thought
>>> the latter is where the bug you're seeing is coming from...
>>
>> Ah, sorry, I hadn't realised that sort of modification could happen.
>> I just looked at the types that lp_bld_type.h created.
>>
>>> Also, I'm not a fan of making this more complex in general.
>>> lp_build_itrunc is most often used for converting texture coords, or
>>
>> (lp_build_iround rather than lp_build_itrunc?)
> Yes.
>
>>
>>> other stuff like srgb conversion, and none of these callers care about
>>> this, so maybe should distinguish the cases.
>>
>> As an extra boolean argument? Yeah, I can try that, although I'll probably
>> need help deciding what the right argument is for some cases. Or were you
>> thinking of something else?
> Yes that's what I was thinking. Or otherwise use a different function.
> Both solutions are kind of ugly though admittedly.
Yeah, would prefer to try the other alternative first.
>>>> +
>>>> + /* get sign bit */
>>>> + sign = LLVMBuildBitCast(builder, a, int_vec_type, "");
>>>> + sign = LLVMBuildAnd(builder, sign, mask, "");
>>>> +
>>>> + /* get "ulp is fractional" */
>>>> + absa = lp_build_abs(bld, a);
>>>> + absa = LLVMBuildBitCast(builder, absa, int_vec_type, "");
>>>> + fractulp = lp_build_compare(bld->gallivm, int_type, PIPE_FUNC_LESS, absa, abslimit);
>>>> +
>>>> + /* (sign * 0.5) & intulp */
>>>> + half = LLVMBuildBitCast(builder, half, int_vec_type, "");
>>>> + half = LLVMBuildOr(builder, sign, half, "");
>>>> + half = LLVMBuildAnd(builder, half, fractulp, "");
>>>> + half = LLVMBuildBitCast(builder, half, vec_type, "");
>>>>
>>>> res = LLVMBuildFAdd(builder, a, half, "");
>>>> }
>>>>
>>>
>>> Though if the caller is lp_build_round, it actually seems not too
>>> complicated - the abs() from both will be the same, and the comparison
>>> could be made (nearly) the same too (lp_build_round and friends are
>>> using 2^24, though my comments are wrong there and it could just use
>>> 2^23 instead).
>>> Or would something like adding "largest float smaller than 0.5" first,
>>> then add the remainder to 0.5 later work too? I guess though if you have
>>> type.sign that's not just another "fadd" but also another "or".
>>
>> Hmm, I suppose if we don't care whether x+0.5 goes to x or x+1 (because
>> what we have now isn't ties-to-even anyway) then we could just add
>> "largest float smaller than 0.5" instead of 0.5. I think adding the
>> remainder would only make a difference for an input of 0.5 or -0.5.
> I am not entirely sure if it would work ok, maybe. We've previously
> discovered some of that stuff used when doing float->int tex coord
> conversion
> was very sensitive concerning the values in-between two integral values
> using
> some internal tests (though unsure why exactly as full float precision
> should not be required). Might have been for nearest filtering though
> (which now uses ifloor).
FWIW the attached passed testing on piglit but I realise your concern
was more about other tests.
Thanks,
Richard
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-gallivm-Fix-fallback-iround-handling-of-integral-inp.patch
Type: text/x-patch
Size: 3380 bytes
Desc: not available
URL: <http://lists.freedesktop.org/archives/mesa-dev/attachments/20140723/6d38af8d/attachment-0001.bin>
More information about the mesa-dev
mailing list