[Mesa-dev] [PATCH v2] format_utils: Use a more precise conversion when decreasing bits

Fri Jan 16 05:47:09 PST 2015

Interesting.  I wonder if I leave the division by 255 LLVM will do 
similar magic.  The risk in llvmpipe is most of these operations are 
with i8x16 and i16x8 vector types which don't even have division in x86.

Jose

On 16/01/15 13:40, Neil Roberts wrote:
> Ah that's neat, thanks.
>
> If I compile the existing patch with gcc -O3 it seems to optimize the
> division into shifts anyway like this (for 8-bit to 5-bit):
>
>     0:   89 f8                   mov    %edi,%eax
>     2:   ba 81 80 80 80          mov    $0x80808081,%edx
>     7:   c1 e0 05                shl    $0x5,%eax
>     a:   29 f8                   sub    %edi,%eax
>     c:   83 c0 7f                add    $0x7f,%eax
>     f:   48 0f af c2             imul   %rdx,%rax
>    13:   48 c1 e8 27             shr    $0x27,%rax
>    17:   c3                      retq
>
> It doesn't look quite the same but presumably gcc knows what it's doing.
> In that case maybe it would be better to leave the code as it is so that
> it's a bit easier to understand.
>
> - Neil
>
> Jose Fonseca <jfonseca at vmware.com> writes:
>
>> Yes, the bit shifting can be a crude approximation.
>>
>> llvmpipe did that everywhere but we had to fix it in a few places, e.g.:
>>
>>    https://urldefense.proofpoint.com/v2/url?u=http-3A__cgit.freedesktop.org_mesa_mesa_commit_-3Fid-3D1569b3e536da9337a28a16d0cc6ed07043bf094b&d=AwIBAg&c=Sqcl0Ez6M0X8aeM67LKIiDJAXVeAw-YihVMNtXt-uEs&r=zfmBZnnVGHeYde45pMKNnVyzeaZbdIqVLprmZCM2zzE&m=grjTHlglPFfDI7S-Aw4ngv8Jh31PPevo3ffXc4sjFdQ&s=pHWTInMmX03nvm0BrDtWk8GH_vym9IyQaBF0kjQf16U&e=
>>
>> The multiplication is unavoidable (*), but one can avoid the division of (2*n-1) by doing a series of left shifts by n:
>>
>>    x/(2*n - 1) = (x >> n) + (x >> 2*n) + ...
>>
>> See also
>>
>> https://urldefense.proofpoint.com/v2/url?u=http-3A__cgit.freedesktop.org_mesa_mesa_tree_src_gallium_auxiliary_gallivm_lp-5Fbld-5Farit.c-23n851&d=AwIBAg&c=Sqcl0Ez6M0X8aeM67LKIiDJAXVeAw-YihVMNtXt-uEs&r=zfmBZnnVGHeYde45pMKNnVyzeaZbdIqVLprmZCM2zzE&m=grjTHlglPFfDI7S-Aw4ngv8Jh31PPevo3ffXc4sjFdQ&s=d8m4KwGQlG1eHv0boBFXdGvVC18W3PwNHwsCCmlhf-Q&e=
>>
>> Jose
>>
>> (*) it can be expanded as shifts too, but it wouldn't be worthwhile
>>
>>
>> ________________________________________
>> From: mesa-dev <mesa-dev-bounces at lists.freedesktop.org> on behalf of Neil Roberts <neil at linux.intel.com>
>> Sent: 15 January 2015 16:52
>> To: Jason Ekstrand
>> Cc: mesa-dev at lists.freedesktop.org
>> Subject: Re: [Mesa-dev] [PATCH v2] format_utils: Use a more precise     conversion when decreasing bits
>>
>> Jason Ekstrand <jason at jlekstrand.net> writes:
>>
>>> This looks fine to me.  We should probably also do this for snorm formats.
>>> I don't care if that's part of this or in a separate patch.
>>> --Jason
>>
>> The snorm formats are a bit more fiddly because the hardware doesn't
>> quite seem to be doing what I'd expect. For example, when converting
>> from 16 bits to 8 bits we effectively want to convert from the range
>> [-32767,+32767] to [-127,+127]. I think that means we can just use a
>> formula like this:
>>
>>   a = 127 * b / 32767
>>
>> This almost works except when the result is close to halfway between two
>> integers the hardware seems to round in the wrong direction. For example
>> the input value 16642 according to the formula should be 64.5019, which
>> you'd hope it would round to 65. However it rounds to 64.
>>
>> We could probably bodge the formula to match what Intel hardware does
>> but that seems a bit cheeky considering this is in generic Mesa code. I
>> tried to check what NVidia is doing on my old GeForce 9400 with the
>> binary driver and it seems to be more or less doing the formula but
>> truncating the fractional part instead of rounding. That of course would
>> end up with completely different results again so we wouldn't be able to
>> match both ways of doing it.
>>
>> I made the patch below which seems to correctly implement the formula at
>> least for all values of converting 16-bit to 8-bit. Maybe we could just use
>> it and accept that it doesn't match the hardware in a few cases. I don't
>> think it will affect any Piglit tests which was the original incentive
>> to try and fix this.
>>
>> diff --git a/src/mesa/main/format_utils.h b/src/mesa/main/format_utils.h
>> index 2faaf76..c959b57 100644
>> --- a/src/mesa/main/format_utils.h
>> +++ b/src/mesa/main/format_utils.h
>> @@ -150,8 +150,19 @@ _mesa_snorm_to_snorm(int x, unsigned src_bits, unsigned dst_bits)
>>         return -MAX_INT(dst_bits);
>>      else if (src_bits < dst_bits)
>>         return EXTEND_NORMALIZED_INT(x, src_bits - 1, dst_bits - 1);
>> -   else
>> -      return x >> (src_bits - dst_bits);
>> +   else {
>> +      /* As an example, when converting from 16 bits to 8 bits, when it gets
>> +       * here x will be in the range [-32767,+32767] and we want to map this
>> +       * to [-127,+127]. The -32768 case will have been handled in the first
>> +       * if-clause above. This just adds 32767 so that it is in the range
>> +       * [0,65534], multiplies it by 254 and then divides by 65534. The
>> +       * division is done with only positive numbers because dividing a
>> +       * negative number has undefined rounding in C < 99. After the division
>> +       * it is subtracted by 127 to put in back in the range [-127,127]. */
>> +      return (((x + MAX_INT(src_bits)) * (MAX_INT(dst_bits) * 2) +
>> +               MAX_INT(src_bits)) /
>> +              (MAX_INT(src_bits) * 2) - MAX_INT(dst_bits));
>> +   }
>>   }
>>
>> - Neil
>> _______________________________________________
>> mesa-dev mailing list
>> mesa-dev at lists.freedesktop.org
>> https://urldefense.proofpoint.com/v2/url?u=http-3A__lists.freedesktop.org_mailman_listinfo_mesa-2Ddev&d=AwIGaQ&c=Sqcl0Ez6M0X8aeM67LKIiDJAXVeAw-YihVMNtXt-uEs&r=zfmBZnnVGHeYde45pMKNnVyzeaZbdIqVLprmZCM2zzE&m=vplJMJne1jDNgTuUaoOMyAcSc_2M_htemU1Ape0OnZU&s=QvYoAmyRrXMhcmXGw4RLTWpfwyRxN7TL3B2CMbf3foY&e=