[Mesa-dev] [PATCH 5/5] nir: Recognize open-coded bitfield_reverse.

Matt Turner mattst88 at gmail.com
Mon Feb 8 23:36:48 UTC 2016


On Mon, Feb 8, 2016 at 12:04 PM, Jason Ekstrand <jason at jlekstrand.net> wrote:
> On Thu, Feb 4, 2016 at 5:48 PM, Matt Turner <mattst88 at gmail.com> wrote:
>>
>> Helps 11 shaders in UnrealEngine4 demos.
>>
>> I seriously hope they would have given us bitfieldReverse() if we
>> exposed GL 4.0 (but we do expose ARB_gpu_shader5, so why not use that
>> anyway?).
>>
>> instructions in affected programs: 4875 -> 4633 (-4.96%)
>> cycles in affected programs: 270516 -> 244516 (-9.61%)
>>
>> I suspect there's a *lot* of room to improve nir_search/opt_algebraic's
>> handling of this. We'd actually like to match, e.g., step2 by matching
>> step1 once and then doing a pointer comparison for the second instance
>> of step1, but unfortunately we generate an enormous tuple for instead.
>>
>> The .text size increases by 6.5% and the .data by 17.5%.
>>
>>    text     data  bss    dec    hex  filename
>>   22957    45224    0  68181  10a55  nir_libnir_la-nir_opt_algebraic.o
>>   24461    53160    0  77621  12f35  nir_libnir_la-nir_opt_algebraic.o
>>
>> I'd be happy to remove this if Unreal4 uses bitfieldReverse() if it is
>> in a GL 4.0 context once we expose GL 4.0.
>> ---
>> Maybe it'd be better do make this a separate pass capable of recognizing
>> this
>> pattern without blowing up the compiled code size. Probably worth checking
>> whether they use bitfieldReverse() under GL 4.0 first...
>>
>>  src/compiler/nir/nir_opt_algebraic.py | 12 ++++++++++++
>>  1 file changed, 12 insertions(+)
>>
>> diff --git a/src/compiler/nir/nir_opt_algebraic.py
>> b/src/compiler/nir/nir_opt_algebraic.py
>> index 0a248a2..f92c6b9 100644
>> --- a/src/compiler/nir/nir_opt_algebraic.py
>> +++ b/src/compiler/nir/nir_opt_algebraic.py
>> @@ -311,6 +311,18 @@ optimizations = [
>>       'options->lower_unpack_snorm_4x8'),
>>  ]
>>
>> +def bitfield_reverse(u):
>> +    step1 = ('ior', ('ishl', u, 16), ('ushr', u, 16))
>> +    step2 = ('ior', ('ishl', ('iand', step1, 0x00ff00ff), 8), ('ushr',
>> ('iand', step1, 0xff00ff00), 8))
>> +    step3 = ('ior', ('ishl', ('iand', step2, 0x0f0f0f0f), 4), ('ushr',
>> ('iand', step2, 0xf0f0f0f0), 4))
>> +    step4 = ('ior', ('ishl', ('iand', step3, 0x33333333), 2), ('ushr',
>> ('iand', step3, 0xcccccccc), 2))
>> +    step5 = ('ior', ('ishl', ('iand', step4, 0x55555555), 1), ('ushr',
>> ('iand', step4, 0xaaaaaaaa), 1))
>> +
>> +    return step5
>
>
> Mind calling this "ue4_bitfield_reverse"?  You're not detecting a generic
> bitfield reverse here.  With that, patches 1, 3, and 5 are

Well, none of these optimizations are recognizing generic anything.
What would a "generic bitfield reverse" be? Any sequence of
instructions that performs a bitfield reverse? That sounds hard :)

I find putting a reference to an application in the function name
somewhat distasteful. I'm happy to put a comment above the function
saying that his is the pattern that Unreal Engine 4 uses. We'll figure
out how to rename the function in the unfortunate event that we find
another, different, open-coded bitfield_reverse. :)

> Reviewed-by: Jason Ekstrand <jason.ekstrand at intel.com>

Thanks!

(And thanks to Dylan for helping me figure out the Python in 4/5)


More information about the mesa-dev mailing list