[Mesa-dev] 16-bit comparisons in NIR

Sat Apr 21 00:16:56 UTC 2018

On Fri, Apr 20, 2018 at 5:16 AM, Nicolai Hähnle <nhaehnle at gmail.com> wrote:

> On 20.04.2018 10:21, Iago Toral wrote:
>
>> Hi,
>>
>> while developing support for Vulkan shaderInt16 on Anvil I came across
>> a feature of NIR that was a bit inconvenient: bools are always 32-bit
>> by design, but the Intel hardware produces 16-bit bool results for
>> 16-bit comparisons, so that creates a problem that manifests like this:
>>
>> vec1 32 ssa_21 = fge ssa_20, ssa_16
>> vec1 16 ssa_22 = b2f ssa_21
>>
>
I was thinking about this a bit this morning and it gets even more sticky.
What happens if you have

bool e = (a < b) && (c < d);

where a and b are 16-bit and c and d are 32-bit?  In this case, one
comparison has a 32-bit value and one has a 16-bit value and you have to
pick one for the &&.
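
To make the mixed-width case concrete, here is a minimal C sketch. It
assumes that a comparison yields all-ones for true and zero for false at the
width of its operands, and that the && wants 32-bit 0 / ~0 booleans. The
function names are made up for illustration; none of this is NIR or driver
API.

```c
#include <stdint.h>

/* Assumption: a comparison produces all-ones (true) or zero (false)
 * at the bit width of its operands. */
static inline uint16_t cmp_lt16(int16_t a, int16_t b)
{
    return a < b ? UINT16_MAX : 0;
}

static inline uint32_t cmp_lt32(int32_t a, int32_t b)
{
    return a < b ? UINT32_MAX : 0;
}

/* Widen a 16-bit boolean to 32 bits by replicating its top bit
 * (sign extension), so 0xffff becomes 0xffffffff and 0 stays 0. */
static inline uint32_t widen_bool16(uint16_t b)
{
    return (uint32_t)b | ((b & 0x8000u) ? 0xffff0000u : 0u);
}

/* Naive &&: the 16-bit result is only zero-extended before the AND,
 * so "true && true" comes out as 0x0000ffff instead of ~0. */
static inline uint32_t and_mixed_naive(uint16_t ab, uint32_t cd)
{
    return (uint32_t)ab & cd;
}

/* Normalized &&: widen the 16-bit boolean first, then AND. */
static inline uint32_t and_mixed(uint16_t ab, uint32_t cd)
{
    return widen_bool16(ab) & cd;
}
```

With both comparisons true, and_mixed_naive() gives 0x0000ffff while
and_mixed() gives 0xffffffff, so whichever width gets picked for the &&,
one of the two operands needs a conversion first.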

>> Our CMP instruction will produce a 16-bit boolean result for the first
>> NIR instruction (where NIR expects it to be 32-bit), so by the time we
>> emit the second instruction in the driver the bit-size for the operand
>> of b2f provided by NIR no longer matches the reality and we emit
>> incorrect code.
>>
>> This seems to have been a conscious design choice in NIR, and while
>> discussing this with Jason he was unsure how much we wanted to change
>> this or how to do it, given how thoroughly 32-bit bools are baked into
>> NIR and the complexities that modifying this would also bring to our
>> bit-size validation code.
>>
>> I have been considering alternatives that didn't involve changing NIR
>> to support multiple bit-sizes for booleans:
>>
>> 1) Drivers that need to emit smaller booleans could try to fix the
>> generated NIR by correcting the expected bit-sizes for CMP
>> instructions. This would be rather trivial to implement in drivers (and
>> maybe we could even make a generic pass for other drivers to use if
>> they need it) but this will make the validator complain because it
>> won't recognize comparisons with 16-bit bool outputs as valid NIR
>> opcodes. I also found instances where nir_search would complain about
>> mismatching bit-sizes. I haven't looked any further into it yet though,
>> so maybe we can reasonably work around these issues.
>>
>> 2) Drivers could handle this specially when they emit code from NIR.
>> Specifically, when they see a 32-bit boolean source in an instruction,
>> they would have to search for the instruction that produced that source
>> value and check whether it is a 16-bit or a 32-bit boolean to emit
>> proper code for the instruction.
>>
>> 3) Drivers can just convert the 16-bit bool result they generate for
>> 16-bit cmp to the 32-bit bool that NIR expects, and then possibly run
>> an optimization pass to eliminate these extra conversions and fix up
>> the code accordingly.
>>
>
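
For reference, option 2 might look roughly like the sketch below. This is a
toy IR invented for illustration (the toy_* names are hypothetical and are
not NIR's real data structures), and it assumes that a comparison produces a
boolean as wide as its operands while every other instruction produces what
the IR declares.

```c
#include <stdint.h>

/* Hypothetical toy IR, not NIR. */
enum toy_op { TOY_OP_FGE, TOY_OP_ILT, TOY_OP_IAND, TOY_OP_B2F };

struct toy_instr {
    enum toy_op op;
    unsigned src_bit_size;   /* bit size of the instruction's operands */
    unsigned dest_bit_size;  /* bit size the IR declares for the result */
    int src0;                /* index of the instruction producing the
                              * first source, or -1 if there is none */
};

static inline int toy_is_comparison(enum toy_op op)
{
    return op == TOY_OP_FGE || op == TOY_OP_ILT;
}

/* Bit size of the value the hardware really wrote for instruction idx.
 * Assumption: comparisons write a boolean at operand width; everything
 * else is trusted to match the declared destination size. */
static inline unsigned toy_hw_bit_size(const struct toy_instr *instrs,
                                       int idx)
{
    const struct toy_instr *producer = &instrs[idx];

    if (toy_is_comparison(producer->op))
        return producer->src_bit_size;
    return producer->dest_bit_size;
}
```

When emitting the b2f from the example above, the backend would call
toy_hw_bit_size() on the b2f's source and get 16 rather than the declared
32. The sketch only looks one instruction back, so it does not settle what
to do with something like an iand whose two sources turn out to have
different real widths.
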
> radeonsi (NIR) and radv already use option 3, since GCN hardware really
> wants to treat bools as 1-bit values, so that's what I'd suggest. The
> optimizations that clean up the conversions happen in LLVM for us.
>

Is this a GCN thing or an LLVM thing?  It would be neat if your hardware
had 1-bit registers. :-)  We sort-of do but they're special flag registers
and we have very few of them.

--Jason