[Mesa-dev] nir: find_msb vs clz

Wed Apr 1 18:47:56 UTC 2020

On Wed, Apr 1, 2020 at 2:39 PM Erik Faye-Lund
<erik.faye-lund at collabora.com> wrote:
>
> While working on the NIR to DXIL conversion code for D3D12, I've
> noticed that we're not exactly doing the best we could here.
>
> First some background:
>
> NIR currently has a few instructions that do roughly the same thing:
>
> 1. nir_op_ufind_msb: Finds the index of the most significant bit,
> counting from the least significant bit. It returns -1 on zero-input.
>
> 2. nir_op_ifind_msb: A signed version of ufind_msb; looks for the first
> non sign-bit. It's not terribly interesting in this context, as it can
> be trivially lowered if missing, and it doesn't seem like any hardware
> supports this natively. I'm just mentioning it for completeness.

While I can't speak to the current state of the nouveau NIR backend,
the hardware definitely has both of these. (And a cursory look
indicates that both are properly supported without any unnecessary
lowering.) It's known as "FLO" in the NVIDIA intrinsic names, and it's
the "BFIND" instruction in nv50_ir-speak. It's present on all Fermi+
GPUs.

https://cgit.freedesktop.org/mesa/mesa/tree/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp#n2346
https://cgit.freedesktop.org/mesa/mesa/tree/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp#n946
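
For reference, here is a minimal C sketch of the semantics described in
the quoted text for the two ops. The function names ref_ufind_msb and
ref_ifind_msb are made up for illustration; this is not the Mesa or
nouveau code, just a plain restatement of "index of the most significant
bit, -1 on zero" and its signed counterpart.

```c
#include <stdint.h>

/* Hypothetical reference helper: index of the highest set bit, counted
 * from bit 0 (the least significant bit). Returns -1 when x == 0. */
static int ref_ufind_msb(uint32_t x)
{
    int index = -1;
    while (x != 0) {
        x >>= 1;
        index++;
    }
    return index;
}

/* Hypothetical reference helper for the signed variant: index of the
 * highest bit that differs from the sign bit. Inverting a negative
 * input turns that into an ordinary "highest set bit" search, so both
 * 0 and -1 return -1. */
static int ref_ifind_msb(int32_t x)
{
    uint32_t bits = (uint32_t)x;
    if (x < 0)
        bits = ~bits;
    return ref_ufind_msb(bits);
}
```

With these definitions, ref_ufind_msb(0x10) is 4, and ref_ifind_msb(-2)
is 0, since ~(-2) == 1.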

Note that it has a separate bit for whether it's the signed variant or
not. There's also a "SAMT" variant of it, but I honestly don't
remember what that does exactly. We use it when finding the LSB after
reversing the bits. I think it makes the op return 32-x or something.
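
To illustrate why some adjustment of that kind is needed when finding
the LSB through a bit reversal, here is a small C sketch. It only shows
the arithmetic identity (for 32-bit values the correction is 31 - x); I
am not claiming this is exactly what the hardware variant computes. All
function names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: reverse the order of the 32 bits in x. */
static uint32_t sketch_bitreverse(uint32_t x)
{
    uint32_t r = 0;
    for (int i = 0; i < 32; i++) {
        r = (r << 1) | (x & 1u);
        x >>= 1;
    }
    return r;
}

/* Hypothetical helper: index of the highest set bit, -1 for zero. */
static int sketch_ufind_msb(uint32_t x)
{
    int index = -1;
    while (x != 0) {
        x >>= 1;
        index++;
    }
    return index;
}

/* Lowest set bit via reversal: bit i of x lands at bit 31 - i after
 * the reversal, so the MSB index of the reversed value has to be
 * mapped back with 31 - msb. Zero input still yields -1. */
static int sketch_find_lsb(uint32_t x)
{
    int msb = sketch_ufind_msb(sketch_bitreverse(x));
    return msb < 0 ? -1 : 31 - msb;
}
```

For example, sketch_find_lsb(0x18) is 3: the reversed value has its
highest set bit at index 28, and 31 - 28 == 3.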

Cheers,

  -ilia