[Mesa-dev] nir: find_msb vs clz

Tue Apr 7 19:21:23 UTC 2020

On 4/1/20 11:52 AM, Eric Anholt wrote:
> I would generally be of the opinion that we should have NIR opcodes
> that match any common hardware instructions, and lowering in algebraic
> to help turn input patterns into clean sequences of hardware
> instructions.

There is quite a bit of benefit to having a single canonical
representation of things in the IR.  Whenever there are multiple ways of
doing the same thing, various passes need to be aware and handle all of
them.  I have two concrete examples.

In NIR there is an fsub(x, y) instruction, but we very quickly convert
that to fadd(x, fneg(y)).  If we didn't, every pattern in opt_algebraic
that handles fadd would also need a variant for fsub.  If a pattern had
four instances of fadd, it would need 2^4 = 16 variants.
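
To make the counting argument concrete, here is a minimal sketch in
Python.  It only borrows the nested-tuple style of expression that
opt_algebraic uses; canonicalize() and variants_without_canonicalization()
are hypothetical helpers written for this illustration, not functions
that exist in Mesa.

```python
from itertools import product


def canonicalize(expr):
    """Rewrite every ('fsub', a, b) as ('fadd', a, ('fneg', b)), bottom-up.

    Expressions are nested tuples of the form (opcode, src0, src1, ...);
    anything that is not a tuple is treated as a leaf (a variable name).
    Hypothetical helper, for illustration only.
    """
    if not isinstance(expr, tuple):
        return expr
    op, *srcs = expr
    srcs = [canonicalize(s) for s in srcs]
    if op == 'fsub':
        a, b = srcs
        return ('fadd', a, ('fneg', b))
    return (op, *srcs)


def variants_without_canonicalization(num_fadds):
    """Enumerate the opcode choices a pattern with num_fadds additions
    would have to cover if fsub were not rewritten away first: each
    position could independently be fadd or fsub."""
    return list(product(('fadd', 'fsub'), repeat=num_fadds))


if __name__ == '__main__':
    print(canonicalize(('fmul', ('fsub', 'a', 'b'), 'c')))
    print(len(variants_without_canonicalization(4)))
```

With the canonical form in place, a pattern only has to be written
against fadd, and the fsub spellings never reach the matcher.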

In NIR there is pack_half_2x16 and pack_half_2x16_split.  I just noticed
the other day that 1f72857739be added some optimization patterns for one
but not the other.  I'll have an MR soon that adds them.

It seems like most of the time when architecture-specific details creep
into NIR instructions, it is done to overcome deficiencies
in the backend IR.  I know that I have done this, and I don't think it's
a problem per se.  However, care should be taken.  I have tried to do
most of these kinds of lowering during much later optimization passes,
for example, to prevent the need for a combinatorial explosion in the
number of patterns in the main block of algebraic optimizations.

There definitely are problems with having a billion patterns in
opt_algebraic.  See !3765 for some discussion on this topic.  Also, as
the number of patterns increases, the size of the state transition
tables increases quadratically.  I suspect we're going to want / need to
refactor the single, giant table of algebraic optimizations in the not
too distant future.

Muchnick has this idea of "levels" of IR.  The IR itself (data
structures) is the same, but the set of allowable constructs changes as
the program proceeds through the phases of compilation.  We have some of
that now with source modifiers and 1-bit vs. 32-bit Booleans.  What we
lack is a way for passes to advertise what "levels" they support or to
enforce what features exist at a given time.  I don't know that we need
something that rigid, but right now you just have to know what kinds of
instructions should be able to exist at different points during
compilation.  It's easy to make mistakes, and it's difficult to detect
some classes of those mistakes.
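
As a rough sketch of what "advertising levels" could look like, the
Python below has each pass declare which constructs it accepts, removes,
and adds, and checks a pass ordering against those declarations.
Everything here (Feature, Pass, check_pipeline, and the pass names) is
hypothetical and invented for illustration; nothing like it exists in
NIR today.

```python
from enum import Flag, auto


class Feature(Flag):
    """Constructs that may be present in the IR at a given point."""
    NONE = 0
    BOOL_1BIT = auto()
    BOOL_32BIT = auto()
    SOURCE_MODIFIERS = auto()


class Pass:
    """A pass plus the constructs it accepts, removes, and introduces."""

    def __init__(self, name, accepts, removes=Feature.NONE, adds=Feature.NONE):
        self.name = name
        self.accepts = accepts
        self.removes = removes
        self.adds = adds


def check_pipeline(initial, passes):
    """Walk the passes in order, tracking which constructs may be present.

    Raises ValueError if a pass would see a construct it did not declare
    support for; otherwise returns the constructs present at the end.
    """
    present = initial
    for p in passes:
        unsupported = present & ~p.accepts
        if unsupported:
            raise ValueError(f"{p.name} cannot handle {unsupported}")
        present = (present & ~p.removes) | p.adds
    return present


# Hypothetical pass declarations, for illustration only.
algebraic = Pass("algebraic", accepts=Feature.BOOL_1BIT)
lower_bools = Pass("lower_bools", accepts=Feature.BOOL_1BIT,
                   removes=Feature.BOOL_1BIT, adds=Feature.BOOL_32BIT)
late_algebraic = Pass("late_algebraic",
                      accepts=Feature.BOOL_32BIT | Feature.SOURCE_MODIFIERS)

if __name__ == '__main__':
    order = [algebraic, lower_bools, late_algebraic]
    print(check_pipeline(Feature.BOOL_1BIT, order))
```

Running late_algebraic before lower_bools would raise instead of
silently producing IR that the pass was never written to handle, which
is the class of mistake that is hard to detect today.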

I'm having a conversation with Rhys about this topic in !3151 right now.