[Mesa-dev] [PATCH 9/9] i965: Use NIR by default for Fragment shaders on SNB+

Matt Turner mattst88 at gmail.com
Tue Mar 17 22:56:13 PDT 2015


Thanks. Looked through stats and at some of the regressions.

Some of the areas I noticed we were doing worse:

We generate two CMPs for discard_if; only one without NIR. I think you
had an idea about solving this.

SEL peephole interference -- we knew about this in general, but I
found an interesting failure mode:

NIR generates

(+f0) if(8)     JIP: 4          UIP: 5
else(8)         JIP: 3
mov(8)          g124<1>F        -g124<8,8,1>F
endif(8)        JIP: 6

We should of course improve dead-control-flow-elimination to remove
the else and set the if's predicate_inverse, but non-NIR generates

(+f0) sel(8)    g124<1>F        g25<8,8,1>F     -g25<8,8,1>F

which is probably better even than a predicated MOV since it's not a
partial write and subject to optimization restrictions. The assembly
matches the NIR exactly, so I'm assuming NIR is removing a
self-assignment in the if-then block, preventing it from turning it
into a conditional select.

Also, since the fs's SEL peephole only recognizes MOVs to the same
destination in order, it doesn't handle tropico-5/387 properly, which
already improved from 110 to 79 instructions.

I also noticed an interesting pattern in defense-grid/537 (and some
others, I think). NIR generates

mul(8)  g4<1>F    g2<8,8,1>F  g3<8,8,1>F
mul(8)  g6<1>F    g5<8,8,1>F  g4<8,8,1>F
mad(8)  g127<1>F  g6<4,4,1>F  g3<4,4,1>F  g2<4,4,1>F

whereas without NIR we get:

mul(8)  g5<1>F    g2<8,8,1>F  g4<8,8,1>F
mad(8)  g127<1>F  g5<4,4,1>F  g3<4,4,1>F  g5<4,4,1>F

Again, the assembly matches the NIR, so NIR is doing something bad.

The changes to sun-temple/209 (and a lot of other sun temple shaders)
look wrong. I'm not sure if the translation from NIR is doing
something bad or what, but we should have 8 texture fetches and we
only end up with 2.


Some highlights --

Some shaders are improved because NIR recognizes exp2(log(x) * y) as
pow(x, y) when the GLSL optimizations couldn't, because its dead code
elimination is much too stupid to allow tree grafting to put the IR in
the form we expect. This accounts for a low of the high percentage
gains.

It cuts some Kerbal Space Program shaders in half (~1000 -> ~500
instructions) and removes ~100 spills and ~100 fills from each. Not
sure how it actually accomplished this. Too hard to read the assembly
with all of the spills.

Code generated by tropico-5/252 is really stupid. We MOVs both before
and in an if statement that do the same thing. NIR cleans that up
easily as a side-effect of SSA.

In strike-suit-zero/156 we emit more MADs that we couldn't recognize
from the GLSL IR, since again, its DCE is terrible and we can't
combine trees.

In the-binding-of-isaac-rebirth/18, it's able to CSE some duplicate
rcp expressions in different basic blocks. I'm not sure why the
backend isn't doing this.


At this point I've looked through all of the shaders helped by >15% or more.


More information about the mesa-dev mailing list