[Mesa-dev] [PATCH 9/9] i965: Use NIR by default for Fragment shaders on SNB+
Jason Ekstrand
jason at jlekstrand.net
Wed Mar 18 08:27:50 PDT 2015
On Wed, Mar 18, 2015 at 7:22 AM, Connor Abbott <cwabbott0 at gmail.com> wrote:
> On Wed, Mar 18, 2015 at 1:56 AM, Matt Turner <mattst88 at gmail.com> wrote:
>> Thanks. Looked through stats and at some of the regressions.
>>
>> Some of the areas I noticed we were doing worse:
>>
>> We generate two CMPs for discard_if; only one without NIR. I think you
>> had an idea about solving this.
>>
>> SEL peephole interference -- we knew about this in general, but I
>> found an interesting failure mode:
>>
>> NIR generates
>>
>> (+f0) if(8) JIP: 4 UIP: 5
>> else(8) JIP: 3
>> mov(8) g124<1>F -g124<8,8,1>F
>> endif(8) JIP: 6
>>
>> We should of course improve dead-control-flow-elimination to remove
>> the else and set the if's predicate_inverse, but non-NIR generates
>>
>> (+f0) sel(8) g124<1>F g25<8,8,1>F -g25<8,8,1>F
>>
>> which is probably better even than a predicated MOV since it's not a
>> partial write and subject to optimization restrictions. The assembly
>> matches the NIR exactly, so I'm assuming NIR is removing a
>> self-assignment in the if-then block, preventing it from turning it
>> into a conditional select.
>
> I think this is because the sel peephole (along with all the other
> optimizations) in NIR only works before we lower to source mods, where
> this code probably looked something like:
>
> ssa_1 = ...;
> if ... {
> } else {
> ssa_2 = fneg ssa_1;
> }
> ssa_3 = phi ssa_1, ssa_2;
More specifically, we probably got
b = foo ? a : -a
from the GLSL source which NIR turns into
if (foo) {
ssa_1 = a;
} else {
ssa_2 = -a;
}
ssa_3 = phi(ssa_1, ssa_2)
which gets copy-propagated to what Connor mentioned above.
> So we can probably fix it just by adding fneg/ineg to
> block_check_for_allowed_instrs().
That was my first thought but we're starting to get into the world of
backend-specific knowledge there. Maybe this is where we just need to
pass in Eric's backend info struct and predicate doing so on whether
or not neg is free? But I guess the only compiler we care about at
the moment where that isn't true is vc4 which can't do control-flow
right now anyway. Thoughts?
We may also want to simply make the backend's peephole smarter.
>>
>> Also, since the fs's SEL peephole only recognizes MOVs to the same
>> destination in order, it doesn't handle tropico-5/387 properly, which
>> already improved from 110 to 79 instructions.
>>
>> I also noticed an interesting pattern in defense-grid/537 (and some
>> others, I think). NIR generates
>>
>> mul(8) g4<1>F g2<8,8,1>F g3<8,8,1>F
>> mul(8) g6<1>F g5<8,8,1>F g4<8,8,1>F
>> mad(8) g127<1>F g6<4,4,1>F g3<4,4,1>F g2<4,4,1>F
>>
>> whereas without NIR we get:
>>
>> mul(8) g5<1>F g2<8,8,1>F g4<8,8,1>F
>> mad(8) g127<1>F g5<4,4,1>F g3<4,4,1>F g5<4,4,1>F
>>
>> Again, the assembly matches the NIR, so NIR is doing something bad.
>>
>> The changes to sun-temple/209 (and a lot of other sun temple shaders)
>> look wrong. I'm not sure if the translation from NIR is doing
>> something bad or what, but we should have 8 texture fetches and we
>> only end up with 2.
>>
>>
>> Some highlights --
>>
>> Some shaders are improved because NIR recognizes exp2(log(x) * y) as
>> pow(x, y) when the GLSL optimizations couldn't, because its dead code
>> elimination is much too stupid to allow tree grafting to put the IR in
>> the form we expect. This accounts for a low of the high percentage
>> gains.
>>
>> It cuts some Kerbal Space Program shaders in half (~1000 -> ~500
>> instructions) and removes ~100 spills and ~100 fills from each. Not
>> sure how it actually accomplished this. Too hard to read the assembly
>> with all of the spills.
>>
>> Code generated by tropico-5/252 is really stupid. We MOVs both before
>> and in an if statement that do the same thing. NIR cleans that up
>> easily as a side-effect of SSA.
>>
>> In strike-suit-zero/156 we emit more MADs that we couldn't recognize
>> from the GLSL IR, since again, its DCE is terrible and we can't
>> combine trees.
>>
>> In the-binding-of-isaac-rebirth/18, it's able to CSE some duplicate
>> rcp expressions in different basic blocks. I'm not sure why the
>> backend isn't doing this.
>>
>>
>> At this point I've looked through all of the shaders helped by >15% or more.
>> _______________________________________________
>> mesa-dev mailing list
>> mesa-dev at lists.freedesktop.org
>> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
More information about the mesa-dev
mailing list