[Mesa-dev] [PATCH 0/3] i965: Delete all of the non-NIR vec4 code

Mon Sep 21 17:45:20 PDT 2015

On Mon, Sep 21, 2015 at 3:18 PM, Jason Ekstrand <jason at jlekstrand.net> wrote:
> At this point, piglit is the same as for GLSL and the shader-db numbers are
> looking pretty good.  On SNB, GLSL vs. NIR for vec4 programs is:
>
>    total instructions in shared programs: 2020573 -> 1822601 (-9.80%)
>    instructions in affected programs:     1883334 -> 1685362 (-10.51%)
>    helped:                                13328
>    HURT:                                  3594
>
> and there are patches on the list that improve this to
>
>    total instructions in shared programs: 2020283 -> 1805487 (-10.63%)
>    instructions in affected programs:     1855759 -> 1640963 (-11.57%)
>    helped:                                14142
>    HURT:                                  2346

Wow, that's great. I didn't realize we were that close.

That said, I don't feel like we're /quite/ ready for this (especially
with outstanding optimization patches on the list). I'm not sure what
patches are pending.

Some things I've seen in digging through hurt programs today:

portal-2/high/5134 emits:

        vec1 ssa_53 = flog2 ssa_52
        vec1 ssa_54 = flog2 ssa_52.y
        vec1 ssa_55 = flog2 ssa_52.z
        vec4 ssa_56 = vec4 ssa_53, ssa_54, ssa_55, ssa_42.w
        vec3 ssa_57 = fmul ssa_56, ssa_3
        vec1 ssa_58 = fexp2 ssa_57
        vec1 ssa_59 = fexp2 ssa_57.y
        vec1 ssa_60 = fexp2 ssa_57.z
        vec4 ssa_61 = vec4 ssa_58, ssa_59, ssa_60, ssa_42.w

which we didn't transform into a vec3 pow with or without NIR, but we
really should. Why isn't NIR able to handle this? (Also, why isn't
".x" printed when the use of an ssa value is scalar? E.g., in the
assignment of ssa_58 the RHS should use ssa_57.x.)
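
For reference, the flog2/fmul/fexp2 sequence in that dump is the usual
expansion pow(x, y) = exp2(y * log2(x)), applied per component. A minimal
sketch of the equivalence, assuming positive bases (the function names
here are made up for illustration and are not NIR or Mesa code):

```python
import math


def pow_via_log2_exp2(base, exponent):
    """Mimic the per-component flog2 -> fmul -> fexp2 sequence."""
    logs = [math.log2(b) for b in base]                # flog2 on .x, .y, .z
    scaled = [l * e for l, e in zip(logs, exponent)]   # vec3 fmul
    return [2.0 ** s for s in scaled]                  # fexp2 on .x, .y, .z


def pow_direct(base, exponent):
    """What a single vec3 pow would compute."""
    return [b ** e for b, e in zip(base, exponent)]


if __name__ == "__main__":
    base = [0.5, 2.0, 3.0]
    exponent = [2.2, 2.2, 2.2]
    print(pow_via_log2_exp2(base, exponent))
    print(pow_direct(base, exponent))
```

The two results agree up to floating-point rounding, which is why
recognizing the pattern and emitting one vec3 pow should be safe here.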

We generate worse code for all_equal/any_nequal/any.

book-of-unwritten-tales/original/vp-33 (a vertex program) uses
DPH and NIR doesn't have DPH. NIR should probably grow a DPH
instruction even if we don't have an optimization to recognize
open-coded DPH.
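
As a reminder of what is being asked for: DPH (homogeneous dot product)
is a 3-component dot product plus the w component of the second operand.
A minimal sketch, with hypothetical helper names, showing that it matches
an open-coded DP4 whose first operand has w forced to 1.0:

```python
def dp4(a, b):
    """Plain 4-component dot product."""
    return sum(x * y for x, y in zip(a, b))


def dph(a, b):
    """Homogeneous dot product: a.xyz . b.xyz + b.w (a.w is ignored)."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2] + b[3]


def dph_open_coded(a, b):
    """The same result written as a DP4 with a.w replaced by 1.0."""
    return dp4((a[0], a[1], a[2], 1.0), b)


if __name__ == "__main__":
    a = (1.0, 2.0, 3.0, 99.0)
    b = (4.0, 5.0, 6.0, 7.0)
    print(dph(a, b), dph_open_coded(a, b))
```

Without a dedicated opcode, a frontend has to emit something like the
open-coded form, which costs an extra move or swizzle setup.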

Lots of things hurt because of lack of global copy/constant
propagation. I think NIR often emits the constant loads in blocks
earlier than their uses and the backend optimizations aren't able to
cope. See team-fortress-2/2197 for example (search for 953267991D, the
bits of 0.0001F printed as a decimal integer, 0x38D1B717 in hex).
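
The 953267991 constant can be checked by reinterpreting the float32
encoding of 0.0001 as an unsigned 32-bit integer. A small sketch (the
helper name is made up for illustration):

```python
import struct


def float_bits(value):
    """Return the IEEE-754 single-precision bit pattern of value as an int."""
    return struct.unpack("<I", struct.pack("<f", value))[0]


if __name__ == "__main__":
    bits = float_bits(0.0001)
    print(bits)       # 953267991
    print(hex(bits))  # 0x38d1b717
```

So the immediate to search for in the disassembly is the decimal form of
the bit pattern 0x38D1B717.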

I remember this issue from the FS/NIR backend as well, but dota-2/504
(and others) emit:

mad(8)  g16<1>.xF  g11<4,4,1>.xF  g12<4,4,1>.xF  g2<4,4,1>.xF
mad(8)  g19<1>.xF  g10<4,4,1>.xF  g12<4,4,1>.xF  g2<4,4,1>.xF
mad(8)  g22<1>.xF  g9<4,4,1>.xF   g12<4,4,1>.xF  g2<4,4,1>.xF
mad(8)  g25<1>.xF  g8<4,4,1>.xF   g12<4,4,1>.xF  g2<4,4,1>.xF
mad(8)  g28<1>.xF  g7<4,4,1>.xF   g12<4,4,1>.xF  g2<4,4,1>.xF
mad(8)  g31<1>.xF  g6<4,4,1>.xF   g12<4,4,1>.xF  g2<4,4,1>.xF

where the multiplication is duplicated. I can't remember what we decided.
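
To spell out the duplication: assuming mad computes src1 * src2 + src0,
all six instructions above share the same product (g12 * g2) and differ
only in the addend. A rough sketch of the two forms, with hypothetical
function names and plain Python floats standing in for registers:

```python
def six_mads(addends, a, b):
    """One multiply-add per result: the product a * b is recomputed each time."""
    return [a * b + c for c in addends]


def mul_then_adds(addends, a, b):
    """One shared multiply followed by one add per result."""
    product = a * b
    return [product + c for c in addends]


if __name__ == "__main__":
    addends = [11.0, 10.0, 9.0, 8.0, 7.0, 6.0]
    print(six_mads(addends, 3.0, 0.5))
    print(mul_then_adds(addends, 3.0, 0.5))
```

The MAD form is six instructions and the shared form is seven (one MUL
plus six ADDs), so which one is better is not obvious from the instruction
count alone.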

Some instruction sequences are improved with MAD, some are hurt. I
have seen Eduardo's patch and will have to think some more about it.
As far as I can tell, this one affects the most shaders.

A good number of dashboard2 shaders are hurt significantly (+20%), but
it's not obvious to me why. I'll keep looking.