[Mesa-dev] [PATCH 2/2] nir/algebraic: Optimize common array indexing sequence

Fri Aug 12 02:01:46 UTC 2016

On Thu, 2016-08-11 at 20:54 +0200, Thomas Helland wrote:
> 2016-08-11 18:19 GMT+02:00 Ian Romanick <idr at freedesktop.org>:
> > 
> > ping
> > 
> > On 07/19/2016 10:37 PM, Ian Romanick wrote:
> > > 
> > > From: Ian Romanick <ian.d.romanick at intel.com>
> > > 
> > > Some shaders include code that looks like:
> > > 
> > >    uniform int i;
> > >    uniform vec4 bones[...];
> > > 
> > >    foo(bones[i * 3], bones[i * 3 + 1], bones[i * 3 + 2]);
> > > 
> > > CSE would do some work on this:
> > > 
> > >    x = i * 3
> > >    foo(bones[x], bones[x + 1], bones[x + 2]);
> > > 
> > > The compiler may then add '<< 4 + base' to the index
> > > calculations.
> > > This results in expressions like
> > > 
> > >    x = i * 3
> > >    foo(bones[x << 4], bones[(x + 1) << 4], bones[(x + 2) << 4]);
> > > 
> 
> This may just be me being dense, but why is the compiler adding the
> shift?
> It seems like this would be "un-optimizing" and so it shouldn't be
> added?
> 
> This patch is also a good argument for adding a constant
> reassociation pass.
> At least the (a * #b) << #c    --->    a * (#b << #c) would be caught
> by that.
> 
> I was confused whether the optimization was preserving signed-ness
> correctly
> but after a couple rounds with myself I believe it is correct.
> 
> The change looks good and is:
> 
> Reviewed-by: Thomas Helland <thomashelland90 at gmail.com>
> 
> (Oh, and yes, I'm back to hobby mesa development after finally
> securing a new job and a new apartment)

Congrats and welcome back. 

I did send you an email but just in case you missed it I've picked up
on your loop unrolling work.

> 
> > 
> > > 
> > > Just rearranging the math to produce (i * 48) + 16 saves an
> > > instruction, and it allows CSE to do more work.
> > > 
> > >    x = i * 48;
> > >    foo(bones[x], bones[x + 16], bones[x + 32]);
> > > 
> > > So, ~6 instructions becomes ~3.
> > > 
> > > Some individual shader-db results look pretty bad.  However, I
> > > have a
> > > really, really hard time believing the change in estimated cycles
> > > in,
> > > for example, 3dmmes-taiji/51.shader_test after looking that
> > > change in
> > > the generated code.
> > > 
> > > G45
> > > total instructions in shared programs: 4020840 -> 4010070 (-
> > > 0.27%)
> > > instructions in affected programs: 177460 -> 166690 (-6.07%)
> > > helped: 894
> > > HURT: 0
> > > 
> > > total cycles in shared programs: 98829000 -> 98784990 (-0.04%)
> > > cycles in affected programs: 3936648 -> 3892638 (-1.12%)
> > > helped: 894
> > > HURT: 0
> > > 
> > > Ironlake
> > > total instructions in shared programs: 6418887 -> 6408117 (-
> > > 0.17%)
> > > instructions in affected programs: 177460 -> 166690 (-6.07%)
> > > helped: 894
> > > HURT: 0
> > > 
> > > total cycles in shared programs: 143504542 -> 143460532 (-0.03%)
> > > cycles in affected programs: 3936648 -> 3892638 (-1.12%)
> > > helped: 894
> > > HURT: 0
> > > 
> > > Sandy Bridge
> > > total instructions in shared programs: 8357887 -> 8339251 (-
> > > 0.22%)
> > > instructions in affected programs: 432715 -> 414079 (-4.31%)
> > > helped: 2795
> > > HURT: 0
> > > 
> > > total cycles in shared programs: 118284184 -> 118207412 (-0.06%)
> > > cycles in affected programs: 6114626 -> 6037854 (-1.26%)
> > > helped: 2478
> > > HURT: 317
> > > 
> > > Ivy Bridge
> > > total instructions in shared programs: 7669390 -> 7653822 (-
> > > 0.20%)
> > > instructions in affected programs: 388234 -> 372666 (-4.01%)
> > > helped: 2795
> > > HURT: 0
> > > 
> > > total cycles in shared programs: 68381982 -> 68263684 (-0.17%)
> > > cycles in affected programs: 1972658 -> 1854360 (-6.00%)
> > > helped: 2458
> > > HURT: 307
> > > 
> > > Haswell
> > > total instructions in shared programs: 7082636 -> 7067068 (-
> > > 0.22%)
> > > instructions in affected programs: 388234 -> 372666 (-4.01%)
> > > helped: 2795
> > > HURT: 0
> > > 
> > > total cycles in shared programs: 68282020 -> 68164158 (-0.17%)
> > > cycles in affected programs: 1891820 -> 1773958 (-6.23%)
> > > helped: 2459
> > > HURT: 261
> > > 
> > > Broadwell
> > > total instructions in shared programs: 9002466 -> 8985875 (-
> > > 0.18%)
> > > instructions in affected programs: 658784 -> 642193 (-2.52%)
> > > helped: 2795
> > > HURT: 5
> > > 
> > > total cycles in shared programs: 78503092 -> 78450404 (-0.07%)
> > > cycles in affected programs: 2873304 -> 2820616 (-1.83%)
> > > helped: 2275
> > > HURT: 415
> > > 
> > > Skylake
> > > total instructions in shared programs: 9156978 -> 9140387 (-
> > > 0.18%)
> > > instructions in affected programs: 682625 -> 666034 (-2.43%)
> > > helped: 2795
> > > HURT: 5
> > > 
> > > total cycles in shared programs: 75591392 -> 75550574 (-0.05%)
> > > cycles in affected programs: 3192120 -> 3151302 (-1.28%)
> > > helped: 2271
> > > HURT: 425
> > > 
> > > Signed-off-by: Ian Romanick <ian.d.romanick at intel.com>
> > > ---
> > >  src/compiler/nir/nir_opt_algebraic.py | 11 +++++++++++
> > >  1 file changed, 11 insertions(+)
> > > 
> > > diff --git a/src/compiler/nir/nir_opt_algebraic.py
> > > b/src/compiler/nir/nir_opt_algebraic.py
> > > index 0f0896b..37cb700 100644
> > > --- a/src/compiler/nir/nir_opt_algebraic.py
> > > +++ b/src/compiler/nir/nir_opt_algebraic.py
> > > @@ -119,6 +119,17 @@ optimizations = [
> > >     (('~fadd at 64', a, ('fmul',         c , ('fadd', b, ('fneg',
> > > a)))), ('flrp', a, b, c), '!options->lower_flrp64'),
> > >     (('ffma', a, b, c), ('fadd', ('fmul', a, b), c), 'options-
> > > >lower_ffma'),
> > >     (('~fadd', ('fmul', a, b), c), ('ffma', a, b, c), 'options-
> > > >fuse_ffma'),
> > > +
> > > +   # (a * #b + #c) << #d
> > > +   # ((a * #b) << #d) + (#c << #d)
> > > +   # (a * (#b << #d)) + (#c << #d)
> > > +   (('ishl', ('iadd', ('imul', a, '#b'), '#c'), '#d'),
> > > +    ('iadd', ('imul', a, ('ishl', b, d)), ('ishl', c, d))),
> > > +
> > > +   # (a * #b) << #c
> > > +   # a * (#b << #c)
> > > +   (('ishl', ('imul', a, '#b'), '#c'), ('imul', a, ('ishl', b,
> > > c))),
> > > +
> > >     # Comparison simplifications
> > >     (('~inot', ('flt', a, b)), ('fge', a, b)),
> > >     (('~inot', ('fge', a, b)), ('flt', a, b)),
> > > 
> > 
> > _______________________________________________
> > mesa-dev mailing list
> > mesa-dev at lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
> _______________________________________________
> mesa-dev mailing list
> mesa-dev at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev