[Mesa-dev] [PATCH 0/6] i965/fs: Integer multiplication improvements
mattst88 at gmail.com
Fri May 15 14:02:03 PDT 2015
This series reworks how we do integer multiplication in the i965/fs
backend and significantly improves code generation for Broadwell's
scalar vertex shaders with NIR by allowing constant propagation into
the MUL instruction (wow, that code was stupid, and it still kind of is).
Before this series, Jason said enabling NIR for scalar vertex shaders
had this result:
total instructions in shared programs: 2724483 -> 2711790 (-0.47%)
instructions in affected programs: 1860859 -> 1848166 (-0.68%)
After this series, the results are:
Broadwell, vertex shaders only, with and without NIR:
total instructions in shared programs: 2742062 -> 2681339 (-2.21%)
instructions in affected programs: 1514770 -> 1454047 (-4.01%)
Along the way, I move when we split integer multiplication into multiple
instructions (on Gen < 8), implement SIMD16 support, and reimplement
integer multiplication on Gen < 8 without using the accumulator.
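To illustrate the accumulator-free idea, here is a minimal sketch in plain C of the arithmetic, not the code from the patches. It assumes the Gen < 8 constraint is that a single MUL only consumes 16 bits of one source, so the low 32 bits of a 32x32 product are assembled from two 32x16 products. The helper name mul32_via_16bit_halves is hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper, for illustration only: compute the low 32 bits of
 * a * b using two multiplies that each consume only 16 bits of b.
 */
static uint32_t mul32_via_16bit_halves(uint32_t a, uint32_t b)
{
    uint32_t b_lo = b & 0xffffu;   /* low 16 bits of b */
    uint32_t b_hi = b >> 16;       /* high 16 bits of b */

    uint32_t lo = a * b_lo;        /* first 32x16 product, wraps mod 2^32 */
    uint32_t hi = a * b_hi;        /* second 32x16 product, wraps mod 2^32 */

    /* a * b = a * b_lo + (a * b_hi) * 2^16 (mod 2^32). Only the low 16
     * bits of hi survive the shift, so this amounts to adding them into
     * the upper half of lo.
     */
    return lo + (hi << 16);
}
```

Because only the low 32 bits are kept, the same bit pattern comes out for signed and unsigned operands, e.g. 0xffffffff * 0xffffffff (that is, -1 * -1) yields 1.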
Patches 3 and 4 add SIMD16 support (for Haswell only, since pre-Gen7
already works, and IVB/BYT have a bug that prevents it from working).
I don't particularly care if those patches go in separately, or if 6/6
is squashed with 4/6, but I think they are instructive -- I'll follow
up with a piglit test that demonstrates that setting the wrong quarter
control indeed generates incorrect results.
I did not touch the imul_high opcode, though it could be lowered in a
similar way (and probably in 3 instructions using the same tricks in 6/6).
That would be a nice follow-on project, and it would mean we'd never see
a MACH instruction during optimizations.