[Mesa-dev] [PATCH 2/3] ptn: Fix all users of ptn_swizzle

Ian Romanick idr at freedesktop.org
Wed Mar 30 23:23:31 UTC 2016


From: Ian Romanick <ian.d.romanick at intel.com>

None of the callers actually wanted what it did.  In ptn_xpd, you only
ever want a vec3 swizzle.  In ptn_tex, you want a swizzle that matches
the number of required texture coordinates.

shader-db results:

G45:
total instructions in shared programs: 4011240 -> 4010911 (-0.01%)
instructions in affected programs: 59232 -> 58903 (-0.56%)
helped: 114
HURT: 0

total cycles in shared programs: 84314194 -> 84313220 (-0.00%)
cycles in affected programs: 779150 -> 778176 (-0.13%)
helped: 110
HURT: 13

Ironlake:
total instructions in shared programs: 6397262 -> 6396605 (-0.01%)
instructions in affected programs: 117402 -> 116745 (-0.56%)
helped: 227
HURT: 0

total cycles in shared programs: 128889798 -> 128888524 (-0.00%)
cycles in affected programs: 1214644 -> 1213370 (-0.10%)
helped: 179
HURT: 44

Sandy Bridge:
total instructions in shared programs: 8467391 -> 8467384 (-0.00%)
instructions in affected programs: 3107 -> 3100 (-0.23%)
helped: 10
HURT: 6

total cycles in shared programs: 117580120 -> 117573448 (-0.01%)
cycles in affected programs: 103158 -> 96486 (-6.47%)
helped: 84
HURT: 11

Ivy Bridge:
total instructions in shared programs: 7774255 -> 7774258 (0.00%)
instructions in affected programs: 1677 -> 1680 (0.18%)
helped: 8
HURT: 6

total cycles in shared programs: 65743828 -> 65739190 (-0.01%)
cycles in affected programs: 89312 -> 84674 (-5.19%)
helped: 78
HURT: 23

Haswell:
total instructions in shared programs: 7107172 -> 7107150 (-0.00%)
instructions in affected programs: 2048 -> 2026 (-1.07%)
helped: 16
HURT: 0

total cycles in shared programs: 64653636 -> 64647486 (-0.01%)
cycles in affected programs: 86836 -> 80686 (-7.08%)
helped: 85
HURT: 17

Broadwell and Skylake:
total instructions in shared programs: 8447529 -> 8447507 (-0.00%)
instructions in affected programs: 2038 -> 2016 (-1.08%)
helped: 16
HURT: 0

total cycles in shared programs: 66418670 -> 66413416 (-0.01%)
cycles in affected programs: 90110 -> 84856 (-5.83%)
helped: 83
HURT: 20

Signed-off-by: Ian Romanick <ian.d.romanick at intel.com>
---

Here's the rub with this patch... some shaders have instruction counts
changed (some decrease, some increase) due to a cascade of effects.
First, by not having some components "used" as TEX sources, some
multiply results are only used once.  These allows the multiplies to be
fused with later adds to form ffmas.

The added MADs generated in the i965 backend can either drastically
improve things or drastically damage things.  Below are two results.
Shader names removed to protect the innocent.

cycles helped: shaders/closed/game-A/fp-794.shader_test FS SIMD16: 830 -> 646 (-22.17%)
cycles HURT:   shaders/closed/game-B/fp-215.shader_test FS SIMD16: 2098 -> 3094 (47.47%)

Neither of these shaders had their instruction counts change, but the
number MADs (which are doubled-up in SIMD16) change.  With the added
MADs, the scheduler chooses a really, really bad ordering for the TEX
instructions.  Since these are ARB_fragment_program shaders, there are
no loops.  I believe the relative change in the cycle counts.

 src/mesa/program/prog_to_nir.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/src/mesa/program/prog_to_nir.c b/src/mesa/program/prog_to_nir.c
index ce25f6d..a6119ae 100644
--- a/src/mesa/program/prog_to_nir.c
+++ b/src/mesa/program/prog_to_nir.c
@@ -59,7 +59,6 @@ struct ptn_compile {
 
 #define SWIZ(X, Y, Z, W) \
    (unsigned[4]){ SWIZZLE_##X, SWIZZLE_##Y, SWIZZLE_##Z, SWIZZLE_##W }
-#define ptn_swizzle(b, src, x, y, z, w) nir_swizzle(b, src, SWIZ(x, y, z, w), 4, true)
 #define ptn_channel(b, src, ch) nir_swizzle(b, src, SWIZ(ch, ch, ch, ch), 1, true)
 
 static nir_ssa_def *
@@ -491,11 +490,11 @@ ptn_xpd(nir_builder *b, nir_alu_dest dest, nir_ssa_def **src)
    ptn_move_dest_masked(b, dest,
                         nir_fsub(b,
                                  nir_fmul(b,
-                                          ptn_swizzle(b, src[0], Y, Z, X, X),
-                                          ptn_swizzle(b, src[1], Z, X, Y, X)),
+                                          nir_swizzle(b, src[0], SWIZ(Y, Z, X, W), 3, true),
+                                          nir_swizzle(b, src[1], SWIZ(Z, X, Y, W), 3, true)),
                                  nir_fmul(b,
-                                          ptn_swizzle(b, src[1], Y, Z, X, X),
-                                          ptn_swizzle(b, src[0], Z, X, Y, X))),
+                                          nir_swizzle(b, src[1], SWIZ(Y, Z, X, W), 3, true),
+                                          nir_swizzle(b, src[0], SWIZ(Z, X, Y, W), 3, true))),
                         WRITEMASK_XYZ);
    ptn_move_dest_masked(b, dest, nir_imm_float(b, 1.0), WRITEMASK_W);
 }
@@ -642,7 +641,8 @@ ptn_tex(nir_builder *b, nir_alu_dest dest, nir_ssa_def **src,
    unsigned src_number = 0;
 
    instr->src[src_number].src =
-      nir_src_for_ssa(ptn_swizzle(b, src[0], X, Y, Z, W));
+      nir_src_for_ssa(nir_swizzle(b, src[0], SWIZ(X, Y, Z, W),
+                                  instr->coord_components, true));
    instr->src[src_number].src_type = nir_tex_src_coord;
    src_number++;
 
-- 
2.5.5



More information about the mesa-dev mailing list