[Mesa-dev] [PATCH 2/9 v2] intel/compiler: More peephole select

Ian Romanick idr at freedesktop.org
Thu Aug 30 05:35:41 UTC 2018


From: Ian Romanick <ian.d.romanick at intel.com>

Shader-db results:

The one shader hurt for instructions (in the Skylake results below) is a
compute shader whose spills and fills were also hurt.

v2: Fix typo in comment noticed by Caio.

Skylake, Broadwell, and Haswell had similar results. (Skylake shown)
total instructions in shared programs: 15108590 -> 15083798 (-0.16%)
instructions in affected programs: 893759 -> 868967 (-2.77%)
helped: 3616
HURT: 1
helped stats (abs) min: 1 max: 181 x̄: 6.88 x̃: 4
helped stats (rel) min: 0.10% max: 25.00% x̄: 3.93% x̃: 3.20%
HURT stats (abs)   min: 92 max: 92 x̄: 92.00 x̃: 92
HURT stats (rel)   min: 1.92% max: 1.92% x̄: 1.92% x̃: 1.92%
95% mean confidence interval for instructions value: -7.09 -6.62
95% mean confidence interval for instructions %-change: -4.03% -3.82%
Instructions are helped.

total cycles in shared programs: 566165228 -> 565911206 (-0.04%)
cycles in affected programs: 69290937 -> 69036915 (-0.37%)
helped: 2600
HURT: 1050
helped stats (abs) min: 1 max: 4980 x̄: 180.20 x̃: 77
helped stats (rel) min: <.01% max: 71.30% x̄: 9.17% x̃: 5.60%
HURT stats (abs)   min: 1 max: 33336 x̄: 204.27 x̃: 20
HURT stats (rel)   min: <.01% max: 47.61% x̄: 2.95% x̃: 1.43%
95% mean confidence interval for cycles value: -106.62 -32.57
95% mean confidence interval for cycles %-change: -6.04% -5.33%
Cycles are helped.

total spills in shared programs: 11110 -> 11111 (<.01%)
spills in affected programs: 166 -> 167 (0.60%)
helped: 1
HURT: 1

total fills in shared programs: 23168 -> 23182 (0.06%)
fills in affected programs: 438 -> 452 (3.20%)
helped: 1
HURT: 1

Ivy Bridge
total instructions in shared programs: 12030850 -> 11999872 (-0.26%)
instructions in affected programs: 911114 -> 880136 (-3.40%)
helped: 3338
HURT: 18
helped stats (abs) min: 1 max: 99 x̄: 9.32 x̃: 6
helped stats (rel) min: 0.11% max: 31.18% x̄: 5.20% x̃: 3.32%
HURT stats (abs)   min: 2 max: 20 x̄: 7.89 x̃: 6
HURT stats (rel)   min: 0.70% max: 2.59% x̄: 1.63% x̃: 1.70%
95% mean confidence interval for instructions value: -9.52 -8.94
95% mean confidence interval for instructions %-change: -5.33% -4.99%
Instructions are helped.

total cycles in shared programs: 256248948 -> 255778729 (-0.18%)
cycles in affected programs: 70148230 -> 69678011 (-0.67%)
helped: 2745
HURT: 628
helped stats (abs) min: 1 max: 6100 x̄: 210.19 x̃: 90
helped stats (rel) min: <.01% max: 75.90% x̄: 9.68% x̃: 6.31%
HURT stats (abs)   min: 1 max: 31166 x̄: 170.00 x̃: 10
HURT stats (rel)   min: <.01% max: 36.36% x̄: 2.80% x̃: 0.57%
95% mean confidence interval for cycles value: -162.72 -116.09
95% mean confidence interval for cycles %-change: -7.72% -7.00%
Cycles are helped.

total spills in shared programs: 4570 -> 4558 (-0.26%)
spills in affected programs: 173 -> 161 (-6.94%)
helped: 3
HURT: 0

total fills in shared programs: 4823 -> 4814 (-0.19%)
fills in affected programs: 250 -> 241 (-3.60%)
helped: 3
HURT: 0

Sandy Bridge
total instructions in shared programs: 10831562 -> 10822747 (-0.08%)
instructions in affected programs: 235807 -> 226992 (-3.74%)
helped: 800
HURT: 0
helped stats (abs) min: 1 max: 88 x̄: 11.02 x̃: 8
helped stats (rel) min: 0.11% max: 23.08% x̄: 4.70% x̃: 3.36%
95% mean confidence interval for instructions value: -11.93 -10.10
95% mean confidence interval for instructions %-change: -5.00% -4.40%
Instructions are helped.

total cycles in shared programs: 154501635 -> 154382369 (-0.08%)
cycles in affected programs: 4031486 -> 3912220 (-2.96%)
helped: 582
HURT: 270
helped stats (abs) min: 1 max: 2556 x̄: 231.18 x̃: 58
helped stats (rel) min: 0.03% max: 39.24% x̄: 4.25% x̃: 1.75%
HURT stats (abs)   min: 1 max: 1966 x̄: 56.59 x̃: 12
HURT stats (rel)   min: 0.02% max: 67.10% x̄: 3.05% x̃: 0.70%
95% mean confidence interval for cycles value: -167.32 -112.65
95% mean confidence interval for cycles %-change: -2.40% -1.47%
Cycles are helped.

No change on Iron Lake or GM45.

Signed-off-by: Ian Romanick <ian.d.romanick at intel.com>
---
 src/intel/compiler/brw_nir.c | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/src/intel/compiler/brw_nir.c b/src/intel/compiler/brw_nir.c
index 6ce8325a4dd..1d65107a93d 100644
--- a/src/intel/compiler/brw_nir.c
+++ b/src/intel/compiler/brw_nir.c
@@ -567,7 +567,18 @@ brw_nir_optimize(nir_shader *nir, const struct brw_compiler *compiler,
       OPT(nir_opt_dce);
       OPT(nir_opt_cse);
 
-      /* For indirect loads of uniforms (push constants), we assume that array
+      /* Passing 0 to the peephole select pass causes it to convert
+       * if-statements that contain only move instructions in the branches
+       * regardless of the count.
+       *
+       * Passing 1 to the peephole select pass causes it to convert
+       * if-statements that contain at most a single ALU instruction (total)
+       * in both branches.  Before Gen6, some math instructions were
+       * prohibitively expensive and the results of compare operations need an
+       * extra resolve step.  For these reasons, this pass is more harmful
+       * than good on those platforms.
+       *
+       * For indirect loads of uniforms (push constants), we assume that array
        * indices will nearly always be in bounds and the cost of the load is
        * low.  Therefore there shouldn't be a performance benefit to avoid it.
        * However, in vec4 tessellation shaders, these loads operate by
@@ -577,6 +588,8 @@ brw_nir_optimize(nir_shader *nir, const struct brw_compiler *compiler,
          (nir->info.stage == MESA_SHADER_TESS_CTRL ||
           nir->info.stage == MESA_SHADER_TESS_EVAL);
       OPT(nir_opt_peephole_select, 0, is_vec4_tessellation);
+      if (compiler->devinfo->gen >= 6)
+         OPT(nir_opt_peephole_select, 1, is_vec4_tessellation);
 
       OPT(nir_opt_intrinsics);
       OPT(nir_opt_algebraic);
-- 
2.14.4
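
For readers unfamiliar with the pass, the comment added above can be
illustrated with a small standalone sketch. This is not Mesa code:
can_flatten(), with_branch(), and with_select() are hypothetical names, and
the cost model is only an assumption based on the wording of the comment
(moves are free, ALU instructions are counted in total across both branches
and compared against the limit argument).

```c
#include <stdbool.h>

/* Hypothetical model, not the NIR implementation: an if-statement may be
 * flattened when the ALU (non-move) instructions in both branches, taken
 * together, do not exceed the limit.  Moves are never counted.
 */
static bool
can_flatten(unsigned limit, unsigned then_alu, unsigned else_alu)
{
   return then_alu + else_alu <= limit;
}

/* Branching form: one ALU instruction (the add) in the then-branch and
 * only a move in the else-branch.
 */
static float
with_branch(bool cond, float a, float b)
{
   float r;

   if (cond)
      r = a + b;
   else
      r = b;

   return r;
}

/* Flattened form: the add is executed unconditionally and a select picks
 * the result, so no control flow remains.
 */
static float
with_select(bool cond, float a, float b)
{
   float sum = a + b;

   return cond ? sum : b;
}
```

Under this model, with_branch() is left alone when the limit is 0 and
becomes eligible for flattening when the limit is 1, which is the case the
second call in the patch is meant to catch.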


