Mesa (main): pan/bi: Specialize IDVS in NIR
GitLab Mirror
gitlab-mirror at kemper.freedesktop.org
Tue Feb 22 15:36:00 UTC 2022
Module: Mesa
Branch: main
Commit: e0e63c2a8e6dba9d5806aebe355f16a0431fe64b
URL: http://cgit.freedesktop.org/mesa/mesa/commit/?id=e0e63c2a8e6dba9d5806aebe355f16a0431fe64b
Author: Alyssa Rosenzweig <alyssa at collabora.com>
Date: Fri Feb 18 19:20:27 2022 -0500
pan/bi: Specialize IDVS in NIR
It's a bit more code, but it's needed to chew through control flow since we
don't have a backend version of dead_cf. Results are really good, meaning I
really screwed this up the first time around (hence the cc mesa-stable).
total instructions in shared programs: 1963576 -> 1939513 (-1.23%)
instructions in affected programs: 671053 -> 646990 (-3.59%)
helped: 4436
HURT: 729
helped stats (abs) min: 1.0 max: 43.0 x̄: 5.75 x̃: 6
helped stats (rel) min: 0.21% max: 100.00% x̄: 6.47% x̃: 5.17%
HURT stats (abs) min: 1.0 max: 22.0 x̄: 2.01 x̃: 1
HURT stats (rel) min: 0.50% max: 50.00% x̄: 10.45% x̃: 9.09%
95% mean confidence interval for instructions value: -4.77 -4.55
95% mean confidence interval for instructions %-change: -4.36% -3.80%
Instructions are helped.
total tuples in shared programs: 1533335 -> 1523194 (-0.66%)
tuples in affected programs: 483167 -> 473026 (-2.10%)
helped: 3414
HURT: 1288
helped stats (abs) min: 1.0 max: 20.0 x̄: 3.73 x̃: 2
helped stats (rel) min: 0.27% max: 100.00% x̄: 4.87% x̃: 3.03%
HURT stats (abs) min: 1.0 max: 19.0 x̄: 2.02 x̃: 1
HURT stats (rel) min: 0.24% max: 38.10% x̄: 8.10% x̃: 5.88%
95% mean confidence interval for tuples value: -2.28 -2.03
95% mean confidence interval for tuples %-change: -1.62% -1.02%
Tuples are helped.
total clauses in shared programs: 351432 -> 329158 (-6.34%)
clauses in affected programs: 142237 -> 119963 (-15.66%)
helped: 5328
HURT: 3
helped stats (abs) min: 1.0 max: 43.0 x̄: 4.18 x̃: 4
helped stats (rel) min: 0.74% max: 100.00% x̄: 19.44% x̃: 17.24%
HURT stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
HURT stats (rel) min: 9.09% max: 12.50% x̄: 10.90% x̃: 11.11%
95% mean confidence interval for clauses value: -4.25 -4.11
95% mean confidence interval for clauses %-change: -19.72% -19.12%
Clauses are helped.
total cycles in shared programs: 202830.92 -> 172084.50 (-15.16%)
cycles in affected programs: 117078.42 -> 86332 (-26.26%)
helped: 5450
HURT: 1
helped stats (abs) min: 0.083333 max: 49.0 x̄: 5.64 x̃: 5
helped stats (rel) min: 1.42% max: 100.00% x̄: 27.94% x̃: 25.64%
HURT stats (abs) min: 0.25 max: 0.25 x̄: 0.25 x̃: 0
HURT stats (rel) min: 2.46% max: 2.46% x̄: 2.46% x̃: 2.46%
95% mean confidence interval for cycles value: -5.74 -5.54
95% mean confidence interval for cycles %-change: -28.30% -27.58%
Cycles are helped.
total arith in shared programs: 57274.29 -> 57145.04 (-0.23%)
arith in affected programs: 16418.33 -> 16289.08 (-0.79%)
helped: 2442
HURT: 1784
helped stats (abs) min: 0.041665999999999315 max: 0.75 x̄: 0.14 x̃: 0
helped stats (rel) min: 0.23% max: 100.00% x̄: 5.51% x̃: 2.87%
HURT stats (abs) min: 0.041665999999999315 max: 0.9166670000000003 x̄: 0.12 x̃: 0
HURT stats (rel) min: 0.00% max: 100.00% x̄: 25.13% x̃: 9.09%
95% mean confidence interval for arith value: -0.04 -0.03
95% mean confidence interval for arith %-change: 6.61% 8.24%
Inconclusive result (value mean confidence interval and %-change mean confidence interval disagree).
total texture in shared programs: 12857 -> 12857 (0.00%)
texture in affected programs: 0 -> 0
helped: 0
HURT: 0
total vary in shared programs: 11157.75 -> 11157.75 (0.00%)
vary in affected programs: 0 -> 0
helped: 0
HURT: 0
total ldst in shared programs: 177208 -> 146420 (-17.37%)
ldst in affected programs: 117098 -> 86310 (-26.29%)
helped: 5447
HURT: 0
helped stats (abs) min: 1.0 max: 49.0 x̄: 5.65 x̃: 5
helped stats (rel) min: 1.92% max: 100.00% x̄: 27.91% x̃: 25.64%
95% mean confidence interval for ldst value: -5.75 -5.55
95% mean confidence interval for ldst %-change: -28.27% -27.56%
Ldst are helped.
total quadwords in shared programs: 1436507 -> 1398329 (-2.66%)
quadwords in affected programs: 515101 -> 476923 (-7.41%)
helped: 5150
HURT: 111
helped stats (abs) min: 1.0 max: 39.0 x̄: 7.46 x̃: 6
helped stats (rel) min: 0.17% max: 100.00% x̄: 10.02% x̃: 8.24%
HURT stats (abs) min: 1.0 max: 9.0 x̄: 2.01 x̃: 1
HURT stats (rel) min: 0.43% max: 21.62% x̄: 3.57% x̃: 1.94%
95% mean confidence interval for quadwords value: -7.41 -7.11
95% mean confidence interval for quadwords %-change: -9.98% -9.49%
Quadwords are helped.
total threads in shared programs: 35025 -> 35228 (0.58%)
threads in affected programs: 218 -> 421 (93.12%)
helped: 208
HURT: 5
helped stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
helped stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00%
HURT stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
HURT stats (rel) min: 50.00% max: 50.00% x̄: 50.00% x̃: 50.00%
95% mean confidence interval for threads value: 0.91 0.99
95% mean confidence interval for threads %-change: 93.40% 99.55%
Threads are helped.
total loops in shared programs: 128 -> 125 (-2.34%)
loops in affected programs: 3 -> 0
helped: 3
HURT: 0
helped stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
helped stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00%
total spills in shared programs: 158 -> 149 (-5.70%)
spills in affected programs: 15 -> 6 (-60.00%)
helped: 9
HURT: 0
total fills in shared programs: 1133 -> 966 (-14.74%)
fills in affected programs: 197 -> 30 (-84.77%)
helped: 9
HURT: 0
Signed-off-by: Alyssa Rosenzweig <alyssa at collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15090>
---
src/panfrost/bifrost/bifrost_compile.c | 58 +++++++++++++++++++++++++++++-----
1 file changed, 50 insertions(+), 8 deletions(-)
diff --git a/src/panfrost/bifrost/bifrost_compile.c b/src/panfrost/bifrost/bifrost_compile.c
index ea251343cfb..93be2dea0ec 100644
--- a/src/panfrost/bifrost/bifrost_compile.c
+++ b/src/panfrost/bifrost/bifrost_compile.c
@@ -695,6 +695,27 @@ bi_should_remove_store(nir_intrinsic_instr *intr, enum bi_idvs_mode idvs)
}
}
+static bool
+bifrost_nir_specialize_idvs(nir_builder *b, nir_instr *instr, void *data)
+{
+ enum bi_idvs_mode *idvs = data;
+
+ if (instr->type != nir_instr_type_intrinsic)
+ return false;
+
+ nir_intrinsic_instr *intr = nir_instr_as_intrinsic(instr);
+
+ if (intr->intrinsic != nir_intrinsic_store_output)
+ return false;
+
+ if (bi_should_remove_store(intr, *idvs)) {
+ nir_instr_remove(instr);
+ return true;
+ }
+
+ return false;
+}
+
static void
bi_emit_store_vary(bi_builder *b, nir_intrinsic_instr *instr)
{
@@ -710,12 +731,6 @@ bi_emit_store_vary(bi_builder *b, nir_intrinsic_instr *instr)
unsigned imm_index = 0;
bool immediate = bi_is_intr_immediate(instr, &imm_index, 16);
- /* Skip stores to the wrong kind of variable in a specialized IDVS
- * shader. Backend dead code elimination will clean up the mess.
- */
- if (bi_should_remove_store(instr, b->shader->idvs))
- return;
-
/* Only look at the total components needed. In effect, we fill in all
* the intermediate "holes" in the write mask, since we can't mask off
* stores. Since nir_lower_io_to_temporaries ensures each varying is
@@ -3607,8 +3622,6 @@ bi_optimize_nir(nir_shader *nir, unsigned gpu_id, bool is_blend)
NIR_PASS_V(nir, nir_shader_instructions_pass,
nir_invalidate_divergence, nir_metadata_all, NULL);
}
-
- NIR_PASS(progress, nir, nir_convert_from_ssa, true);
}
/* The cmdstream lowers 8-bit fragment output as 16-bit, so we need to do the
@@ -3871,6 +3884,35 @@ bi_compile_variant_nir(nir_shader *nir,
ctx->info = info;
ctx->idvs = idvs;
+ if (idvs != BI_IDVS_NONE) {
+ /* Specializing shaders for IDVS is destructive, so we need to
+ * clone. However, the last (second) IDVS shader does not need
+ * to be preserved so we can skip cloning that one.
+ */
+ if (offset == 0)
+ ctx->nir = nir = nir_shader_clone(ctx, nir);
+
+ NIR_PASS_V(nir, nir_shader_instructions_pass,
+ bifrost_nir_specialize_idvs,
+ nir_metadata_block_index | nir_metadata_dominance,
+ &idvs);
+
+ /* After specializing, clean up the mess */
+ bool progress = true;
+
+ while (progress) {
+ progress = false;
+
+ NIR_PASS(progress, nir, nir_opt_dce);
+ NIR_PASS(progress, nir, nir_opt_dead_cf);
+ }
+ }
+
+ /* We can only go out-of-SSA after speciailizing IDVS, as opt_dead_cf
+ * doesn't know how to deal with nir_register.
+ */
+ NIR_PASS_V(nir, nir_convert_from_ssa, true);
+
/* If nothing is pushed, all UBOs need to be uploaded */
ctx->ubo_mask = ~0;
More information about the mesa-commit
mailing list