[Mesa-dev] [PATCH v2 3/3] nv50/ir: run some passes multiple times

Mon Apr 3 15:58:22 UTC 2017

With the shader cache in place, compilation time matters less.

As a side effect, we can afford more optimization work, such as running some
passes a second time, to produce better optimized code.

total instructions in shared programs : 3931743 -> 3917512 (-0.36%)
total gprs used in shared programs    : 481460 -> 481680 (0.05%)
total local used in shared programs   : 27481 -> 26761 (-2.62%)
total bytes used in shared programs   : 36032672 -> 35902648 (-0.36%)

             local     gpr    inst   bytes
    helped      48     133    3843    3843
      hurt       1     295      75      75

Signed-off-by: Karol Herbst <karolherbst at gmail.com>
---
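Note for reviewers: the sketch below illustrates why a second round of the
same passes can help. It is a toy model, not nv50_ir code. The Instr struct,
the three pass functions and liveAfterRounds() are all hypothetical names made
up for this example. The assumption is only that one pass (constant folding
here) can expose an opportunity for a pass that already ran earlier in the
same round (the algebraic x * 1 -> mov x rewrite).

```cpp
#include <cstddef>
#include <vector>

// Toy SSA-style IR. Every name here is hypothetical and unrelated to nv50_ir.
enum class Op { Input, Const, Sub, Mul, Mov };

struct Instr {
   Op op;
   int src0;      // index of the defining instruction, or -1 if unused
   int src1;
   int imm;       // value, only meaningful for Op::Const
   bool isOutput; // result leaves the program, so it must stay alive
   bool dead;
};

typedef std::vector<Instr> Prog;

static Instr make(Op op, int src0, int src1, int imm, bool isOutput)
{
   Instr i = { op, src0, src1, imm, isOutput, false };
   return i;
}

static bool isConst(const Prog &p, int idx, int value)
{
   return idx >= 0 && !p[idx].dead &&
          p[idx].op == Op::Const && p[idx].imm == value;
}

// Stand-in for an algebraic pass: x * 1 -> mov x
static void algebraicOpt(Prog &p)
{
   for (Instr &i : p) {
      if (i.dead || i.op != Op::Mul)
         continue;
      if (isConst(p, i.src1, 1)) {
         i.op = Op::Mov;
         i.src1 = -1;
      } else if (isConst(p, i.src0, 1)) {
         i.op = Op::Mov;
         i.src0 = i.src1;
         i.src1 = -1;
      }
   }
}

// Stand-in for constant folding: sub c0, c1 -> const (c0 - c1)
static void constantFold(Prog &p)
{
   for (Instr &i : p) {
      if (i.dead || i.op != Op::Sub || i.src0 < 0 || i.src1 < 0)
         continue;
      const Instr &a = p[i.src0];
      const Instr &b = p[i.src1];
      if (a.op != Op::Const || b.op != Op::Const)
         continue;
      i.imm = a.imm - b.imm;
      i.op = Op::Const;
      i.src0 = i.src1 = -1;
   }
}

// Stand-in for dead code elimination. Definitions precede uses, so one
// backward sweep is enough to find everything that is no longer referenced.
static void deadCodeElim(Prog &p)
{
   std::vector<bool> used(p.size(), false);
   for (std::size_t n = p.size(); n-- > 0;) {
      Instr &i = p[n];
      if (i.dead)
         continue;
      if (!i.isOutput && !used[n]) {
         i.dead = true;
         continue;
      }
      if (i.src0 >= 0) used[i.src0] = true;
      if (i.src1 >= 0) used[i.src1] = true;
   }
}

// Builds "out = x * (3 - 2)" and returns the live instruction count after
// running the toy pipeline the given number of times.
int liveAfterRounds(int rounds)
{
   Prog p;
   p.push_back(make(Op::Input, -1, -1, 0, false)); // 0: x
   p.push_back(make(Op::Const, -1, -1, 3, false)); // 1: 3
   p.push_back(make(Op::Const, -1, -1, 2, false)); // 2: 2
   p.push_back(make(Op::Sub, 1, 2, 0, false));     // 3: t = 3 - 2
   p.push_back(make(Op::Mul, 0, 3, 0, true));      // 4: out = x * t

   for (int r = 0; r < rounds; ++r) {
      algebraicOpt(p);
      constantFold(p);
      deadCodeElim(p);
   }

   int live = 0;
   for (const Instr &i : p)
      live += i.dead ? 0 : 1;
   return live;
}
```

In this toy program one round leaves 3 live instructions (the multiply still
sees a freshly folded constant 1), while a second round turns the multiply
into a mov and lets dead code elimination drop the constant, leaving 2. A
third round changes nothing.
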
 .../drivers/nouveau/codegen/nv50_ir_peephole.cpp        | 17 +++++++++++------
 1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
index 0de84fe9fc..505de08573 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
@@ -3729,12 +3729,17 @@ Program::optimizeSSA(int level)
    RUN_PASS(1, CopyPropagation, run);
    RUN_PASS(1, MergeSplits, run);
    RUN_PASS(2, GlobalCSE, run);
-   RUN_PASS(1, LocalCSE, run);
-   RUN_PASS(2, AlgebraicOpt, run);
-   RUN_PASS(2, ModifierFolding, run); // before load propagation -> less checks
-   RUN_PASS(1, ConstantFolding, foldAll);
-   RUN_PASS(2, LateAlgebraicOpt, run);
-   RUN_PASS(1, Split64BitOpPreRA, run);
+   for (int i = 0; i < 2; ++i) {
+      RUN_PASS(1, LocalCSE, run);
+      RUN_PASS(2, AlgebraicOpt, run);
+      RUN_PASS(2, ModifierFolding, run); // before load propagation -> less checks
+      RUN_PASS(1, ConstantFolding, foldAll);
+      RUN_PASS(2, LateAlgebraicOpt, run);
+      // only once
+      if (i == 0)
+         RUN_PASS(1, Split64BitOpPreRA, run);
+      RUN_PASS(1, DeadCodeElim, buryAll);
+   }
    RUN_PASS(1, LoadPropagation, run);
    RUN_PASS(1, IndirectPropagation, run);
    RUN_PASS(2, MemoryOpt, run);
-- 
2.12.2