[Mesa-dev] [PATCH] i965/fs: Run all of the optimisations after lower_load_payload

Neil Roberts neil at linux.intel.com
Thu Oct 15 07:16:46 PDT 2015


Instead of just running a couple of the possible optimisations in one
single iteration, it now runs the whole loop again after lowering the
load payloads. According to shader-db this gives:

total instructions in shared programs: 6493365 -> 6493289 (-0.00%)
instructions in affected programs:     1696 -> 1620 (-4.48%)
total loops in shared programs:        2237 -> 2237 (0.00%)
helped:                                20
HURT:                                  0

Most of the shaders just benefit from running the register coalesce
pass multiple times. However the following two additionally benefit
from an extra pass of opt_saturate_propagation which causes a bunch of
further optimisations:

steam-metro-last-light-1719.shader_test
steam-metro-last-light-836.shader_test

The optimisations that get run after lowering the load payloads are:

04-14-register_coalesce
04-17-compact_virtual_grfs
05-12-opt_saturate_propagation
06-02-opt_algebraic
06-04-opt_copy_propagate
06-07-dead_code_eliminate
06-12-opt_saturate_propagation
06-14-register_coalesce
06-17-compact_virtual_grfs
07-18-opt_combine_constants

In the first shader this drops four instructions.
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 94 +++++++++++++++++++-----------------
 src/mesa/drivers/dri/i965/brw_fs.h   |  3 ++
 2 files changed, 53 insertions(+), 44 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 01a7c99..8d1db23 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -4754,33 +4754,6 @@ fs_visitor::calculate_register_pressure()
    }
 }
 
-void
-fs_visitor::optimize()
-{
-   /* Start by validating the shader we currently have. */
-   validate();
-
-   /* bld is the common builder object pointing at the end of the program we
-    * used to translate it into i965 IR.  For the optimization and lowering
-    * passes coming next, any code added after the end of the program without
-    * having explicitly called fs_builder::at() clearly points at a mistake.
-    * Ideally optimization passes wouldn't be part of the visitor so they
-    * wouldn't have access to bld at all, but they do, so just in case some
-    * pass forgets to ask for a location explicitly set it to NULL here to
-    * make it trip.  The dispatch width is initialized to a bogus value to
-    * make sure that optimizations set the execution controls explicitly to
-    * match the code they are manipulating instead of relying on the defaults.
-    */
-   bld = fs_builder(this, 64);
-
-   assign_constant_locations();
-   demote_pull_constants();
-
-   validate();
-
-   split_virtual_grfs();
-   validate();
-
 #define OPT(pass, args...) ({                                           \
       pass_num++;                                                       \
       bool this_progress = pass(args);                                  \
@@ -4799,20 +4772,10 @@ fs_visitor::optimize()
       this_progress;                                                    \
    })
 
-   if (unlikely(INTEL_DEBUG & DEBUG_OPTIMIZER)) {
-      char filename[64];
-      snprintf(filename, 64, "%s%d-%s-00-start",
-               stage_abbrev, dispatch_width, nir->info.name);
-
-      backend_shader::dump_instructions(filename);
-   }
-
-   bool progress = false;
-   int iteration = 0;
-   int pass_num = 0;
-
-   OPT(lower_simd_width);
-   OPT(lower_logical_sends);
+void
+fs_visitor::optimize_loop(int &iteration, int &pass_num)
+{
+   bool progress;
 
    do {
       progress = false;
@@ -4839,6 +4802,51 @@ fs_visitor::optimize()
 
       OPT(compact_virtual_grfs);
    } while (progress);
+}
+
+void
+fs_visitor::optimize()
+{
+   /* Start by validating the shader we currently have. */
+   validate();
+
+   /* bld is the common builder object pointing at the end of the program we
+    * used to translate it into i965 IR.  For the optimization and lowering
+    * passes coming next, any code added after the end of the program without
+    * having explicitly called fs_builder::at() clearly points at a mistake.
+    * Ideally optimization passes wouldn't be part of the visitor so they
+    * wouldn't have access to bld at all, but they do, so just in case some
+    * pass forgets to ask for a location explicitly set it to NULL here to
+    * make it trip.  The dispatch width is initialized to a bogus value to
+    * make sure that optimizations set the execution controls explicitly to
+    * match the code they are manipulating instead of relying on the defaults.
+    */
+   bld = fs_builder(this, 64);
+
+   assign_constant_locations();
+   demote_pull_constants();
+
+   validate();
+
+   split_virtual_grfs();
+   validate();
+
+   if (unlikely(INTEL_DEBUG & DEBUG_OPTIMIZER)) {
+      char filename[64];
+      snprintf(filename, 64, "%s%d-%s-00-start",
+               stage_abbrev, dispatch_width, nir->info.name);
+
+      backend_shader::dump_instructions(filename);
+   }
+
+   bool progress = false;
+   int iteration = 0;
+   int pass_num = 0;
+
+   OPT(lower_simd_width);
+   OPT(lower_logical_sends);
+
+   optimize_loop(iteration, pass_num);
 
    pass_num = 0;
 
@@ -4846,9 +4854,7 @@ fs_visitor::optimize()
 
    if (OPT(lower_load_payload)) {
       split_virtual_grfs();
-      OPT(register_coalesce);
-      OPT(compute_to_mrf);
-      OPT(dead_code_eliminate);
+      optimize_loop(iteration, pass_num);
    }
 
    OPT(opt_combine_constants);
diff --git a/src/mesa/drivers/dri/i965/brw_fs.h b/src/mesa/drivers/dri/i965/brw_fs.h
index e8b511f..c12e9e8 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_fs.h
@@ -386,6 +386,9 @@ public:
 
    unsigned promoted_constants;
    brw::fs_builder bld;
+
+private:
+   void optimize_loop(int &iteration, int &pass_num);
 };
 
 /**
-- 
1.9.3



More information about the mesa-dev mailing list