[Mesa-dev] [PATCH] intel/fs: Optimize and simplify the copy propagation dataflow logic.

Francisco Jerez currojerez at riseup.net
Wed Dec 20 02:19:57 UTC 2017


Previously the dataflow propagation algorithm would calculate the ACP
live-in and -out sets in a two-pass fixed-point algorithm.  The first
pass would update the live-out sets of all basic blocks of the program
based on their live-in sets, while the second pass would update the
live-in sets based on the live-out sets.  This is incredibly
inefficient in the typical case where the CFG of the program is
approximately acyclic, because it can take up to 2*n passes for an ACP
entry introduced at the top of the program to reach the bottom (where
n is the number of basic blocks in the program), until which point the
algorithm won't be able to reach a fixed point.

The same effect can be achieved in a single pass by computing the
live-in and -out sets in lock-step, because that makes sure that
processing of any basic block will pick up the updated live-out sets
of the lexically preceding blocks.  This gives the dataflow
propagation algorithm effectively O(n) run-time instead of O(n^2) in
the acyclic case.

The time spent in dataflow propagation is reduced by 30x in the
GLES31.functional.ssbo.layout.random.all_shared_buffer.5 dEQP
test-case on my CHV system (the improvement is likely to be of the
same order of magnitude on other platforms).  This more than reverses
an apparent run-time regression in this test-case from my previous
copy-propagation undefined-value handling patch, which was ultimately
caused by the additional work introduced in that commit to account for
undefined values being multiplied by a huge quadratic factor.

According to Chad this test was failing on CHV due to a 30s time-out
imposed by the Android CTS (this was the case regardless of my
undefined-value handling patch, even though my patch substantially
exacerbated the issue).  On my CHV system this patch reduces the
overall run-time of the test by approximately 12x, getting us to
around 13s, well below the time-out.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104271
Reported-by: Chad Versace <chadversary at chromium.org>
---
 src/intel/compiler/brw_fs_copy_propagation.cpp | 30 ++++++++------------------
 1 file changed, 9 insertions(+), 21 deletions(-)

diff --git a/src/intel/compiler/brw_fs_copy_propagation.cpp b/src/intel/compiler/brw_fs_copy_propagation.cpp
index af5635eacef..8d6ab18cd13 100644
--- a/src/intel/compiler/brw_fs_copy_propagation.cpp
+++ b/src/intel/compiler/brw_fs_copy_propagation.cpp
@@ -228,34 +228,17 @@ fs_copy_prop_dataflow::run()
    do {
       progress = false;
 
-      /* Update liveout for all blocks. */
       foreach_block (block, cfg) {
          if (block->parents.is_empty())
             continue;
 
          for (int i = 0; i < bitset_words; i++) {
             const BITSET_WORD old_liveout = bd[block->num].liveout[i];
-
-            bd[block->num].liveout[i] =
-               bd[block->num].copy[i] | (bd[block->num].livein[i] &
-                                         ~bd[block->num].kill[i]);
-
-            if (old_liveout != bd[block->num].liveout[i])
-               progress = true;
-         }
-      }
-
-      /* Update livein for all blocks.  If a copy is live out of all parent
-       * blocks, it's live coming in to this block.
-       */
-      foreach_block (block, cfg) {
-         if (block->parents.is_empty())
-            continue;
-
-         for (int i = 0; i < bitset_words; i++) {
-            const BITSET_WORD old_livein = bd[block->num].livein[i];
             BITSET_WORD livein_from_any_block = 0;
 
+            /* Update livein for this block.  If a copy is live out of all
+             * parent blocks, it's live coming in to this block.
+             */
             bd[block->num].livein[i] = ~0u;
             foreach_list_typed(bblock_link, parent_link, link, &block->parents) {
                bblock_t *parent = parent_link->block;
@@ -278,7 +261,12 @@ fs_copy_prop_dataflow::run()
              */
             bd[block->num].livein[i] &= livein_from_any_block;
 
-            if (old_livein != bd[block->num].livein[i])
+            /* Update liveout for this block. */
+            bd[block->num].liveout[i] =
+               bd[block->num].copy[i] | (bd[block->num].livein[i] &
+                                         ~bd[block->num].kill[i]);
+
+            if (old_liveout != bd[block->num].liveout[i])
                progress = true;
          }
       }
-- 
2.14.2



More information about the mesa-dev mailing list