[Mesa-dev] [PATCH 3/4] intel/cfg: Always add both successors to a break

Jason Ekstrand jason at jlekstrand.net
Wed Oct 4 23:58:36 UTC 2017


Shader-db results on Sky Lake:

    total instructions in shared programs: 12955125 -> 12953698 (-0.01%)
    instructions in affected programs: 55956 -> 54529 (-2.55%)
    helped: 6
    HURT: 38

All of the hurt programs were hurt by exactly one instruction because
this patch affects copy propagation.  Most of the helped instructions
came from a single orbital explorer shader that was helped by 14.26%

Cc: mesa-stable at lists.freedesktop.org
---
 src/intel/compiler/brw_cfg.cpp | 37 +++++++++++++++++++++++++++++++++++--
 1 file changed, 35 insertions(+), 2 deletions(-)

diff --git a/src/intel/compiler/brw_cfg.cpp b/src/intel/compiler/brw_cfg.cpp
index fad12ee..d8bf725 100644
--- a/src/intel/compiler/brw_cfg.cpp
+++ b/src/intel/compiler/brw_cfg.cpp
@@ -289,9 +289,42 @@ cfg_t::cfg_t(exec_list *instructions)
          assert(cur_while != NULL);
 	 cur->add_successor(mem_ctx, cur_while);
 
+         /* We also add the next block as a successor of the break.  If the
+          * break is predicated, we need to do this because the break may not
+          * be taken.  If the break is not predicated, we add it anyway so
+          * that our live intervals computations will operate as if the break
+          * may or may not be taken.  Consider the following example:
+          *
+          *    vec4 color2;
+          *    while (1) {
+          *       vec4 color = texture();
+          *       if (...) {
+          *          color2 = color * 2;
+          *          break;
+          *       }
+          *    }
+          *    gl_FragColor = color2;
+          *
+          * In this case, the definition of color2 dominates the use because
+          * the loop only has the one exit.  This means that the live range
+          * interval for color2 goes from the statement in the if to it's use
+          * below the loop.  Now suppose that the texture operation has a
+          * header register that gets assigned one of the registers used for
+          * color2.  If the loop condition is non-uniform and some of the
+          * threads will take the break and others will continue.  In this
+          * case, the next pass through the loop, the WE_all setup of the
+          * header register will stomp the disabled channels of color2 and
+          * corrupt the value.
+          *
+          * This same problem can occur if you have a mix of 64, 32, and
+          * 16-bit registers because the channels do not line up or if you
+          * have a SIMD16 program and the first half of one value overlaps the
+          * second half of the other.  To solve it, we simply treat the break
+          * as if it may also continue on because some of the threads may
+          * continue on.
+          */
 	 next = new_block();
-	 if (inst->predicate)
-	    cur->add_successor(mem_ctx, next);
+	 cur->add_successor(mem_ctx, next);
 
 	 set_next_block(&cur, next, ip);
 	 break;
-- 
2.5.0.400.gff86faf



More information about the mesa-dev mailing list