[Mesa-stable] [PATCH 3/4] intel/cfg: Always add both successors to a break
Jason Ekstrand
jason at jlekstrand.net
Wed Oct 4 23:58:36 UTC 2017
Shader-db results on Sky Lake:
total instructions in shared programs: 12955125 -> 12953698 (-0.01%)
instructions in affected programs: 55956 -> 54529 (-2.55%)
helped: 6
HURT: 38
All of the hurt programs were hurt by exactly one instruction because
this patch affects copy propagation. Most of the helped instructions
came from a single orbital explorer shader that was helped by 14.26%
Cc: mesa-stable at lists.freedesktop.org
---
src/intel/compiler/brw_cfg.cpp | 37 +++++++++++++++++++++++++++++++++++--
1 file changed, 35 insertions(+), 2 deletions(-)
diff --git a/src/intel/compiler/brw_cfg.cpp b/src/intel/compiler/brw_cfg.cpp
index fad12ee..d8bf725 100644
--- a/src/intel/compiler/brw_cfg.cpp
+++ b/src/intel/compiler/brw_cfg.cpp
@@ -289,9 +289,42 @@ cfg_t::cfg_t(exec_list *instructions)
assert(cur_while != NULL);
cur->add_successor(mem_ctx, cur_while);
+ /* We also add the next block as a successor of the break. If the
+ * break is predicated, we need to do this because the break may not
+ * be taken. If the break is not predicated, we add it anyway so
+ * that our live intervals computations will operate as if the break
+ * may or may not be taken. Consider the following example:
+ *
+ * vec4 color2;
+ * while (1) {
+ * vec4 color = texture();
+ * if (...) {
+ * color2 = color * 2;
+ * break;
+ * }
+ * }
+ * gl_FragColor = color2;
+ *
+ * In this case, the definition of color2 dominates the use because
+ * the loop only has the one exit. This means that the live range
+ * interval for color2 goes from the statement in the if to it's use
+ * below the loop. Now suppose that the texture operation has a
+ * header register that gets assigned one of the registers used for
+ * color2. If the loop condition is non-uniform and some of the
+ * threads will take the break and others will continue. In this
+ * case, the next pass through the loop, the WE_all setup of the
+ * header register will stomp the disabled channels of color2 and
+ * corrupt the value.
+ *
+ * This same problem can occur if you have a mix of 64, 32, and
+ * 16-bit registers because the channels do not line up or if you
+ * have a SIMD16 program and the first half of one value overlaps the
+ * second half of the other. To solve it, we simply treat the break
+ * as if it may also continue on because some of the threads may
+ * continue on.
+ */
next = new_block();
- if (inst->predicate)
- cur->add_successor(mem_ctx, next);
+ cur->add_successor(mem_ctx, next);
set_next_block(&cur, next, ip);
break;
--
2.5.0.400.gff86faf
More information about the mesa-stable
mailing list