[Mesa-dev] [PATCH 1/2] i965: Make DCE set null destinations on messages with side effects.

Kenneth Graunke kenneth at whitecape.org
Tue Dec 13 06:50:55 UTC 2016


(Co-authored by Matt Turner.)

Image atomics, for example, return a value - but the shader may not
want to use it.  We assigned a useless VGRF destination.  This seemed
harmless, but it can actually be quite harmful.  The register allocator
has to assign that VGRF to a real register.  It may assign the same
actual GRF to the destination of an instruction that follows soon after.

This results in a write-after-write (WAW) dependency, and a stall.

A number of "Deus Ex: Mankind Divided" shaders use image atomics, but
don't use the return value.  Several of these were hitting WAW stalls
for nearly 14,000 cycles a pop.  This patch cuts one shader's cycles
by 98.35%!

Making dead code elimination null out the destination avoids this issue.
We can drop the redundant NOP handling as well - a following hunk
already turns instructions with null destinations and no effects into
NOPs.

We do need to special case memory barriers as they need an actual
VGRF to function correctly, even though that register serves only
to create register dependencies.
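
As a rough illustration of the resulting behavior, here is a minimal
standalone sketch.  It is not the i965 code: the Inst, Op and File types
and the toy_dce() function are hypothetical stand-ins, liveness is
reduced to a precomputed flag, and flag/accumulator writes are folded
into a single side_effects bit.  It only shows the two steps described
above: null out any dead VGRF destination (except on memory fences),
then turn instructions with a null destination and no side effects
into NOPs.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-ins for the i965 IR; none of these are Mesa's types.
enum class File { VGRF, Null };
enum class Op { Add, ImageAtomic, MemoryFence, Nop };

struct Inst {
    Op op;
    File dst_file;
    int dst_reg;        // VGRF number; only meaningful when dst_file == VGRF
    bool side_effects;  // assumption: also covers flag/accumulator writes
    bool result_live;   // assumption: precomputed by a liveness analysis
};

// Walks the instructions in reverse, as the real pass does, and returns
// true if anything changed.
bool toy_dce(std::vector<Inst>& insts) {
    bool progress = false;
    for (auto it = insts.rbegin(); it != insts.rend(); ++it) {
        Inst& inst = *it;

        // Step 1: a dead VGRF destination becomes the null register, even
        // when the instruction has side effects.  Memory fences keep their
        // VGRF because they rely on it for register dependencies.
        if (inst.dst_file == File::VGRF &&
            inst.op != Op::MemoryFence &&
            !inst.result_live) {
            inst.dst_file = File::Null;
            progress = true;
        }

        // Step 2: a null destination with no side effects means the
        // instruction does nothing useful, so it becomes a NOP.
        if (inst.dst_file == File::Null &&
            !inst.side_effects &&
            inst.op != Op::Nop) {
            inst.op = Op::Nop;
            progress = true;
        }
    }
    return progress;
}
```

In this sketch a dead ADD ends up as a NOP, a dead image atomic keeps
its opcode but writes the null register, and a memory fence is left
untouched.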

On Skylake:

total instructions in shared programs: 13700090 -> 13699698 (-0.00%)
instructions in affected programs: 13514 -> 13122 (-2.90%)
helped: 2
HURT: 0

total cycles in shared programs: 288879704 -> 278651496 (-3.54%)
cycles in affected programs: 15842112 -> 5613904 (-64.56%)
helped: 19
HURT: 3

total spills in shared programs: 15040 -> 14990 (-0.33%)
spills in affected programs: 140 -> 90 (-35.71%)
helped: 2
HURT: 0

total fills in shared programs: 17328 -> 17264 (-0.37%)
fills in affected programs: 228 -> 164 (-28.07%)
helped: 2
HURT: 0

Signed-off-by: Kenneth Graunke <kenneth at whitecape.org>
---
 src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp | 10 +++-------
 1 file changed, 3 insertions(+), 7 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp b/src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp
index 8a0469a..4c524a4 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp
@@ -52,7 +52,8 @@ fs_visitor::dead_code_eliminate()
              sizeof(BITSET_WORD));
 
       foreach_inst_in_block_reverse_safe(fs_inst, inst, block) {
-         if (inst->dst.file == VGRF && !inst->has_side_effects()) {
+         if (inst->dst.file == VGRF &&
+             inst->opcode != SHADER_OPCODE_MEMORY_FENCE) {
             const unsigned var = live_intervals->var_from_reg(inst->dst);
             bool result_live = false;
 
@@ -60,13 +61,8 @@ fs_visitor::dead_code_eliminate()
                result_live |= BITSET_TEST(live, var + i);
 
             if (!result_live) {
+               inst->dst = fs_reg(retype(brw_null_reg(), inst->dst.type));
                progress = true;
-
-               if (inst->writes_accumulator || inst->flags_written()) {
-                  inst->dst = fs_reg(retype(brw_null_reg(), inst->dst.type));
-               } else {
-                  inst->opcode = BRW_OPCODE_NOP;
-               }
             }
          }
 
-- 
2.10.2