[Mesa-dev] [PATCH 2/2] i965: Make DCE set null destinations on messages with side effects.

Kenneth Graunke kenneth at whitecape.org
Sat Jan 14 01:20:03 UTC 2017


(Co-authored by Matt Turner.)

Image atomics, for example, return a value - but the shader may not
want to use it.  We assigned a useless VGRF destination.  This seemed
harmless, but it can actually be quite harmful.  The register allocator
has to assign that VGRF to a real register.  It may assign the same
actual GRF to the destination of an instruction that follows soon after.

This results in a write-after-write (WAW) dependency, and stall.

A number of "Deus Ex: Mankind Divided" shaders use image atomics, but
don't use the return value.  Several of these were hitting WAW stalls
for nearly 14,000 (poorly estimated) cycles a pop.  Making dead code
elimination null out the destination avoids this issue.

This patch cuts one shader's estimated cycles by -98.39%!  Removing the
message response should also help with data cluster bandwidth.

On Skylake:

total instructions in shared programs: 13366907 -> 13363051 (-0.03%)
instructions in affected programs: 49635 -> 45779 (-7.77%)
helped: 133
HURT: 0

total cycles in shared programs: 255433388 -> 248081818 (-2.88%)
cycles in affected programs: 12370702 -> 5019132 (-59.43%)
helped: 100
HURT: 24

Signed-off-by: Kenneth Graunke <kenneth at whitecape.org>
---
 .../dri/i965/brw_fs_dead_code_eliminate.cpp        | 56 ++++++++++++++++------
 1 file changed, 41 insertions(+), 15 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp b/src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp
index 930dc733b45..885ae2638a8 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp
@@ -34,6 +34,43 @@
  * yet in the tail end of this block.
  */
 
+static bool
+can_eliminate(const fs_inst *inst, BITSET_WORD *flag_live)
+{
+    return inst->opcode != BRW_OPCODE_IF &&
+           inst->opcode != BRW_OPCODE_WHILE &&
+           !inst->has_side_effects() &&
+           !(flag_live[0] & inst->flags_written()) &&
+           !inst->writes_accumulator;
+}
+
+static bool
+can_omit_write(const fs_inst *inst, BITSET_WORD *flag_live)
+{
+   switch (inst->opcode) {
+   case SHADER_OPCODE_UNTYPED_ATOMIC:
+   case SHADER_OPCODE_UNTYPED_ATOMIC_LOGICAL:
+   case SHADER_OPCODE_TYPED_ATOMIC:
+   case SHADER_OPCODE_TYPED_ATOMIC_LOGICAL:
+      return true;
+   default:
+      /* If we're going to eliminate the instruction entirely, omitting the
+       * write is always safe.
+       */
+      if (can_eliminate(inst, flag_live))
+         return true;
+
+      /* We can eliminate the destination write for ordinary instructions,
+       * but not most SENDs.
+       */
+      if (inst->opcode < 128 && inst->mlen == 0)
+         return true;
+
+      /* It might not be safe for other virtual opcodes. */
+      return false;
+   }
+}
+
 bool
 fs_visitor::dead_code_eliminate()
 {
@@ -52,31 +89,20 @@ fs_visitor::dead_code_eliminate()
              sizeof(BITSET_WORD));
 
       foreach_inst_in_block_reverse_safe(fs_inst, inst, block) {
-         if (inst->dst.file == VGRF && !inst->has_side_effects()) {
+         if (inst->dst.file == VGRF) {
             const unsigned var = live_intervals->var_from_reg(inst->dst);
             bool result_live = false;
 
             for (unsigned i = 0; i < regs_written(inst); i++)
                result_live |= BITSET_TEST(live, var + i);
 
-            if (!result_live) {
+            if (!result_live && can_omit_write(inst, flag_live)) {
+               inst->dst = fs_reg(retype(brw_null_reg(), inst->dst.type));
                progress = true;
-
-               if (inst->writes_accumulator || inst->flags_written()) {
-                  inst->dst = fs_reg(retype(brw_null_reg(), inst->dst.type));
-               } else {
-                  inst->opcode = BRW_OPCODE_NOP;
-               }
             }
          }
 
-         if ((inst->opcode != BRW_OPCODE_IF &&
-              inst->opcode != BRW_OPCODE_WHILE) &&
-             inst->dst.is_null() &&
-             !inst->has_side_effects() &&
-             !(flag_live[0] & inst->flags_written()) &&
-             !inst->flags_written() &&
-             !inst->writes_accumulator) {
+         if (inst->dst.is_null() && can_eliminate(inst, flag_live)) {
             inst->opcode = BRW_OPCODE_NOP;
             progress = true;
          }
-- 
2.11.0



More information about the mesa-dev mailing list