[Mesa-dev] [PATCH 2/3] i965/fs: Handle CMP.nz ... 0 and AND.nz ... 1 similarly in cmod propagation

Ian Romanick idr at freedesktop.org
Thu Feb 5 22:55:52 PST 2015


From: Ian Romanick <ian.d.romanick at intel.com>

Especially on platforms that do not natively generate 0u and ~0u for
Boolean results, we generate a lot of sequences where a CMP is
followed by an AND with 1.  emit_bool_to_cond_code does this, for
example.  On ILK, this results in a sequence like:

    add(8)          g3<1>F          g8<8,8,1>F      -g4<0,1,0>F
    cmp.l.f0(8)     g3<1>D          g3<8,8,1>F      0F
    and.nz.f0(8)    null            g3<8,8,1>D      1D
    (+f0) iff(8)    Jump: 6

The AND.nz is redundant: the CMP has already written the same
condition to f0.  By propagating the cmod, we can instead generate:

    add.l.f0(8)     null            g8<8,8,1>F      -g4<0,1,0>F
    (+f0) iff(8)    Jump: 6

Existing code already handles the propagation from the CMP to the ADD.
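
As a rough illustration of why the AND.nz adds nothing, here is a
minimal standalone sketch.  It is not Mesa code: the Inst struct, the
helper names, and the "garbage upper bits" model of a CMP result on
platforms without 0u/~0u Booleans are all assumptions made for the
example.  It only mirrors the conditions this patch checks.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified stand-ins for the real IR; not Mesa's fs_inst.
enum class Op { Add, Cmp, And, Mov };
enum class Cmod { None, Z, NZ, L };

struct Inst {
   Op opcode;
   Cmod cmod;
   bool src1_is_one;   // models inst->src[1].is_one()
   bool src0_negate;   // models inst->src[0].negate
   bool writes_flag;   // models inst->writes_flag()
};

// Assumed model of CMP.l writing a register: only bit 0 carries the
// Boolean, the remaining bits are whatever 'garbage' happens to be.
inline uint32_t cmp_l_result(float a, float b, uint32_t garbage)
{
   return (garbage & ~1u) | (a < b ? 1u : 0u);
}

// Flag value produced by "and.nz.f0 null, d, 1".
inline bool flag_after_and_nz(uint32_t d)
{
   return (d & 1u) != 0;
}

// Same shape as the new early-out: only AND.nz with 1 and an
// un-negated source is considered.
inline bool and_is_candidate(const Inst &inst)
{
   return inst.opcode == Op::And &&
          inst.src1_is_one &&
          inst.cmod == Cmod::NZ &&
          !inst.src0_negate;
}

// The AND can be dropped when the instruction producing its source is
// a CMP that already wrote the flag.
inline bool and_is_redundant(const Inst &and_inst, const Inst &producer)
{
   return and_is_candidate(and_inst) &&
          producer.opcode == Op::Cmp &&
          producer.writes_flag;
}

// The flag from the AND.nz always matches the comparison itself,
// whatever the upper bits of the CMP result contain.
inline void check_flags_agree()
{
   assert(flag_after_and_nz(cmp_l_result(1.0f, 2.0f, 0xdeadbeefu)) == (1.0f < 2.0f));
   assert(flag_after_and_nz(cmp_l_result(2.0f, 1.0f, 0xffffffffu)) == (2.0f < 1.0f));
}
```

An AND.z would not pass and_is_candidate(), which matches the
restriction explained in the comment added by this patch.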

Shader-db results:

GM45 (0x2A42):
total instructions in shared programs: 3542267 -> 3541013 (-0.04%)
instructions in affected programs:     169385 -> 168131 (-0.74%)
helped:                                684
HURT:                                  0
GAINED:                                0
LOST:                                  0

Iron Lake (0x0046):
total instructions in shared programs: 4864611 -> 4863357 (-0.03%)
instructions in affected programs:     166050 -> 164796 (-0.76%)
helped:                                684
HURT:                                  0
GAINED:                                0
LOST:                                  0

Sandy Bridge (0x0116):
total instructions in shared programs: 6853550 -> 6853550 (0.00%)
instructions in affected programs:     0 -> 0
helped:                                0
HURT:                                  0
GAINED:                                0
LOST:                                  0

Ivy Bridge (0x0166):
total instructions in shared programs: 6324560 -> 6324484 (-0.00%)
instructions in affected programs:     18283 -> 18207 (-0.42%)
helped:                                48
HURT:                                  0
GAINED:                                0
LOST:                                  0

Haswell (0x0426):
total instructions in shared programs: 5952024 -> 5951948 (-0.00%)
instructions in affected programs:     18208 -> 18132 (-0.42%)
helped:                                48
HURT:                                  0
GAINED:                                0
LOST:                                  0

Broadwell (0x162E):
total instructions in shared programs: 7040944 -> 7040870 (-0.00%)
instructions in affected programs:     17324 -> 17250 (-0.43%)
helped:                                46
HURT:                                  0
GAINED:                                0
LOST:                                  0

Signed-off-by: Ian Romanick <ian.d.romanick at intel.com>
---
 .../drivers/dri/i965/brw_fs_cmod_propagation.cpp   | 26 +++++++++++++++++++++-
 1 file changed, 25 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_cmod_propagation.cpp b/src/mesa/drivers/dri/i965/brw_fs_cmod_propagation.cpp
index c6384ab..6d3a2f5 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_cmod_propagation.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_cmod_propagation.cpp
@@ -57,7 +57,8 @@ opt_cmod_propagation_local(fs_visitor *v, bblock_t *block)
    foreach_inst_in_block_reverse_safe(fs_inst, inst, block) {
       ip--;
 
-      if ((inst->opcode != BRW_OPCODE_CMP &&
+      if ((inst->opcode != BRW_OPCODE_AND &&
+           inst->opcode != BRW_OPCODE_CMP &&
            inst->opcode != BRW_OPCODE_MOV) ||
           inst->predicate != BRW_PREDICATE_NONE ||
           !inst->dst.is_null() ||
@@ -65,6 +66,19 @@ opt_cmod_propagation_local(fs_visitor *v, bblock_t *block)
           inst->src[0].abs)
          continue;
 
+      /* Only an AND.NZ can be propagated.  Many AND.Z instructions are
+       * generated (for ir_unop_not in fs_visitor::emit_bool_to_cond_code).
+       * Propagating those would require inverting the condition on the CMP.
+       * This changes both the flag value and the register destination of the
+       * CMP.  That result may be used elsewhere, so we can't change its value
+       * on a whim.
+       */
+      if (inst->opcode == BRW_OPCODE_AND &&
+          !(inst->src[1].is_one() &&
+            inst->conditional_mod == BRW_CONDITIONAL_NZ &&
+            !inst->src[0].negate))
+         continue;
+
       if (inst->opcode == BRW_OPCODE_CMP && !inst->src[1].is_zero())
          continue;
 
@@ -80,6 +94,16 @@ opt_cmod_propagation_local(fs_visitor *v, bblock_t *block)
                 scan_inst->dst.reg_offset != inst->src[0].reg_offset)
                break;
 
+            if (inst->opcode == BRW_OPCODE_AND) {
+               if (scan_inst->opcode == BRW_OPCODE_CMP &&
+                   scan_inst->writes_flag()) {
+                  inst->remove(block);
+                  progress = true;
+               }
+
+               break;
+            }
+
             /* If the instruction generating inst's source also wrote the
              * flag, and inst is doing a simple .nz comparison, then inst
              * is redundant - the appropriate value is already in the flag
-- 
2.1.0