[Mesa-dev] [PATCH 3/5] i965/fs: Handle CMP.nz ... 0 and AND.nz ... 1 similarly in cmod propagation
Ian Romanick
idr at freedesktop.org
Wed Mar 11 13:44:14 PDT 2015
From: Ian Romanick <ian.d.romanick at intel.com>
Espically on platforms that do not natively generate 0u and ~0u for
Boolean results, we generate a lot of sequences where a CMP is
followed by an AND with 1. emit_bool_to_cond_code does this, for
example. On ILK, this results in a sequence like:
add(8) g3<1>F g8<8,8,1>F -g4<0,1,0>F
cmp.l.f0(8) g3<1>D g3<8,8,1>F 0F
and.nz.f0(8) null g3<8,8,1>D 1D
(+f0) iff(8) Jump: 6
The AND.nz is obviously redundant. By propagating the cmod, we can
instead generate
add.l.f0(8) null g8<8,8,1>F -g4<0,1,0>F
(+f0) iff(8) Jump: 6
Existing code already handles the propagation from the CMP to the ADD.
Shader-db results:
GM45 (0x2A42):
total instructions in shared programs: 3550829 -> 3550788 (-0.00%)
instructions in affected programs: 10028 -> 9987 (-0.41%)
helped: 24
Iron Lake (0x0046):
total instructions in shared programs: 4993146 -> 4993105 (-0.00%)
instructions in affected programs: 9675 -> 9634 (-0.42%)
helped: 24
Ivy Bridge (0x0166):
total instructions in shared programs: 6291870 -> 6291794 (-0.00%)
instructions in affected programs: 17914 -> 17838 (-0.42%)
helped: 48
Haswell (0x0426):
total instructions in shared programs: 5779256 -> 5779180 (-0.00%)
instructions in affected programs: 16694 -> 16618 (-0.46%)
helped: 48
Broadwell (0x162E):
total instructions in shared programs: 6823088 -> 6823014 (-0.00%)
instructions in affected programs: 15824 -> 15750 (-0.47%)
helped: 46
No chage on Sandy Bridge or on any platform when NIR is used.
Signed-off-by: Ian Romanick <ian.d.romanick at intel.com>
Cc: Matt Turner <mattst88 at gmail.com>
---
.../drivers/dri/i965/brw_fs_cmod_propagation.cpp | 32 +++++++++++++++++++++-
1 file changed, 31 insertions(+), 1 deletion(-)
diff --git a/src/mesa/drivers/dri/i965/brw_fs_cmod_propagation.cpp b/src/mesa/drivers/dri/i965/brw_fs_cmod_propagation.cpp
index d0ca2f9..6df3d45 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_cmod_propagation.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_cmod_propagation.cpp
@@ -57,7 +57,8 @@ opt_cmod_propagation_local(bblock_t *block)
foreach_inst_in_block_reverse_safe(fs_inst, inst, block) {
ip--;
- if ((inst->opcode != BRW_OPCODE_CMP &&
+ if ((inst->opcode != BRW_OPCODE_AND &&
+ inst->opcode != BRW_OPCODE_CMP &&
inst->opcode != BRW_OPCODE_MOV) ||
inst->predicate != BRW_PREDICATE_NONE ||
!inst->dst.is_null() ||
@@ -65,6 +66,19 @@ opt_cmod_propagation_local(bblock_t *block)
inst->src[0].abs)
continue;
+ /* Only an AND.NZ can be propagated. Many AND.Z instructions are
+ * generated (for ir_unop_not in fs_visitor::emit_bool_to_cond_code).
+ * Propagating those would require inverting the condition on the CMP.
+ * This changes both the flag value and the register destination of the
+ * CMP. That result may be used elsewhere, so we can't change its value
+ * on a whim.
+ */
+ if (inst->opcode == BRW_OPCODE_AND &&
+ !(inst->src[1].is_one() &&
+ inst->conditional_mod == BRW_CONDITIONAL_NZ &&
+ !inst->src[0].negate))
+ continue;
+
if (inst->opcode == BRW_OPCODE_CMP && !inst->src[1].is_zero())
continue;
@@ -80,6 +94,22 @@ opt_cmod_propagation_local(bblock_t *block)
scan_inst->dst.reg_offset != inst->src[0].reg_offset)
break;
+ /* This must be done before the dst.type check because the result
+ * type of the AND will always be D, but the result of the CMP
+ * could be anything. The assumption is that the AND is just
+ * figuring out what the result of the previous comparison was
+ * instead of doing a new comparison with a different type.
+ */
+ if (inst->opcode == BRW_OPCODE_AND) {
+ if (scan_inst->opcode == BRW_OPCODE_CMP &&
+ scan_inst->writes_flag()) {
+ inst->remove(block);
+ progress = true;
+ }
+
+ break;
+ }
+
/* Comparisons operate differently for ints and floats */
if (scan_inst->dst.type != inst->dst.type)
break;
--
2.1.0
More information about the mesa-dev
mailing list