[Mesa-dev] [PATCH 4/4] i965/vec4: Perform CSE on CMP(N) instructions.
Matt Turner
mattst88 at gmail.com
Wed Jun 25 14:12:14 PDT 2014
Port of commit b16b3c87 to the vec4 code.
No shader-db improvements, but might as well. The fs backend saw an
improvement because it's scalar and multiple identical CMP instructions
were generated by the SEL peepholes.
---
src/mesa/drivers/dri/i965/brw_vec4_cse.cpp | 17 ++++++++++++++++-
1 file changed, 16 insertions(+), 1 deletion(-)
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_cse.cpp b/src/mesa/drivers/dri/i965/brw_vec4_cse.cpp
index 8013517..1456e3a 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_cse.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_cse.cpp
@@ -56,6 +56,8 @@ is_expression(const vec4_instruction *const inst)
case BRW_OPCODE_SHR:
case BRW_OPCODE_SHL:
case BRW_OPCODE_ASR:
+ case BRW_OPCODE_CMP:
+ case BRW_OPCODE_CMPN:
case BRW_OPCODE_ADD:
case BRW_OPCODE_MUL:
case BRW_OPCODE_FRC:
@@ -135,7 +137,7 @@ vec4_visitor::opt_cse_local(bblock_t *block, exec_list *aeb)
/* Skip some cases. */
if (is_expression(inst) && !inst->predicate && inst->mlen == 0 &&
- !inst->conditional_mod)
+ (inst->dst.file != HW_REG || inst->dst.is_null()))
{
bool found = false;
@@ -200,6 +202,19 @@ vec4_visitor::opt_cse_local(bblock_t *block, exec_list *aeb)
foreach_list_safe(entry_node, aeb) {
aeb_entry *entry = (aeb_entry *)entry_node;
+ /* Kill all AEB entries that write a different value to or read from
+ * the flag register if we just wrote it.
+ */
+ if (inst->writes_flag()) {
+ if (entry->generator->reads_flag() ||
+ (entry->generator->writes_flag() &&
+ !instructions_match(inst, entry->generator))) {
+ entry->remove();
+ ralloc_free(entry);
+ continue;
+ }
+ }
+
for (int i = 0; i < 3; i++) {
src_reg *src = &entry->generator->src[i];
--
1.8.3.2
More information about the mesa-dev
mailing list