<div dir="ltr"><div><div><div><div><div>After giving it some thought, I don't think this patch is quite strong enough to fix the real bug.  The problem isn't that we're reswizzling a register.  The problem is that we're trying to coalesce something like<br><br></div>ssa_1 = fadd r1, r2<br></div>/* Some stuff */<br></div>r3 = vec4(ssa_1, ssa_1.y, ...)<br><br></div>Coalescing these together moves the write to r3 up to the fadd, even though there may have been other writes to r3 in between.<br><br></div><div>Nonetheless, very good job tracking this down! :-)<br></div><div><br></div>--Jason<br></div><div class="gmail_extra"><br><div class="gmail_quote">On Thu, Mar 22, 2018 at 4:30 AM, vadym.shovkoplias <span dir="ltr"><<a href="mailto:vadim.shovkoplias@gmail.com" target="_blank">vadim.shovkoplias@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Bugzilla: <a href="https://bugs.freedesktop.org/show_bug.cgi?id=105440" rel="noreferrer" target="_blank">https://bugs.freedesktop.org/<wbr>show_bug.cgi?id=105440</a><br>
Fixes: 2458ea95c56 "nir/lower_vec_to_movs: Coalesce movs on-the-fly when possible"<br>
Signed-off-by: Andriy Khulap <<a href="mailto:andriy.khulap@globallogic.com">andriy.khulap@globallogic.com</a><wbr>><br>
Signed-off-by: Vadym Shovkoplias <<a href="mailto:vadym.shovkoplias@globallogic.com">vadym.shovkoplias@<wbr>globallogic.com</a>><br>
---<br>
 src/compiler/nir/nir_lower_<wbr>vec_to_movs.c | 7 ++++++-<br>
 1 file changed, 6 insertions(+), 1 deletion(-)<br>
<br>
diff --git a/src/compiler/nir/nir_lower_<wbr>vec_to_movs.c b/src/compiler/nir/nir_lower_<wbr>vec_to_movs.c<br>
index 711ddd3..4758f7d 100644<br>
--- a/src/compiler/nir/nir_lower_<wbr>vec_to_movs.c<br>
+++ b/src/compiler/nir/nir_lower_<wbr>vec_to_movs.c<br>
@@ -166,9 +166,14 @@ try_coalesce(nir_alu_instr *vec, unsigned start_idx)<br>
       /* If we are going to reswizzle the instruction, we can't have any<br>
        * non-per-component sources either.<br>
        */<br>
-      for (unsigned j = 0; j < nir_op_infos[src_alu->op].num_<wbr>inputs; j++)<br>
+      for (unsigned j = 0; j < nir_op_infos[src_alu->op].num_<wbr>inputs; j++) {<br>
          if (nir_op_infos[src_alu->op].<wbr>input_sizes[j] != 0)<br>
             return 0;<br>
+<br>
+         /* Don't coalesce the move when src and dest are the same reg */<br>
+         if (src_matches_dest_reg(&vec-><wbr>dest.dest, &src_alu->src[j].src))<br>
+            return 0;<br>
+      }<br>
    }<br>
<br>
    /* Stash off all of the ALU instruction's swizzles. */<br>
<span class="HOEnZb"><font color="#888888">--<br>
2.7.4<br>
<br>
</font></span></blockquote></div><br></div>