<div dir="ltr"><div><div><div><div><div>After giving it some thought, I don't think this patch is quite strong enough to fix the real bug. The problem isn't that we're reswizzling a register; it's that we're trying to coalesce something like<br><br></div>ssa_1 = fadd r1, r2<br></div>/* Some stuff */<br></div>r3 = vec4(ssa_1, ssa_1.y, ...)<br><br></div>Coalescing these moves the write to r3 up to the fadd, even though there may have been other writes to r3 in between, which would then overwrite the coalesced value.<br><br></div><div>Nonetheless, very good job tracking this down! :-)<br></div><div><br></div>--Jason<br></div><div class="gmail_extra"><br><div class="gmail_quote">On Thu, Mar 22, 2018 at 4:30 AM, vadym.shovkoplias <span dir="ltr"><<a href="mailto:vadim.shovkoplias@gmail.com" target="_blank">vadim.shovkoplias@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Bugzilla: <a href="https://bugs.freedesktop.org/show_bug.cgi?id=105440" rel="noreferrer" target="_blank">https://bugs.freedesktop.org/<wbr>show_bug.cgi?id=105440</a><br> Fixes: 2458ea95c56 "nir/lower_vec_to_movs: Coalesce movs on-the-fly when possible"<br> Signed-off-by: Andriy Khulap <<a href="mailto:andriy.khulap@globallogic.com">andriy.khulap@globallogic.com</a><wbr>><br> Signed-off-by: Vadym Shovkoplias <<a href="mailto:vadym.shovkoplias@globallogic.com">vadym.shovkoplias@<wbr>globallogic.com</a>><br> ---<br> src/compiler/nir/nir_lower_<wbr>vec_to_movs.c | 7 ++++++-<br> 1 file changed, 6 insertions(+), 1 deletion(-)<br> <br> diff --git a/src/compiler/nir/nir_lower_<wbr>vec_to_movs.c b/src/compiler/nir/nir_lower_<wbr>vec_to_movs.c<br> index 711ddd3..4758f7d 100644<br> --- a/src/compiler/nir/nir_lower_<wbr>vec_to_movs.c<br> +++ b/src/compiler/nir/nir_lower_<wbr>vec_to_movs.c<br> @@ -166,9 +166,14 @@ try_coalesce(nir_alu_instr *vec, unsigned start_idx)<br> /* If we are going to reswizzle the instruction, we can't have any<br> * non-per-component 
sources either.<br> */<br> - for (unsigned j = 0; j < nir_op_infos[src_alu->op].num_<wbr>inputs; j++)<br> + for (unsigned j = 0; j < nir_op_infos[src_alu->op].num_<wbr>inputs; j++) {<br> if (nir_op_infos[src_alu->op].<wbr>input_sizes[j] != 0)<br> return 0;<br> +<br> + /* Don't coalesce the mmove when src and dest are the same reg */<br> + if (src_matches_dest_reg(&vec-><wbr>dest.dest, &src_alu->src[j].src))<br> + return 0;<br> + }<br> }<br> <br> /* Stash off all of the ALU instruction's swizzles. */<br> <span class="HOEnZb"><font color="#888888">--<br> 2.7.4<br> <br> </font></span></blockquote></div><br></div>