<div dir="ltr">Tested on Haswell. Patch works well for me, thanks!<br><br>Tested-by: Vadym Shovkoplias <<a href="mailto:vadym.shovkoplias@globallogic.com">vadym.shovkoplias@globallogic.com</a>><br></div><div class="gmail_extra"><br><div class="gmail_quote">On Fri, Mar 23, 2018 at 8:35 PM, Jason Ekstrand <span dir="ltr"><<a href="mailto:jason@jlekstrand.net" target="_blank">jason@jlekstrand.net</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="HOEnZb"><div class="h5">Otherwise we may end up trying to coalesce in a case such as<br>
<br>
ssa_1 = fadd r1, r2<br>
r3.x = fneg(r2);<br>
r3 = vec4(ssa_1, ssa_1.y, ...)<br>
<br>
and that would cause us to move the writes to r3 from the vec to the<br>
fadd, which would re-order them with respect to the write from the fneg.<br>
To solve this, we just don't coalesce if the destination of the vec is<br>
not SSA.  We could try to get clever and still coalesce if there are no<br>
writes to the destination of the vec between the vec and the ALU<br>
source.  However, since registers only come from phi webs and indirects,<br>
the chances of having a vec with a register destination that is actually<br>
coalescable into its source are very slim.<br>
<br>
Bugzilla: <a href="https://bugs.freedesktop.org/show_bug.cgi?id=105440" rel="noreferrer" target="_blank">https://bugs.freedesktop.org/show_bug.cgi?id=105440</a><br>
Fixes: 2458ea95c56 "nir/lower_vec_to_movs: Coalesce movs on-the-fly when possible"<br>
Reported-by: Vadym Shovkoplias <<a href="mailto:vadym.shovkoplias@globallogic.com">vadym.shovkoplias@globallogic.com</a>><br>
Cc: Andriy Khulap <<a href="mailto:andriy.khulap@globallogic.com">andriy.khulap@globallogic.com</a>><br>
Cc: Vadym Shovkoplias <<a href="mailto:vadym.shovkoplias@globallogic.com">vadym.shovkoplias@globallogic.com</a>><br>
---<br>
 src/compiler/nir/nir_lower_vec_to_movs.c | 7 ++++++-<br>
 1 file changed, 6 insertions(+), 1 deletion(-)<br>
<br>
diff --git a/src/compiler/nir/nir_lower_vec_to_movs.c b/src/compiler/nir/nir_lower_vec_to_movs.c<br>
index 711ddd3..8b24376 100644<br>
--- a/src/compiler/nir/nir_lower_vec_to_movs.c<br>
+++ b/src/compiler/nir/nir_lower_vec_to_movs.c<br>
@@ -230,6 +230,7 @@ lower_vec_to_movs_block(nir_block *block, nir_function_impl *impl)<br>
          continue; /* The loop */<br>
       }<br>
<br>
+      bool vec_had_ssa_dest = vec->dest.dest.is_ssa;<br>
       if (vec->dest.dest.is_ssa) {<br>
          /* Since we insert multiple MOVs, we have a register destination. */<br>
          nir_register *reg = nir_local_reg_create(impl);<br>
@@ -263,7 +264,11 @@ lower_vec_to_movs_block(nir_<wbr>block *block, nir_function_impl *impl)<br>
          if (!(vec->dest.write_mask & (1 << i)))<br>
             continue;<br>
<br>
-         if (!(finished_write_mask & (1 << i)))<br>
+         /* Coalescing moves the register writes from the vec up to the ALU<br>
+          * instruction in the source.  We can only do this if the original<br>
+          * vecN had an SSA destination.<br>
+          */<br>
+         if (vec_had_ssa_dest && !(finished_write_mask & (1 << i)))<br>
             finished_write_mask |= try_coalesce(vec, i);<br>
<br>
          if (!(finished_write_mask & (1 << i)))<br>
--<br>
2.5.0.400.gff86faf<br>
<br>
</div></div></blockquote></div>
</div>
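The reordering hazard described in the quoted commit message can be illustrated with a small stand-alone C model. This is only a sketch under stated assumptions: `vec4f`, `fadd4`, `run_original`, `run_coalesced`, and `show_hazard` are hypothetical names invented for this example and are not NIR API, and the model assumes the vec4 simply copies all four components of ssa_1 into r3.

```c
#include <stdio.h>

/* Toy model only: a "register" is four floats. Nothing here is NIR API. */
typedef struct { float c[4]; } vec4f;

/* Component-wise add, standing in for the fadd ALU instruction. */
vec4f fadd4(vec4f a, vec4f b)
{
   vec4f r;
   for (int i = 0; i < 4; i++)
      r.c[i] = a.c[i] + b.c[i];
   return r;
}

/* Original program order: the vec is the last writer of r3. */
vec4f run_original(vec4f r1, vec4f r2)
{
   vec4f r3 = {{0, 0, 0, 0}};
   vec4f ssa_1 = fadd4(r1, r2);   /* ssa_1 = fadd r1, r2 */
   r3.c[0] = -r2.c[0];            /* r3.x = fneg r2 */
   r3 = ssa_1;                    /* r3 = vec4(ssa_1 ...), assumed identity copy */
   return r3;
}

/* After an incorrect coalesce: the fadd writes r3 directly and the vec
 * is gone, so the fneg becomes the last writer of r3.x.
 */
vec4f run_coalesced(vec4f r1, vec4f r2)
{
   vec4f r3 = {{0, 0, 0, 0}};
   r3 = fadd4(r1, r2);            /* r3 = fadd r1, r2 (write moved up) */
   r3.c[0] = -r2.c[0];            /* r3.x = fneg r2 now clobbers the result */
   return r3;
}

/* Prints r3.x for both orders and returns 1 if they differ. */
int show_hazard(void)
{
   const vec4f r1 = {{1, 2, 3, 4}};
   const vec4f r2 = {{10, 20, 30, 40}};
   vec4f good = run_original(r1, r2);
   vec4f bad = run_coalesced(r1, r2);

   printf("original  r3.x = %g\n", good.c[0]);
   printf("coalesced r3.x = %g\n", bad.c[0]);
   return good.c[0] != bad.c[0];
}
```

Calling `show_hazard()` from a `main` function prints 11 for the original order and -10 for the coalesced order. The difference arises only because r3 is a register with another writer between the fadd and the vec. When the vec has an SSA destination, the pass has just created a fresh register for it, so no such intervening write can exist, which is why the patch gates `try_coalesce` on `vec_had_ssa_dest`.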