<p dir="ltr"><br>
On Sep 22, 2015 10:01 PM, "Jason Ekstrand" <<a href="mailto:jason@jlekstrand.net">jason@jlekstrand.net</a>> wrote:<br>
><br>
> It's possible that, if a vecN operation is involved in a phi node, we<br>
> could end up moving from a register to itself. If swizzling is involved,<br>
> we need to emit the move. However, if there is no swizzling, then the<br>
> mov is a no-op and we might as well not bother emitting it.<br>
><br>
> Shader-db results on Haswell:<br>
><br>
> total instructions in shared programs: 6262536 -> 6259558 (-0.05%)<br>
> instructions in affected programs: 184780 -> 181802 (-1.61%)<br>
> helped: 838<br>
> HURT: 0</p>
<p dir="ltr">By the way, I have absolutely no idea why this helps more than Matt's patch to delete these in register coalesce.</p>
<p dir="ltr">> ---<br>
> src/glsl/nir/nir_lower_vec_to_movs.c | 19 ++++++++++++++++++-<br>
> 1 file changed, 18 insertions(+), 1 deletion(-)<br>
><br>
> diff --git a/src/glsl/nir/nir_lower_vec_to_movs.c b/src/glsl/nir/nir_lower_vec_to_movs.c<br>
> index 287f2bf..2039891 100644<br>
> --- a/src/glsl/nir/nir_lower_vec_to_movs.c<br>
> +++ b/src/glsl/nir/nir_lower_vec_to_movs.c<br>
> @@ -83,7 +83,24 @@ insert_mov(nir_alu_instr *vec, unsigned start_idx, nir_shader *shader)<br>
> }<br>
> }<br>
><br>
> - nir_instr_insert_before(&vec->instr, &mov->instr);<br>
> + /* In some situations (if the vecN is involved in a phi-web), we can end<br>
> + * up with a mov from a register to itself. Some of those channels may end<br>
> + * up doing nothing and there's no reason to have them as part of the mov.<br>
> + */<br>
> + if (src_matches_dest_reg(&mov->dest.dest, &mov->src[0].src) &&<br>
> + !mov->src[0].abs && !mov->src[0].negate) {<br>
> + for (unsigned i = 0; i < 4; i++) {<br>
> + if (mov->src[0].swizzle[i] == i)<br>
> + mov->dest.write_mask &= ~(1 << i);<br>
> + }<br>
> + }<br>
> +<br>
> + /* Only emit the instruction if it actually does something */<br>
> + if (mov->dest.write_mask) {<br>
> + nir_instr_insert_before(&vec->instr, &mov->instr);<br>
> + } else {<br>
> + ralloc_free(mov);<br>
> + }<br>
><br>
> return mov->dest.write_mask;<br>
> }<br>
> --<br>
> 2.5.0.400.gff86faf<br>
><br>
</p>
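<p dir="ltr">For anyone following along, here is a rough standalone sketch of the channel-pruning idea from the patch. It does not use the real NIR types: struct toy_mov and prune_self_mov() are made-up names for illustration, and registers are modelled as plain integers, so treat it as a model of the logic rather than the actual Mesa code.</p>

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a NIR mov instruction. The real nir_alu_instr
 * layout is different; only the fields needed for the idea are modelled. */
struct toy_mov {
    unsigned dest_reg;    /* register index written by the mov */
    unsigned src_reg;     /* register index read by the mov */
    uint8_t  swizzle[4];  /* source channel read by each dest channel */
    bool     abs;         /* source modifier: absolute value */
    bool     negate;      /* source modifier: negation */
    unsigned write_mask;  /* bit i set => dest channel i is written */
};

/* Drop every channel that copies a register channel onto itself with no
 * source modifier, and return the remaining write mask. A result of 0
 * means the whole mov is a no-op and does not need to be emitted. */
static unsigned
prune_self_mov(struct toy_mov *mov)
{
    assert(mov != NULL);

    if (mov->dest_reg == mov->src_reg && !mov->abs && !mov->negate) {
        for (unsigned i = 0; i < 4; i++) {
            /* Channel i reads channel i of the same register: no effect. */
            if (mov->swizzle[i] == i)
                mov->write_mask &= ~(1u << i);
        }
    }

    return mov->write_mask;
}
```

<p dir="ltr">With an identity swizzle on the same register the mask goes to 0 and the mov can be skipped entirely. With a .yyzx swizzle only the y and z channels are dropped, so the mov is still emitted with a smaller mask. The sketch returns the mask by value, so a caller that frees a fully pruned mov can still use the result afterwards.</p>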