<p dir="ltr">As I said on patch 5, I would like to see some version of it merged, at least for the fs back-end. The vec4 back-end is less of a concern, since we've now verified it and future hardware won't be using it.</p>
<p dir="ltr">Series is Reviewed-by: Jason Ekstrand &lt;<a href="mailto:jason@jlekstrand.net">jason@jlekstrand.net</a>&gt;</p>
<div class="gmail_extra"><br><div class="gmail_quote">On Sep 16, 2016 3:04 PM, "Francisco Jerez" &lt;<a href="mailto:currojerez@riseup.net">currojerez@riseup.net</a>&gt; wrote:<br type="attribution"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">This avoids emitting a few extra instructions required to take the<br>
dispatch mask into account when it's known to be tightly packed.<br>
---<br>
src/mesa/drivers/dri/i965/brw_<wbr>fs_generator.cpp | 4 +++-<br>
src/mesa/drivers/dri/i965/brw_<wbr>vec4_generator.cpp | 8 ++++++--<br>
2 files changed, 9 insertions(+), 3 deletions(-)<br>
<br>
diff --git a/src/mesa/drivers/dri/i965/<wbr>brw_fs_generator.cpp b/src/mesa/drivers/dri/i965/<wbr>brw_fs_generator.cpp<br>
index c510f42..bdeda3b 100644<br>
--- a/src/mesa/drivers/dri/i965/<wbr>brw_fs_generator.cpp<br>
+++ b/src/mesa/drivers/dri/i965/<wbr>brw_fs_generator.cpp<br>
@@ -2045,7 +2045,9 @@ fs_generator::generate_code(<wbr>const cfg_t *cfg, int dispatch_width)<br>
<br>
case SHADER_OPCODE_FIND_LIVE_<wbr>CHANNEL: {<br>
const struct brw_reg mask =<br>
- stage == MESA_SHADER_FRAGMENT ? brw_vmask_reg() : brw_dmask_reg();<br>
+ brw_stage_has_packed_dispatch(<wbr>stage, prog_data) ? brw_imm_ud(~0u) :<br>
+ stage == MESA_SHADER_FRAGMENT ? brw_vmask_reg() :<br>
+ brw_dmask_reg();<br>
brw_find_live_channel(p, dst, mask);<br>
break;<br>
}<br>
diff --git a/src/mesa/drivers/dri/i965/<wbr>brw_vec4_generator.cpp b/src/mesa/drivers/dri/i965/<wbr>brw_vec4_generator.cpp<br>
index f9e6d1c..2bef549 100644<br>
--- a/src/mesa/drivers/dri/i965/<wbr>brw_vec4_generator.cpp<br>
+++ b/src/mesa/drivers/dri/i965/<wbr>brw_vec4_generator.cpp<br>
@@ -1862,9 +1862,13 @@ generate_code(struct brw_codegen *p,<br>
brw_memory_fence(p, dst);<br>
break;<br>
<br>
- case SHADER_OPCODE_FIND_LIVE_<wbr>CHANNEL:<br>
- brw_find_live_channel(p, dst, brw_dmask_reg());<br>
+ case SHADER_OPCODE_FIND_LIVE_<wbr>CHANNEL: {<br>
+ const struct brw_reg mask =<br>
+ brw_stage_has_packed_dispatch(<wbr>nir->stage, &prog_data->base) ?<br>
+ brw_imm_ud(~0u) : brw_dmask_reg();<br>
+ brw_find_live_channel(p, dst, mask);<br>
break;<br>
+ }<br>
<br>
case SHADER_OPCODE_BROADCAST:<br>
assert(inst->force_writemask_<wbr>all);<br>
--<br>
2.9.0<br>
<br>
</blockquote></div></div>