<div dir="ltr"><div class="gmail_quote"><div dir="ltr">On Tue, Dec 4, 2018 at 1:18 AM Iago Toral Quiroga <<a href="mailto:itoral@igalia.com">itoral@igalia.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">The implementation of these opcodes in the generator assumes that their<br>
arguments are packed, and it generates register regions based on that<br>
assumption. While this expectation is reasonable for 32-bit,</blockquote><div><br></div><div>Reasonable expectation, sure, but if someone does ddx(f2f32(d)) where d is a double, it's broken for 32-bit as well. Maybe we should back-port this? Either way,</div><div><br></div><div>Reviewed-by: Jason Ekstrand <<a href="mailto:jason@jlekstrand.net">jason@jlekstrand.net</a>><br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"> when we<br>
load 16-bit elements from UBOs we get them with a stride of 2 that we<br>
then need to pack with a stride of 1. Copy propagation can see through this<br>
and rewrite ddx/ddy operands to use the original, strided register, breaking<br>
the implementation in the generator.<br>
---<br>
.../compiler/brw_fs_copy_propagation.cpp | 21 +++++++++++++++++++<br>
1 file changed, 21 insertions(+)<br>
<br>
diff --git a/src/intel/compiler/brw_fs_copy_propagation.cpp b/src/intel/compiler/brw_fs_copy_propagation.cpp<br>
index 58d5080b4e9..c01d4ec4a4f 100644<br>
--- a/src/intel/compiler/brw_fs_copy_propagation.cpp<br>
+++ b/src/intel/compiler/brw_fs_copy_propagation.cpp<br>
@@ -361,6 +361,20 @@ can_take_stride(fs_inst *inst, unsigned arg, unsigned stride,<br>
return true;<br>
}<br>
<br>
+static bool<br>
+instruction_requires_packed_data(fs_inst *inst)<br>
+{<br>
+ switch (inst->opcode) {<br>
+ case FS_OPCODE_DDX_FINE:<br>
+ case FS_OPCODE_DDX_COARSE:<br>
+ case FS_OPCODE_DDY_FINE:<br>
+ case FS_OPCODE_DDY_COARSE:<br>
+ return true;<br>
+ default:<br>
+ return false;<br>
+ }<br>
+}<br>
+<br>
bool<br>
fs_visitor::try_copy_propagate(fs_inst *inst, int arg, acp_entry *entry)<br>
{<br>
@@ -407,6 +421,13 @@ fs_visitor::try_copy_propagate(fs_inst *inst, int arg, acp_entry *entry)<br>
inst->opcode == SHADER_OPCODE_GEN4_SCRATCH_WRITE)<br>
return false;<br>
<br>
+ /* Some instructions implemented in the generator backend, such as<br>
+ * derivatives, assume that their operands are packed so we can't<br>
+ * generally propagate strided regions to them.<br>
+ */<br>
+ if (instruction_requires_packed_data(inst) && entry->src.stride > 1)<br>
+ return false;<br>
+<br>
/* Bail if the result of composing both strides would exceed the<br>
* hardware limit.<br>
*/<br>
-- <br>
2.17.1<br>
<br>
</blockquote></div></div>