<div dir="ltr"><div><div class="gmail_quote"><div dir="ltr">On Thu, Aug 2, 2018 at 6:52 PM Kenneth Graunke <<a href="mailto:kenneth@whitecape.org">kenneth@whitecape.org</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">When the SIMD16 Gen4-5 fragment shader payload contains source depth<br> (g2-3), destination stencil (g4), and destination depth (g5-6), the<br> single register of stencil makes the destination depth unaligned.<br> <br> We were generating this instruction in the RT write payload setup:<br> <br> mov(16) m14<1>F g5<8,8,1>F { align1 compr };<br> <br> which is illegal, instructions with a source region spanning more than<br> one register need to be aligned to even registers. This is because the<br> hardware implicitly does (nr | 1) instead of (nr + 1) when splitting the<br> compressed instruction into two mov(8)'s.<br></blockquote><div><br></div><div>This seems like something worth validating. Also, gen4 so maybe meh?</div><div><br></div><div>One nit below. With that fixed and with or without validation, both are</div><div><br></div><div>Reviewed-by: Jason Ekstrand <<a href="mailto:jason@jlekstrand.net">jason@jlekstrand.net</a>><br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"> I believe this would cause the hardware to load g5 twice, replicating<br> subspan 0-1's destination depth to subspan 2-3. This showed up as 2x2<br> artifact blocks in both TIS-100 and Reicast.<br> <br> Normally, we rely on the register allocator to even-align our virtual<br> GRFs. But we don't control the payload, so we need to lower SIMD widths<br> to make it work. To fix this, we teach lower_simd_width about the<br> restriction, and then call it again after lower_load_payload (which is<br> what generates the offending MOV).<br> <br> Fixes: 8aee87fe4cce0a883867df3546db0e0a36908086 (i965: Use SIMD16 instead of SIMD8 on Gen4 when possible.)<br> Bugzilla: <a href="https://bugs.freedesktop.org/show_bug.cgi?id=107212" rel="noreferrer" target="_blank">https://bugs.freedesktop.org/show_bug.cgi?id=107212</a><br> Bugzilla: <a href="https://bugs.freedesktop.org/show_bug.cgi?id=13728" rel="noreferrer" target="_blank">https://bugs.freedesktop.org/show_bug.cgi?id=13728</a><br> Tested-by: Diego Viola <<a href="mailto:diego.viola@gmail.com" target="_blank">diego.viola@gmail.com</a>><br> ---<br> src/intel/compiler/brw_fs.cpp | 20 ++++++++++++++++++++<br> 1 file changed, 20 insertions(+)<br> <br> diff --git a/src/intel/compiler/brw_fs.cpp b/src/intel/compiler/brw_fs.cpp<br> index 20b89035e1f..78dafd9b706 100644<br> --- a/src/intel/compiler/brw_fs.cpp<br> +++ b/src/intel/compiler/brw_fs.cpp<br> @@ -5115,6 +5115,25 @@ get_fpu_lowered_simd_width(const struct gen_device_info *devinfo,<br> }<br> }<br> <br> + if (devinfo->gen < 6) {<br> + /* From the G45 PRM, Volume 4 Page 361:<br> + *<br> + * "Operand Alignment Rule: With the exceptions listed below, a<br> + * source/destination operand in general should be aligned to even<br> + * 256-bit physical register with a region size equal to two 256-bit<br> + * physical registers."<br> + *<br> + * Normally we enforce this by allocating virtual registers to the<br> + * even-aligned class. But we need to handle payload registers.<br> + */<br> + for (unsigned i = 0; i < inst->sources; i++) {<br> + if (inst->src[i].file == FIXED_GRF && (inst->src[i].nr & 1) &&<br> + inst->size_read(i) >= 2 * REG_SIZE) {<br></blockquote><div><br></div><div>How about just inst->size_read(i) > REG_SIZE so that it's "more than one" rather than "at least two"?<br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"> + max_width = MIN2(max_width, 8);<br> + }<br> + }<br> + }<br> +<br> /* From the IVB PRMs:<br> * "When an instruction is SIMD32, the low 16 bits of the execution mask<br> * are applied for both halves of the SIMD32 instruction. If different<br> @@ -6321,6 +6340,7 @@ fs_visitor::optimize()<br> if (OPT(lower_load_payload)) {<br> split_virtual_grfs();<br> OPT(register_coalesce);<br> + OPT(lower_simd_width);<br> OPT(compute_to_mrf);<br> OPT(dead_code_eliminate);<br> }<br> -- <br> 2.18.0<br> <br> _______________________________________________<br> mesa-dev mailing list<br> <a href="mailto:mesa-dev@lists.freedesktop.org" target="_blank">mesa-dev@lists.freedesktop.org</a><br> <a href="https://lists.freedesktop.org/mailman/listinfo/mesa-dev" rel="noreferrer" target="_blank">https://lists.freedesktop.org/mailman/listinfo/mesa-dev</a><br> </blockquote></div></div></div>