[Mesa-dev] [PATCH 2/2] intel: Fix SIMD16 unaligned payload GRF reads on Gen4-5.
Jason Ekstrand
jason at jlekstrand.net
Fri Aug 3 02:57:27 UTC 2018
On Thu, Aug 2, 2018 at 6:52 PM Kenneth Graunke <kenneth at whitecape.org>
wrote:
> When the SIMD16 Gen4-5 fragment shader payload contains source depth
> (g2-3), destination stencil (g4), and destination depth (g5-6), the
> single register of stencil makes the destination depth unaligned.
>
> We were generating this instruction in the RT write payload setup:
>
> mov(16) m14<1>F g5<8,8,1>F { align1 compr };
>
> which is illegal, instructions with a source region spanning more than
> one register need to be aligned to even registers. This is because the
> hardware implicitly does (nr | 1) instead of (nr + 1) when splitting the
> compressed instruction into two mov(8)'s.
>
This seems like something worth validating. Also, gen4 so maybe meh?
One nit below. With that fixed and with or without validation, both are
Reviewed-by: Jason Ekstrand <jason at jlekstrand.net>
> I believe this would cause the hardware to load g5 twice, replicating
> subspan 0-1's destination depth to subspan 2-3. This showed up as 2x2
> artifact blocks in both TIS-100 and Reicast.
>
> Normally, we rely on the register allocator to even-align our virtual
> GRFs. But we don't control the payload, so we need to lower SIMD widths
> to make it work. To fix this, we teach lower_simd_width about the
> restriction, and then call it again after lower_load_payload (which is
> what generates the offending MOV).
>
> Fixes: 8aee87fe4cce0a883867df3546db0e0a36908086 (i965: Use SIMD16 instead
> of SIMD8 on Gen4 when possible.)
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107212
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=13728
> Tested-by: Diego Viola <diego.viola at gmail.com>
> ---
> src/intel/compiler/brw_fs.cpp | 20 ++++++++++++++++++++
> 1 file changed, 20 insertions(+)
>
> diff --git a/src/intel/compiler/brw_fs.cpp b/src/intel/compiler/brw_fs.cpp
> index 20b89035e1f..78dafd9b706 100644
> --- a/src/intel/compiler/brw_fs.cpp
> +++ b/src/intel/compiler/brw_fs.cpp
> @@ -5115,6 +5115,25 @@ get_fpu_lowered_simd_width(const struct
> gen_device_info *devinfo,
> }
> }
>
> + if (devinfo->gen < 6) {
> + /* From the G45 PRM, Volume 4 Page 361:
> + *
> + * "Operand Alignment Rule: With the exceptions listed below, a
> + * source/destination operand in general should be aligned to
> even
> + * 256-bit physical register with a region size equal to two
> 256-bit
> + * physical registers."
> + *
> + * Normally we enforce this by allocating virtual registers to the
> + * even-aligned class. But we need to handle payload registers.
> + */
> + for (unsigned i = 0; i < inst->sources; i++) {
> + if (inst->src[i].file == FIXED_GRF && (inst->src[i].nr & 1) &&
> + inst->size_read(i) >= 2 * REG_SIZE) {
>
How about just inst->size_read(i) > REG_SIZE so that it's "more than one"
rather than "at least two"?
> + max_width = MIN2(max_width, 8);
> + }
> + }
> + }
> +
> /* From the IVB PRMs:
> * "When an instruction is SIMD32, the low 16 bits of the execution
> mask
> * are applied for both halves of the SIMD32 instruction. If
> different
> @@ -6321,6 +6340,7 @@ fs_visitor::optimize()
> if (OPT(lower_load_payload)) {
> split_virtual_grfs();
> OPT(register_coalesce);
> + OPT(lower_simd_width);
> OPT(compute_to_mrf);
> OPT(dead_code_eliminate);
> }
> --
> 2.18.0
>
> _______________________________________________
> mesa-dev mailing list
> mesa-dev at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.freedesktop.org/archives/mesa-dev/attachments/20180802/5c2bbfdb/attachment.html>
More information about the mesa-dev
mailing list