[Mesa-dev] [PATCH 2/2] intel: Fix SIMD16 unaligned payload GRF reads on Gen4-5.

Fri Aug 3 02:57:27 UTC 2018

On Thu, Aug 2, 2018 at 6:52 PM Kenneth Graunke <kenneth at whitecape.org>
wrote:

> When the SIMD16 Gen4-5 fragment shader payload contains source depth
> (g2-3), destination stencil (g4), and destination depth (g5-6), the
> single register of stencil makes the destination depth unaligned.
>
> We were generating this instruction in the RT write payload setup:
>
>    mov(16)   m14<1>F   g5<8,8,1>F   { align1 compr };
>
> which is illegal, instructions with a source region spanning more than
> one register need to be aligned to even registers.  This is because the
> hardware implicitly does (nr | 1) instead of (nr + 1) when splitting the
> compressed instruction into two mov(8)'s.
>

This seems like something worth validating.  Also, gen4 so maybe meh?

One nit below.  With that fixed and with or without validation, both are

Reviewed-by: Jason Ekstrand <jason at jlekstrand.net>

> I believe this would cause the hardware to load g5 twice, replicating
> subspan 0-1's destination depth to subspan 2-3.  This showed up as 2x2
> artifact blocks in both TIS-100 and Reicast.
>
> Normally, we rely on the register allocator to even-align our virtual
> GRFs.  But we don't control the payload, so we need to lower SIMD widths
> to make it work.  To fix this, we teach lower_simd_width about the
> restriction, and then call it again after lower_load_payload (which is
> what generates the offending MOV).
>
> Fixes: 8aee87fe4cce0a883867df3546db0e0a36908086 (i965: Use SIMD16 instead
> of SIMD8 on Gen4 when possible.)
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107212
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=13728
> Tested-by: Diego Viola <diego.viola at gmail.com>
> ---
>  src/intel/compiler/brw_fs.cpp | 20 ++++++++++++++++++++
>  1 file changed, 20 insertions(+)
>
> diff --git a/src/intel/compiler/brw_fs.cpp b/src/intel/compiler/brw_fs.cpp
> index 20b89035e1f..78dafd9b706 100644
> --- a/src/intel/compiler/brw_fs.cpp
> +++ b/src/intel/compiler/brw_fs.cpp
> @@ -5115,6 +5115,25 @@ get_fpu_lowered_simd_width(const struct
> gen_device_info *devinfo,
>        }
>     }
>
> +   if (devinfo->gen < 6) {
> +      /* From the G45 PRM, Volume 4 Page 361:
> +       *
> +       *    "Operand Alignment Rule: With the exceptions listed below, a
> +       *     source/destination operand in general should be aligned to
> even
> +       *     256-bit physical register with a region size equal to two
> 256-bit
> +       *     physical registers."
> +       *
> +       * Normally we enforce this by allocating virtual registers to the
> +       * even-aligned class.  But we need to handle payload registers.
> +       */
> +      for (unsigned i = 0; i < inst->sources; i++) {
> +         if (inst->src[i].file == FIXED_GRF && (inst->src[i].nr & 1) &&
> +             inst->size_read(i) >= 2 * REG_SIZE) {
>

How about just inst->size_read(i) > REG_SIZE so that it's "more than one"
rather than "at least two"?

> +            max_width = MIN2(max_width, 8);
> +         }
> +      }
> +   }
> +
>     /* From the IVB PRMs:
>      *  "When an instruction is SIMD32, the low 16 bits of the execution
> mask
>      *   are applied for both halves of the SIMD32 instruction. If
> different
> @@ -6321,6 +6340,7 @@ fs_visitor::optimize()
>     if (OPT(lower_load_payload)) {
>        split_virtual_grfs();
>        OPT(register_coalesce);
> +      OPT(lower_simd_width);
>        OPT(compute_to_mrf);
>        OPT(dead_code_eliminate);
>     }
> --
> 2.18.0
>
> _______________________________________________
> mesa-dev mailing list
> mesa-dev at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.freedesktop.org/archives/mesa-dev/attachments/20180802/5c2bbfdb/attachment.html>