[Mesa-dev] [PATCH] i965/vec4: Avoid reswizzling MACH instructions in opt_register_coalesce().

Francisco Jerez currojerez at riseup.net
Sat Apr 22 17:04:54 UTC 2017


Kenneth Graunke <kenneth at whitecape.org> writes:

> opt_register_coalesce() was optimizing sequences such as:
>
>    mul(8) acc0:D, attr18.xyyy:D, attr19.xyyy:D
>    mach(8) vgrf5.xy:D, attr18.xyyy:D, attr19.xyyy:D
>    mov(8) m4.zw:F, vgrf5.xxxy:F
>
> into:
>
>    mul(8) acc0:D, attr18.xyyy:D, attr19.xyyy:D
>    mach(8) m4.zw:D, attr18.xxxy:D, attr19.xxxy:D
>
> This doesn't work - if we're going to reswizzle MACH, we'd need to
> reswizzle the MUL as well.  Here, the MUL fills the accumulator's .zw
> components with attr18.yy * attr19.yy.  But the MACH instruction expects
> .z to contain attr18.x * attr19.x.  Bogus results ensue.
>
> No change in shader-db on Haswell.  Prevents regressions in Timothy's
> patches to use enhanced layouts for varying packing (which rearrange
> code just enough to trigger this pre-existing bug, but were fine
> themselves).
>
> Cc: Timothy Arceri <tarceri at itsqueeze.com>

It might be a good idea to CC mesa-stable.

> ---
>  src/intel/compiler/brw_vec4.cpp | 7 +++++++
>  1 file changed, 7 insertions(+)
>
> diff --git a/src/intel/compiler/brw_vec4.cpp b/src/intel/compiler/brw_vec4.cpp
> index 0b92ba704e5..4bb774bf10e 100644
> --- a/src/intel/compiler/brw_vec4.cpp
> +++ b/src/intel/compiler/brw_vec4.cpp
> @@ -1071,6 +1071,13 @@ vec4_instruction::can_reswizzle(const struct gen_device_info *devinfo,
>     if (devinfo->gen == 6 && is_math() && swizzle != BRW_SWIZZLE_XYZW)
>        return false;
>  
> +   /* Don't touch MACH - it uses the accumulator results from an earlier
> +    * MUL - so we'd need to reswizzle both.  We don't do that, so just
> +    * avoid it entirely.
> +    */
> +   if (opcode == BRW_OPCODE_MACH)
> +      return false;
> +

I think the ultimate problem here is that MACH (among other
instructions) has an implicit accumulator source, which has no swizzle
controls, so it won't be able to represent the composition of swizzles
(for non-identity swizzles that is).  I think this should check for
reads_accumulator_implicitly() instead in order to avoid the exact same
problem with other instructions with implicit accumulator access.

>     if (!can_do_writemask(devinfo) && dst_writemask != WRITEMASK_XYZW)
>        return false;
>  
> -- 
> 2.12.2
>
> _______________________________________________
> mesa-dev mailing list
> mesa-dev at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 212 bytes
Desc: not available
URL: <https://lists.freedesktop.org/archives/mesa-dev/attachments/20170422/907e07ab/attachment-0001.sig>


More information about the mesa-dev mailing list