[Mesa-dev] [PATCH] i965/vec4: Avoid reswizzling MACH instructions in opt_register_coalesce().
Timothy Arceri
tarceri at itsqueeze.com
Sat Apr 22 00:28:49 UTC 2017
I don't entirely understand how this works, but it seems reasonable to me.
Acked-by: Timothy Arceri <tarceri at itsqueeze.com>
Thanks for fixing this :)
On 21/04/17 19:13, Kenneth Graunke wrote:
> opt_register_coalesce() was optimizing sequences such as:
>
> mul(8) acc0:D, attr18.xyyy:D, attr19.xyyy:D
> mach(8) vgrf5.xy:D, attr18.xyyy:D, attr19.xyyy:D
> mov(8) m4.zw:F, vgrf5.xxxy:F
>
> into:
>
> mul(8) acc0:D, attr18.xyyy:D, attr19.xyyy:D
> mach(8) m4.zw:D, attr18.xxxy:D, attr19.xxxy:D
>
> This doesn't work - if we're going to reswizzle MACH, we'd need to
> reswizzle the MUL as well. Here, the MUL fills the accumulator's .zw
> components with attr18.yy * attr19.yy. But the MACH instruction expects
> .z to contain attr18.x * attr19.x. Bogus results ensue.
>
> No change in shader-db on Haswell. Prevents regressions in Timothy's
> patches to use enhanced layouts for varying packing (which rearrange
> code just enough to trigger this pre-existing bug, but were fine
> themselves).
>
> Cc: Timothy Arceri <tarceri at itsqueeze.com>
> ---
> src/intel/compiler/brw_vec4.cpp | 7 +++++++
> 1 file changed, 7 insertions(+)
>
> diff --git a/src/intel/compiler/brw_vec4.cpp b/src/intel/compiler/brw_vec4.cpp
> index 0b92ba704e5..4bb774bf10e 100644
> --- a/src/intel/compiler/brw_vec4.cpp
> +++ b/src/intel/compiler/brw_vec4.cpp
> @@ -1071,6 +1071,13 @@ vec4_instruction::can_reswizzle(const struct gen_device_info *devinfo,
>     if (devinfo->gen == 6 && is_math() && swizzle != BRW_SWIZZLE_XYZW)
>        return false;
>
> +   /* Don't touch MACH - it uses the accumulator results from an earlier
> +    * MUL - so we'd need to reswizzle both. We don't do that, so just
> +    * avoid it entirely.
> +    */
> +   if (opcode == BRW_OPCODE_MACH)
> +      return false;
> +
>     if (!can_do_writemask(devinfo) && dst_writemask != WRITEMASK_XYZW)
>        return false;
>
>
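For anyone else puzzling over why the coalesced form is wrong, the channel mismatch described in the commit message can be sketched as a toy model. This is not Mesa code: the Swizzle type, the mask constants, and mach_matches_mul() are hypothetical names made up for illustration, and the model only tracks which source component feeds each channel. It does not emulate the actual MUL/MACH arithmetic.

```cpp
#include <array>

// Toy model, not Mesa's BRW_SWIZZLE4 encoding: a swizzle lists, for each
// destination channel (x, y, z, w), which source component it reads
// (x = 0, y = 1, z = 2, w = 3).
using Swizzle = std::array<int, 4>;

constexpr Swizzle XYYY = {0, 1, 1, 1};
constexpr Swizzle XXXY = {0, 0, 0, 1};

// Writemasks as bitfields, bit c set means channel c is written.
constexpr unsigned MASK_XY = 0x3;
constexpr unsigned MASK_ZW = 0xc;

// Hypothetical helper: MUL fills accumulator channel c from source
// component mul_swz[c]. MACH consumes accumulator channel c together with
// source component mach_swz[c]. The pair is only consistent if, for every
// channel MACH actually writes, both instructions read the same component.
bool mach_matches_mul(const Swizzle &mul_swz, const Swizzle &mach_swz,
                      unsigned mach_writemask)
{
    for (int c = 0; c < 4; c++) {
        if (!(mach_writemask & (1u << c)))
            continue; // channel not written by MACH, so it cannot go wrong
        if (mul_swz[c] != mach_swz[c])
            return false; // accumulator holds a product of other components
    }
    return true;
}
```

With the original sequence, mach_matches_mul(XYYY, XYYY, MASK_XY) holds. After coalescing, MACH writes .zw with swizzle .xxxy while the accumulator was filled with .xyyy, so channel z pairs attr18.x with an accumulator value computed from attr18.y and the check fails. Reswizzling the MUL to .xxxy as well would make it pass again, which is the "reswizzle both" option the patch chooses not to implement.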