[Mesa-dev] [PATCH 01/12] i965/fs: Unpack count argument to 64-bit shift ops on Atom
Andres Gomez
agomez at igalia.com
Tue Nov 28 13:54:59 UTC 2017
Matt, this looks like a good candidate to nominate for inclusion in
the 17.2 stable queue.
What do you think?
On Thu, 2017-09-28 at 23:05 -0700, Matt Turner wrote:
> 64-bit operations on Atom parts have additional restrictions over their
> big-core counterparts (validated by later patches).
>
> Specifically, the restriction that "Source and Destination horizontal
> stride must be aligned to the same qword" is violated by most shift
> operations since NIR uses a 32-bit value as the shift count argument,
> and this causes instructions like
>
> shl(8) g19<1>Q g5<4,4,1>Q g23<4,4,1>UD
>
> where src1 has a 32-bit stride, but the dest and src0 have a 64-bit
> stride.
>
> This caused ~4 pixels in the ARB_shader_ballot piglit test
> fs-readInvocation-uint.shader_test to be incorrect. Unfortunately no
> ARB_gpu_shader_int64 test hit this case because they operate on
> uniforms, and their scalar regions are an exception to the restriction.
>
> We work around this by effectively unpacking the shift count, so that we
> can read it with a 64-bit stride in the shift instruction. Unfortunately
> the unpack (a MOV with a dst stride of 2) is a partial write, and cannot
> be copy-propagated or CSE'd.
> ---
> src/intel/compiler/brw_fs_nir.cpp | 34 ++++++++++++++++++++++++++++------
> 1 file changed, 28 insertions(+), 6 deletions(-)
>
> diff --git a/src/intel/compiler/brw_fs_nir.cpp b/src/intel/compiler/brw_fs_nir.cpp
> index d760946e62..8ea4297ceb 100644
> --- a/src/intel/compiler/brw_fs_nir.cpp
> +++ b/src/intel/compiler/brw_fs_nir.cpp
> @@ -1267,14 +1267,36 @@ fs_visitor::nir_emit_alu(const fs_builder &bld, nir_alu_instr *instr)
> unreachable("not reached: should have been lowered");
>
> case nir_op_ishl:
> - bld.SHL(result, op[0], op[1]);
> - break;
> case nir_op_ishr:
> - bld.ASR(result, op[0], op[1]);
> - break;
> - case nir_op_ushr:
> - bld.SHR(result, op[0], op[1]);
> + case nir_op_ushr: {
> + fs_reg shift_count = op[1];
> +
> + if (devinfo->is_cherryview || gen_device_info_is_9lp(devinfo)) {
> + if (op[1].file == VGRF &&
> + (result.type == BRW_REGISTER_TYPE_Q ||
> + result.type == BRW_REGISTER_TYPE_UQ)) {
> + shift_count = fs_reg(VGRF, alloc.allocate(dispatch_width / 4),
> + BRW_REGISTER_TYPE_UD);
> + shift_count.stride = 2;
> + bld.MOV(shift_count, op[1]);
> + }
> + }
> +
> + switch (instr->op) {
> + case nir_op_ishl:
> + bld.SHL(result, op[0], shift_count);
> + break;
> + case nir_op_ishr:
> + bld.ASR(result, op[0], shift_count);
> + break;
> + case nir_op_ushr:
> + bld.SHR(result, op[0], shift_count);
> + break;
> + default:
> + unreachable("not reached");
> + }
> break;
> + }
>
> case nir_op_pack_half_2x16_split:
> bld.emit(FS_OPCODE_PACK_HALF_2x16_SPLIT, result, op[0], op[1]);
--
Br,
Andres
More information about the mesa-dev
mailing list