[Mesa-dev] [PATCH 1/2] i965/fs: Fix single-precision to double-precision conversions for CHV/BSW
Kenneth Graunke
kenneth at whitecape.org
Fri Jun 17 06:41:42 UTC 2016
On Wednesday, June 15, 2016 9:25:44 AM PDT Samuel Iglesias Gonsálvez wrote:
> From: Iago Toral Quiroga <itoral at igalia.com>
>
> From the Cherryview PRM, Volume 7, 3D Media GPGPU Engine,
> Register Region Restrictions:
>
> "When source or destination is 64b (...), regioning in Align1
> must follow these rules:
>
> 1. Source and destination horizontal stride must be aligned to
> the same qword.
> (...)"
>
> Cc: "12.0" <mesa-stable at lists.freedesktop.org>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95462
> ---
> src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 27 +++++++++++++++++++++++++++
> 1 file changed, 27 insertions(+)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> index b811953..c271e64 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> @@ -715,10 +715,37 @@ fs_visitor::nir_emit_alu(const fs_builder &bld, nir_alu_instr *instr)
> case nir_op_u2f:
> if (optimize_extract_to_float(instr, result))
> return;
> + inst = bld.MOV(result, op[0]);
> + inst->saturate = instr->dest.saturate;
> + break;
>
> case nir_op_f2d:
> case nir_op_i2d:
> case nir_op_u2d:
> + /* CHV PRM, vol07, 3D Media GPGPU Engine, Register Region Restrictions:
> + *
> + * "When source or destination is 64b (...), regioning in Align1
> + * must follow these rules:
> + *
> + * 1. Source and destination horizontal stride must be aligned to
> + * the same qword.
> + * (...)"
> + *
> + * This means that 32-bit to 64-bit conversions need to have the 32-bit
> + * data elements aligned to 64-bit. This restriction does not apply to
> + * BDW and later.
> + */
> + if (devinfo->is_cherryview) {
> + fs_reg tmp = bld.vgrf(result.type, 1);
> + tmp = subscript(tmp, op[0].type, 0);
> + inst = bld.MOV(tmp, op[0]);
> + inst->regs_written =
> + inst->dst.component_size(bld.dispatch_width()) / REG_SIZE;
As we discussed on #intel-gfx...this line isn't necessary.
fs_inst::init() initializes regs_written to:
DIV_ROUND_UP(dst.component_size(exec_size), REG_SIZE);
where exec_size is initialized to dispatch_width in this case.
So the default calculation for component_size() works out to:
MAX2(channels * 2 [stride], 1) * type_sz(D) = channels * 2 * 4 = 64
while your new one is:
MAX2(channels * 1 [stride], 1) * type_sz(DF) = channels * 1 * 8 = 64
So they're equivalent.
With that line removed, this patch is:
Reviewed-by: Kenneth Graunke <kenneth at whitecape.org>
> + inst = bld.MOV(result, tmp);
> + inst->saturate = instr->dest.saturate;
> + break;
> + }
> + /* fallthrough */
> case nir_op_d2f:
> case nir_op_d2i:
> case nir_op_d2u:
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 819 bytes
Desc: This is a digitally signed message part.
URL: <https://lists.freedesktop.org/archives/mesa-dev/attachments/20160616/97009781/attachment.sig>
More information about the mesa-dev
mailing list