[Mesa-dev] [PATCH 12/17] intel/compiler/fs: Implement ddy without using align16 for Gen11+
Kenneth Graunke
kenneth at whitecape.org
Sat Feb 24 00:42:26 UTC 2018
On Tuesday, February 20, 2018 9:15:19 PM PST Matt Turner wrote:
> Align16 is no more. We previously generated an align16 ADD instruction
> to calculate DDY:
>
> add(8) g11<1>F -g10<4>.xyxyF g10<4>.zwzwF { align16 1Q };
>
> Without align16, we now implement it as two align1 instructions:
>
> add(4) g11<2>F -g10<4,2,0>F g10.2<4,2,0>F { align1 1N };
> add(4) g11.1<2>F -g10.1<4,2,0>F g10.3<4,2,0>F { align1 1N };
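For anyone following along, here is a rough model of why the two align1 ADDs cover the same channels as the single align16 one. It is a standalone sketch, not Mesa code: Region, region_index, add4_neg, ddy_align1 and ddy_align16 are made-up names, offsets are counted in floats rather than bytes, and it assumes the usual align1 region rule that channel ch reads element subnr + (ch / width) * vstride + (ch % width) * hstride.

```cpp
#include <array>
#include <cstddef>

// Hypothetical model of an align1 source region <vstride,width,hstride>.
// subnr is an element offset in floats here (the real field is in bytes).
struct Region {
    std::size_t subnr, vstride, width, hstride;
};

// Element index read by execution channel ch under the assumed region rule.
std::size_t region_index(const Region &r, std::size_t ch) {
    return r.subnr + (ch / r.width) * r.vstride + (ch % r.width) * r.hstride;
}

// One SIMD8 register of floats: two 2x2 subspans, four floats each.
using Reg = std::array<float, 8>;

// Models "add(4) dst.<dst_subnr><dst_stride> -a b": four channels,
// destination written with a horizontal stride of dst_stride elements.
void add4_neg(Reg &dst, std::size_t dst_subnr, std::size_t dst_stride,
              const Reg &src, Region a, Region b) {
    for (std::size_t ch = 0; ch < 4; ++ch) {
        dst[dst_subnr + ch * dst_stride] =
            -src[region_index(a, ch)] + src[region_index(b, ch)];
    }
}

// The two align1 instructions from the commit message.
Reg ddy_align1(const Reg &src) {
    Reg dst{};
    add4_neg(dst, 0, 2, src, {0, 4, 2, 0}, {2, 4, 2, 0}); // even channels: z - x
    add4_neg(dst, 1, 2, src, {1, 4, 2, 0}, {3, 4, 2, 0}); // odd channels:  w - y
    return dst;
}

// Reference model of the align16 form: -src.xyxy + src.zwzw per group of four.
Reg ddy_align16(const Reg &src) {
    Reg dst{};
    for (std::size_t i = 0; i < 8; ++i) {
        std::size_t base = i & ~std::size_t{3}; // start of this vec4 group
        std::size_t lane = i & 1;               // 0 selects x/z, 1 selects y/w
        dst[i] = -src[base + lane] + src[base + 2 + lane];
    }
    return dst;
}
```

In this model the first ADD fills destination elements 0, 2, 4, 6 and the second fills 1, 3, 5, 7, so together they write all eight channels.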
> ---
> src/intel/compiler/brw_fs_generator.cpp | 70 ++++++++++++++++++++++++++-------
> 1 file changed, 56 insertions(+), 14 deletions(-)
>
> diff --git a/src/intel/compiler/brw_fs_generator.cpp b/src/intel/compiler/brw_fs_generator.cpp
> index 013d2c820a0..ffc46972420 100644
> --- a/src/intel/compiler/brw_fs_generator.cpp
> +++ b/src/intel/compiler/brw_fs_generator.cpp
> @@ -1192,23 +1192,65 @@ fs_generator::generate_ddy(const fs_inst *inst,
> {
> if (inst->opcode == FS_OPCODE_DDY_FINE) {
> /* produce accurate derivatives */
> - struct brw_reg src0 = src;
> - struct brw_reg src1 = src;
> + if (devinfo->gen >= 11) {
> + struct brw_reg x = src;
> + struct brw_reg y = src;
> + struct brw_reg z = src;
> + struct brw_reg w = src;
> + struct brw_reg dst_e = dst;
> + struct brw_reg dst_o = dst;
Maybe call these dst_even and dst_odd?
Or perhaps dst_e = dst /* even channels */?
> +
> + x.vstride = BRW_VERTICAL_STRIDE_4;
> + y.vstride = BRW_VERTICAL_STRIDE_4;
> + z.vstride = BRW_VERTICAL_STRIDE_4;
> + w.vstride = BRW_VERTICAL_STRIDE_4;
> +
> + x.width = BRW_WIDTH_2;
> + y.width = BRW_WIDTH_2;
> + z.width = BRW_WIDTH_2;
> + w.width = BRW_WIDTH_2;
> +
> + x.hstride = BRW_HORIZONTAL_STRIDE_0;
> + y.hstride = BRW_HORIZONTAL_STRIDE_0;
> + z.hstride = BRW_HORIZONTAL_STRIDE_0;
> + w.hstride = BRW_HORIZONTAL_STRIDE_0;
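If you like, you could drop some wordiness by replacing the vstride,
width, and hstride assignments above with stride() (the subnr
assignments below would stay as they are):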
struct brw_reg x = stride(src, 4, 2, 0);
struct brw_reg y = stride(src, 4, 2, 0);
struct brw_reg z = stride(src, 4, 2, 0);
struct brw_reg w = stride(src, 4, 2, 0);
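To spell out the shape I have in mind, here is a self-contained sketch. RegModel, stride_model, DdySources and make_ddy_sources are hypothetical stand-ins for illustration only; the model stores plain element counts, whereas the real struct brw_reg stores hardware encodings, so treat this as a picture of the code structure and not of the actual helper.

```cpp
#include <cstddef>

// Hypothetical stand-in for struct brw_reg: only the fields this patch
// touches, holding plain counts instead of hardware encodings.
struct RegModel {
    unsigned vstride = 8;
    unsigned width = 8;
    unsigned hstride = 1;
    std::size_t subnr = 0; // byte offset within the register
};

// Hypothetical model of a stride()-style helper: returns a copy of reg
// with its region set to <vstride,width,hstride>.
RegModel stride_model(RegModel reg, unsigned vstride, unsigned width,
                      unsigned hstride) {
    reg.vstride = vstride;
    reg.width = width;
    reg.hstride = hstride;
    return reg;
}

struct DdySources {
    RegModel x, y, z, w;
};

// The twelve field assignments collapse into four helper calls; only the
// per-component sub-register offsets remain to be set individually.
DdySources make_ddy_sources(const RegModel &src) {
    DdySources s;
    s.x = stride_model(src, 4, 2, 0);
    s.y = stride_model(src, 4, 2, 0);
    s.z = stride_model(src, 4, 2, 0);
    s.w = stride_model(src, 4, 2, 0);

    s.x.subnr = 0 * sizeof(float);
    s.y.subnr = 1 * sizeof(float);
    s.z.subnr = 2 * sizeof(float);
    s.w.subnr = 3 * sizeof(float);
    return s;
}
```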
> +
> + x.subnr = 0 * sizeof(float);
> + y.subnr = 1 * sizeof(float);
> + z.subnr = 2 * sizeof(float);
> + w.subnr = 3 * sizeof(float);
> +
With or without any suggestions, this patch is:
Reviewed-by: Kenneth Graunke <kenneth at whitecape.org>