[Mesa-dev] [PATCH v3 13/24] i965/vec4: split DF instructions and later double its execsize in IVB/BYT
Francisco Jerez
currojerez at riseup.net
Fri Mar 3 23:32:39 UTC 2017
Samuel Iglesias Gonsálvez <siglesias at igalia.com> writes:
> We need to split DF instructions in two on IVB/BYT as it needs an
> execsize 8 to process 4 DF values (one GRF in total).
>
> v2:
> - Rename helper and make it static inline function (Matt).
> - Fix indention and add braces (Matt).
>
> Signed-off-by: Samuel Iglesias Gonsálvez <siglesias at igalia.com>
> ---
> src/mesa/drivers/dri/i965/brw_ir_vec4.h | 14 ++++++++++++++
> src/mesa/drivers/dri/i965/brw_vec4.cpp | 7 ++++++-
> src/mesa/drivers/dri/i965/brw_vec4_generator.cpp | 15 +++++++++++++--
> 3 files changed, 33 insertions(+), 3 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_ir_vec4.h b/src/mesa/drivers/dri/i965/brw_ir_vec4.h
> index 57fc6be8f89..9d29c3fb944 100644
> --- a/src/mesa/drivers/dri/i965/brw_ir_vec4.h
> +++ b/src/mesa/drivers/dri/i965/brw_ir_vec4.h
> @@ -405,6 +405,20 @@ regs_read(const vec4_instruction *inst, unsigned i)
> reg_size);
> }
>
> +static inline unsigned
> +get_exec_type_size(const vec4_instruction *inst)
> +{
> + unsigned exec_type_size = 0;
> +
> + for (int i = 0; i < 3; i++) {
> + if (inst->src[i].type != BAD_FILE) {
> + exec_type_size = MAX2(exec_type_size, type_sz(inst->src[i].type));
> + }
> + }
> +
> + return exec_type_size;
> +}
> +
> } /* namespace brw */
>
> #endif
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp b/src/mesa/drivers/dri/i965/brw_vec4.cpp
> index 5e60eb657a7..7080c93e550 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
> @@ -2092,6 +2092,10 @@ get_lowered_simd_width(const struct gen_device_info *devinfo,
> if (inst->opcode == BRW_OPCODE_SEL && type_sz(inst->dst.type) == 8)
> lowered_width = MIN2(lowered_width, 4);
>
Maybe add a short comment here explaining why you need to do this?
> + if (devinfo->gen == 7 && !devinfo->is_haswell &&
> + (get_exec_type_size(inst) == 8 || type_sz(inst->dst.type) == 8))
> + lowered_width = MIN2(lowered_width, 4);
> +
> /* HSW PRM, 3D Media GPGPU Engine, Region Alignment Rules for Direct
> * Register Addressing:
> *
> @@ -2199,7 +2203,8 @@ vec4_visitor::lower_simd_width()
> inst->insert_before(block, copy);
> }
> } else {
> - dst = horiz_offset(inst->dst, channel_offset);
> + if (inst->dst.file != ARF)
> + dst = horiz_offset(inst->dst, channel_offset);
This doesn't look right, you need to give the same treatment to ARF
registers as to other registers. If what you're trying to avoid here is
shifting the null register incorrectly, I suggest you fix horiz_offset()
to return the argument unchanged if it's the null register, because the
null register logically behaves like a scalar register (this is also
consistent with the way the FS back-end handles the same situation).
> }
> linst->dst = dst;
>
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
> index 847a01bd43c..7bb1ab1879c 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
> @@ -1522,14 +1522,25 @@ generate_code(struct brw_codegen *p,
> brw_set_default_saturate(p, inst->saturate);
> brw_set_default_mask_control(p, inst->force_writemask_all);
> brw_set_default_acc_write_control(p, inst->writes_accumulator);
> - brw_set_default_exec_size(p, cvt(inst->exec_size) - 1);
>
> - assert(inst->group % inst->exec_size == 0);
> + bool is_ivb_df = devinfo->gen == 7 &&
> + !devinfo->is_haswell &&
> + (get_exec_type_size(inst) == 8 ||
> + inst->dst.type == BRW_REGISTER_TYPE_DF);
> +
> + assert(inst->group % inst->exec_size == 0 ||
> + is_ivb_df);
> +
> assert(inst->group % 8 == 0 ||
> inst->dst.type == BRW_REGISTER_TYPE_DF ||
> inst->src[0].type == BRW_REGISTER_TYPE_DF ||
> inst->src[1].type == BRW_REGISTER_TYPE_DF ||
> inst->src[2].type == BRW_REGISTER_TYPE_DF);
> +
> + if (is_ivb_df && inst->exec_size < 8)
> + inst->exec_size *= 2;
> + brw_set_default_exec_size(p, cvt(inst->exec_size) - 1);
> +
Same comment here as for its FS counterpart... Please let's not modify
the IR from the generator.
> if (!inst->force_writemask_all)
> brw_set_default_group(p, inst->group);
>
> --
> 2.11.0
>
> _______________________________________________
> mesa-dev mailing list
> mesa-dev at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 212 bytes
Desc: not available
URL: <https://lists.freedesktop.org/archives/mesa-dev/attachments/20170303/902b5f39/attachment.sig>
More information about the mesa-dev
mailing list