[Mesa-dev] [PATCH 36/95] i965/vec4: add a helper function to create double immediates
Francisco Jerez
currojerez at riseup.net
Wed Aug 3 20:28:07 UTC 2016
Iago Toral Quiroga <itoral at igalia.com> writes:
> Gen7 hardware does not support double immediates so these need
> to be moved in 32-bit chunks to a regular vgrf instead. Instead
> of doing this every time we need to create a DF immediate,
> create a helper function that does the right thing depending
> on the hardware generation.
>
> Signed-off-by: Samuel Iglesias Gonsálvez <siglesias at igalia.com>
> ---
> src/mesa/drivers/dri/i965/brw_vec4.h | 2 ++
> src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 40 ++++++++++++++++++++++++++++++
> 2 files changed, 42 insertions(+)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h b/src/mesa/drivers/dri/i965/brw_vec4.h
> index 4650ae0..cf7cdab 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4.h
> +++ b/src/mesa/drivers/dri/i965/brw_vec4.h
> @@ -320,6 +320,8 @@ public:
> void emit_single_to_double(dst_reg dst, src_reg src, bool saturate,
> brw_reg_type single_type);
>
> + src_reg setup_imm_df(double v);
> +
> virtual void emit_nir_code();
> virtual void nir_setup_uniforms();
> virtual void nir_setup_system_value_intrinsic(nir_intrinsic_instr *instr);
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> index d7c6bf4..8a45fde 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> @@ -1063,6 +1063,46 @@ vec4_visitor::emit_single_to_double(dst_reg dst, src_reg src, bool saturate,
> inst->saturate = saturate;
> }
>
> +src_reg
> +vec4_visitor::setup_imm_df(double v)
> +{
> + assert(devinfo->gen >= 7);
> +
> + if (devinfo->gen >= 8)
> + return brw_imm_df(v);
> +
> + /* gen7 does not support DF immediates */
> + union {
> + double d;
> + struct {
> + uint32_t i1;
> + uint32_t i2;
> + };
> + } di;
> +
> + di.d = v;
> +
> + /* Write the low 32-bit of the constant to the X:UD channel and the
> + * high 32-bit to the Y:UD channel to build the constant in a VGRF.
> + * We have to do this twice (offset 0 and offset 1), since a DF VGRF takes
> + * two SIMD8 registers in SIMD4x2 execution. Finally, return a swizzle
> + * XXXX so any access to the VGRF only reads the constant data in these
> + * channels.
> + */
> + dst_reg tmp = dst_reg(VGRF, alloc.allocate(2));
> + tmp.type = BRW_REGISTER_TYPE_UD;
> + for (int n = 0; n < 2; n++) {
> + tmp.writemask = WRITEMASK_X;
If you used the writemask helper instead the tmp register could be
declared const.
> + emit(MOV(offset(tmp, n), brw_imm_ud(di.i1)))->force_writemask_all = true;
> + tmp.writemask = WRITEMASK_Y;
> + emit(MOV(offset(tmp, n), brw_imm_ud(di.i2)))->force_writemask_all = true;
Because you're adding support for arbitrary exec sizes to the VEC4
back-end, wouldn't it make sense to use SIMD16 instructions here instead
of the loop? (At least on HSW, on IVB it would end up being lowered
into four uncompressed instructions anyway)
> + }
> +
> + src_reg tmp_as_src = src_reg(retype(tmp, BRW_REGISTER_TYPE_DF));
> + tmp_as_src.swizzle = BRW_SWIZZLE_XXXX;
> + return tmp_as_src;
You could use the swizzle() helper instead of the last three lines:
| return swizzle(retype(tmp, BRW_REGISTER_TYPE_DF), BRW_SWIZZLE_XXX));
> +}
> +
> void
> vec4_visitor::nir_emit_alu(nir_alu_instr *instr)
> {
> --
> 2.7.4
>
> _______________________________________________
> mesa-dev mailing list
> mesa-dev at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 212 bytes
Desc: not available
URL: <https://lists.freedesktop.org/archives/mesa-dev/attachments/20160803/c1345498/attachment.sig>
More information about the mesa-dev
mailing list