[Mesa-dev] [PATCH v2 16/30] i965/fs: Add do_untyped_vector_read helper

Francisco Jerez currojerez at riseup.net
Sat May 14 00:12:11 UTC 2016


Samuel Iglesias Gonsálvez <siglesias at igalia.com> writes:

> From: Iago Toral Quiroga <itoral at igalia.com>
>
> We are going to need the same logic for anything that reads
> doubles via untyped messages (CS shared variables and SSBOs). Add a
> helper function with that logic so that we can reuse it.
>
> Reviewed-by: Kenneth Graunke <kenneth at whitecape.org>
> ---
>  src/mesa/drivers/dri/i965/brw_fs.h       |  6 +++
>  src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 64 ++++++++++++++++++++++++++++++++
>  2 files changed, 70 insertions(+)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs.h b/src/mesa/drivers/dri/i965/brw_fs.h
> index 1970ad0..1aeacae 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs.h
> +++ b/src/mesa/drivers/dri/i965/brw_fs.h
> @@ -110,6 +110,12 @@ public:
>                                                  const fs_reg src,
>                                                  uint32_t components);
>  
> +   void do_untyped_vector_read(const brw::fs_builder &bld,
> +                               const fs_reg surf_index,
> +                               const fs_reg offset_reg,
> +                               const fs_reg dest,
> +                               unsigned num_components);
> +
>     bool run_fs(bool do_rep_send);
>     bool run_vs(gl_clip_plane *clip_planes);
>     bool run_tcs_single_patch();
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> index 02f1e81..4af5979 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> @@ -3106,6 +3106,70 @@ fs_visitor::nir_emit_cs_intrinsic(const fs_builder &bld,
>  }
>  
>  void
> +fs_visitor::do_untyped_vector_read(const fs_builder &bld,
> +                                   const fs_reg surf_index,
> +                                   const fs_reg offset_reg,
> +                                   const fs_reg dest,

Usually we put the destination register before any source registers in
function argument lists.

> +                                   unsigned num_components)
> +{
> +   if (type_sz(dest.type) <= 4) {

The code below isn't going to work for type_sz() < 4, maybe make the
condition 'type_sz(..) == 4'?

> +      fs_reg read_result = emit_untyped_read(bld, surf_index, offset_reg,
> +                                             1 /* dims */,
> +                                             num_components,
> +                                             BRW_PREDICATE_NONE);
> +      read_result.type = dest.type;
> +      for (unsigned i = 0; i < num_components; i++)
> +         bld.MOV(offset(dest, bld, i), offset(read_result, bld, i));
> +   } else {

This can only work for type_sz() == 8, how about you make this an 'else
if' statement and put an else statement at the end with a single
unreachable() call in it?

> +      /* Reading a dvec, so we need to:
> +       *
> +       * 1. Multiply num_components by 2, to account for the fact that we
> +       *    need to read 64-bit components.
> +       * 2. Shuffle the result of the load to form valid 64-bit elements
> +       * 3. Emit a second load (for components z/w) if needed.
> +       */
> +      fs_reg read_offset = vgrf(glsl_type::uint_type);
> +      bld.MOV(read_offset, offset_reg);
> +
> +      int iters = num_components <= 2 ? 1 : 2;
> +
> +      /* Load the dvec, the first iteration loads components x/y, the second
> +       * iteration, if needed, loads components z/w
> +       */
> +      for (int it = 0; it < iters; it++) {
> +         /* Compute number of components to read in this iteration */
> +         int iter_components = MIN2(2, num_components);
> +         num_components -= iter_components;
> +
> +         /* Read. Since this message reads 32-bit components, we need to
> +          * read twice as many components.
> +          */
> +         fs_reg read_result = emit_untyped_read(bld, surf_index, read_offset,
> +                                                1 /* dims */,
> +                                                iter_components * 2,
> +                                                BRW_PREDICATE_NONE);
> +

You'd save some retypes in the code below (actually all of them -- and
also avoid inadvertent type conversion if the destination type is an
integer) if you allocated a separate virtual register here for the
packed 64bit result, like:

|    const fs_reg packed_result = bld.vgrf(dest.type, iter_components);

and then use it as destination for the shuffle function.  With the above
taken into account:

Reviewed-by: Francisco Jerez <currojerez at riseup.net>

> +         /* Shuffle the 32-bit load result into valid 64-bit data */
> +         shuffle_32bit_load_result_to_64bit_data(
> +            bld,
> +            retype(read_result, BRW_REGISTER_TYPE_DF),
> +            retype(read_result, BRW_REGISTER_TYPE_F),
> +            iter_components);
> +
> +         /* Move each component to its destination */
> +         read_result = retype(read_result, BRW_REGISTER_TYPE_DF);
> +         for (int c = 0; c < iter_components; c++) {
> +            bld.MOV(offset(dest, bld, it * 2 + c),
> +                    offset(read_result, bld, c));
> +         }
> +
> +         bld.ADD(read_offset, read_offset, brw_imm_ud(16));
> +      }
> +   }
> +}
> +
> +
> +void
>  fs_visitor::nir_emit_intrinsic(const fs_builder &bld, nir_intrinsic_instr *instr)
>  {
>     fs_reg dest;
> -- 
> 2.5.0
>
> _______________________________________________
> mesa-dev mailing list
> mesa-dev at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 212 bytes
Desc: not available
URL: <https://lists.freedesktop.org/archives/mesa-dev/attachments/20160513/3e78e7f7/attachment.sig>


More information about the mesa-dev mailing list