[Mesa-dev] [PATCH] i965: Separate base offset/constant offset combining from remapping.

Wed Dec 9 12:08:51 PST 2015

On Wednesday, December 09, 2015 08:03:25 AM Jason Ekstrand wrote:
> On Dec 9, 2015 2:51 AM, "Kenneth Graunke" <kenneth at whitecape.org> wrote:
> >
> > My tessellation branch has two additional remap functions.  I don't want
> > to replicate this logic there.
> >
> > Signed-off-by: Kenneth Graunke <kenneth at whitecape.org>
> > ---
> >  src/mesa/drivers/dri/i965/brw_nir.c | 78
> ++++++++++++++++++++++++-------------
> >  1 file changed, 50 insertions(+), 28 deletions(-)
> >
> > Hey Jason,
> >
> > If you like this patch, and haven't yet merged your NIR input reworks,
> > feel free to just squash it into your changes.  Or, we can land it
> > separately after your changes.  It's up to you.
> >
> > Separating this out allows me to reuse this in my new tessellation input
> > and output remapping functions, and also means we don't need to add
> structs
> > for the remap functions...we can just pass the builder, or inputs_read, or
> > the VUE map...and not have to pack multiple things together.
> 
> Sure.  It does make things simpler.
> 
> >  --Ken
> >
> > diff --git a/src/mesa/drivers/dri/i965/brw_nir.c
> b/src/mesa/drivers/dri/i965/brw_nir.c
> > index 14ad172..105a175 100644
> > --- a/src/mesa/drivers/dri/i965/brw_nir.c
> > +++ b/src/mesa/drivers/dri/i965/brw_nir.c
> > @@ -27,15 +27,19 @@
> >  #include "glsl/nir/nir_builder.h"
> >  #include "program/prog_to_nir.h"
> >
> > -struct remap_vs_attrs_state {
> > -   nir_builder b;
> > -   uint64_t inputs_read;
> > -};
> > -
> > +/**
> > + * In many cases, we just add the base and offset together, so there's no
> > + * reason to keep them separate.  Sometimes, combining them is essential:
> > + * if a shader only accesses part of a compound variable (such as a
> matrix
> > + * or array), the variable's base may not actually exist in the VUE map.
> > + *
> > + * This pass adds constant offsets to instr->const_index[0], and resets
> > + * the offset source to 0.  Non-constant offsets remain unchanged.
> > + */
> >  static bool
> > -remap_vs_attrs(nir_block *block, void *void_state)
> > +add_const_offset_to_base(nir_block *block, void *closure)
> >  {
> > -   struct remap_vs_attrs_state *state = void_state;
> > +   nir_builder *b = closure;
> >
> >     nir_foreach_instr_safe(block, instr) {
> >        if (instr->type != nir_instr_type_intrinsic)
> > @@ -43,30 +47,48 @@ remap_vs_attrs(nir_block *block, void *void_state)
> >
> >        nir_intrinsic_instr *intrin = nir_instr_as_intrinsic(instr);
> >
> > +      if (intrin->intrinsic == nir_intrinsic_load_input ||
> > +          intrin->intrinsic == nir_intrinsic_load_per_vertex_input ||
> > +          intrin->intrinsic == nir_intrinsic_load_output ||
> > +          intrin->intrinsic == nir_intrinsic_load_per_vertex_output ||
> > +          intrin->intrinsic == nir_intrinsic_store_output ||
> > +          intrin->intrinsic == nir_intrinsic_store_per_vertex_output) {
> 
> This seems a bit scortched-earth. It would be nice if the caller had a bit
> more control.

We could always add "do input"/"do output" options once we actually need
them.  It means adding the structs back, but that's fine.

I suppose for now we could ignore output intrinsics here.

> > +         nir_src *offset = nir_get_io_offset_src(intrin);
> > +         nir_const_value *const_offset = nir_src_as_const_value(*offset);
> > +
> > +         if (const_offset) {
> > +            intrin->const_index[0] += const_offset->u[0];
> > +            b->cursor = nir_before_instr(&intrin->instr);
> > +            nir_instr_rewrite_src(&intrin->instr, offset,
> > +                                  nir_src_for_ssa(nir_imm_int(b, 0)));
> > +         }
> 
> Else???  It seems that you don't want to run this pass if you think you'll
> ever hit an indirect.  I guess it's harmless to just do this for all direct
> things in our driver, but it doesn't sit well.

Else do nothing.   The problem I'm trying to avoid with this logic is
that inputs/outputs which take up multiple slots and are directly
accessed may only have slots assigned for the sub-components that are
actually used.  So, I can't use the base offset, and need to move the
base to the slot that's actually used.

If something is accessed indirectly, we assign all of its slots, so
using the base location is fine.

> > +      }
> > +   }
> > +   return true;
> > +
> > +}
> > +
> > +static bool
> > +remap_vs_attrs(nir_block *block, void *closure)
> > +{
> > +   GLbitfield64 inputs_read = *((GLbitfield64 *) closure);
> > +
> > +   nir_foreach_instr(block, instr) {
> > +      if (instr->type != nir_instr_type_intrinsic)
> > +         continue;
> > +
> > +      nir_intrinsic_instr *intrin = nir_instr_as_intrinsic(instr);
> > +
> >        if (intrin->intrinsic == nir_intrinsic_load_input) {
> >           /* Attributes come in a contiguous block, ordered by their
> >            * gl_vert_attrib value.  That means we can compute the slot
> >            * number for an attribute by masking out the enabled attributes
> >            * before it and counting the bits.
> >            */
> > -         nir_const_value *const_offset =
> nir_src_as_const_value(intrin->src[0]);
> > -
> > -         /* We set EmitNoIndirect for VS inputs, so there are no
> indirects. */
> > -         assert(const_offset);
> > -
> > -         int attr = intrin->const_index[0] + const_offset->u[0];
> > -         int slot = _mesa_bitcount_64(state->inputs_read &
> > -                                      BITFIELD64_MASK(attr));
> > +         int attr = intrin->const_index[0];
> > +         int slot = _mesa_bitcount_64(inputs_read &
> BITFIELD64_MASK(attr));
> >
> > -         /* The NIR -> FS pass will just add the base and offset
> together, so
> > -          * there's no reason to keep them separate.  Just put it all in
> > -          * const_index[0] and set the offset src[0] to load_const(0).
> > -          */
> >           intrin->const_index[0] = 4 * slot;
> > -
> > -         state->b.cursor = nir_before_instr(&intrin->instr);
> > -         nir_instr_rewrite_src(&intrin->instr, &intrin->src[0],
> > -                               nir_src_for_ssa(nir_imm_int(&state->b,
> 0)));
> >        }
> >     }
> >     return true;
> > @@ -97,17 +119,17 @@ brw_nir_lower_inputs(nir_shader *nir,
> >            * key->inputs_read since the two are identical aside from
> Gen4-5
> >            * edge flag differences.
> >            */
> > -         struct remap_vs_attrs_state remap_state = {
> > -            .inputs_read = nir->info.inputs_read,
> > -         };
> > +         GLbitfield64 inputs_read = nir->info.inputs_read;
> >
> >           /* This pass needs actual constants */
> >           nir_opt_constant_folding(nir);
> >
> >           nir_foreach_overload(nir, overload) {
> >              if (overload->impl) {
> > -               nir_builder_init(&remap_state.b, overload->impl);
> > -               nir_foreach_block(overload->impl, remap_vs_attrs,
> &remap_state);
> > +               nir_builder b;
> > +               nir_builder_init(&b, overload->impl);
> > +               nir_foreach_block(overload->impl,
> add_const_offset_to_base, &b);
> 
> What really bothers me is that this call is in remap_vs_attrs but it stomps
> on store_output intrinsics.  That seems bad.

It's not really part of remap_vs_attrs, but you're right, it is part of
brw_nir_lower_inputs().  So affecting outputs seems rather dirty.

How about we just drop outputs for now, and I can fix it in my series?

> Overall, I think this is a fine approach but I wish it were more
> fine-grained.  Unfortunately, I don't have any hot ideas on how to make it
> better off-hand.
> 
> --Jason
> 
> > +               nir_foreach_block(overload->impl, remap_vs_attrs,
> &inputs_read);
> >              }
> >           }
> >        }
> > --
> > 2.6.3
> >
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 819 bytes
Desc: This is a digitally signed message part.
URL: <http://lists.freedesktop.org/archives/mesa-dev/attachments/20151209/078ecc57/attachment.sig>