[Mesa-dev] [PATCH 08/16] nir: lower-io-types pass

Rob Clark robdclark at gmail.com
Sat Apr 2 21:13:26 UTC 2016


On Sat, Apr 2, 2016 at 4:49 PM, Jason Ekstrand <jason at jlekstrand.net> wrote:
>
>
> On Sat, Apr 2, 2016 at 11:50 AM, Rob Clark <robdclark at gmail.com> wrote:
>>
>> On Fri, Apr 1, 2016 at 1:46 PM, Jason Ekstrand <jason at jlekstrand.net>
>> wrote:
>> >
>> >
>> > On Sat, Mar 26, 2016 at 2:02 PM, Rob Clark <robdclark at gmail.com> wrote:
>> >>
>> >> From: Rob Clark <robclark at freedesktop.org>
>> >>
>> >> A pass to lower complex (struct/array/mat) inputs/outputs to primitive
>> >> types.  This allows, for example, linking that removes unused
>> >> components
>> >> of a larger type which is not indirectly accessed.
>> >>
>> >> In the near term, it is needed for gallium (mesa/st) support for NIR,
>> >> since only used components of a type are assigned VBO slots, and we
>> >> otherwise have no way to represent that to the driver backend.  But it
>> >> should be useful for doing shader linking in NIR.
>> >>
>> >> Signed-off-by: Rob Clark <robclark at freedesktop.org>
>> >> ---
>> >>  src/compiler/Makefile.sources         |   1 +
>> >>  src/compiler/nir/nir.h                |   1 +
>> >>  src/compiler/nir/nir_lower_io_types.c | 178
>> >> ++++++++++++++++++++++++++++++++++
>> >>  3 files changed, 180 insertions(+)
>> >>  create mode 100644 src/compiler/nir/nir_lower_io_types.c
>> >>
>> >> diff --git a/src/compiler/Makefile.sources
>> >> b/src/compiler/Makefile.sources
>> >> index ae7efbf..8f1ffdc 100644
>> >> --- a/src/compiler/Makefile.sources
>> >> +++ b/src/compiler/Makefile.sources
>> >> @@ -196,6 +196,7 @@ NIR_FILES = \
>> >>         nir/nir_lower_idiv.c \
>> >>         nir/nir_lower_indirect_derefs.c \
>> >>         nir/nir_lower_io.c \
>> >> +       nir/nir_lower_io_types.c \
>> >>         nir/nir_lower_outputs_to_temporaries.c \
>> >>         nir/nir_lower_passthrough_edgeflags.c \
>> >>         nir/nir_lower_phis_to_scalar.c \
>> >> diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
>> >> index 65f75bd..7f7c459 100644
>> >> --- a/src/compiler/nir/nir.h
>> >> +++ b/src/compiler/nir/nir.h
>> >> @@ -2172,6 +2172,7 @@ void nir_lower_io(nir_shader *shader,
>> >>  nir_src *nir_get_io_offset_src(nir_intrinsic_instr *instr);
>> >>  nir_src *nir_get_io_vertex_index_src(nir_intrinsic_instr *instr);
>> >>
>> >> +void nir_lower_io_types(nir_shader *shader, int (*type_size)(const
>> >> struct
>> >> glsl_type *));
>> >>  void nir_lower_vars_to_ssa(nir_shader *shader);
>> >>
>> >>  bool nir_remove_dead_variables(nir_shader *shader);
>> >> diff --git a/src/compiler/nir/nir_lower_io_types.c
>> >> b/src/compiler/nir/nir_lower_io_types.c
>> >> new file mode 100644
>> >> index 0000000..b7d9ebe
>> >> --- /dev/null
>> >> +++ b/src/compiler/nir/nir_lower_io_types.c
>> >> @@ -0,0 +1,178 @@
>> >> +/*
>> >> + * Copyright © 2016 Red Hat
>> >> + *
>> >> + * Permission is hereby granted, free of charge, to any person
>> >> obtaining
>> >> a
>> >> + * copy of this software and associated documentation files (the
>> >> "Software"),
>> >> + * to deal in the Software without restriction, including without
>> >> limitation
>> >> + * the rights to use, copy, modify, merge, publish, distribute,
>> >> sublicense,
>> >> + * and/or sell copies of the Software, and to permit persons to whom
>> >> the
>> >> + * Software is furnished to do so, subject to the following
>> >> conditions:
>> >> + *
>> >> + * The above copyright notice and this permission notice (including
>> >> the
>> >> next
>> >> + * paragraph) shall be included in all copies or substantial portions
>> >> of
>> >> the
>> >> + * Software.
>> >> + *
>> >> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
>> >> EXPRESS OR
>> >> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
>> >> MERCHANTABILITY,
>> >> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT
>> >> SHALL
>> >> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES
>> >> OR
>> >> OTHER
>> >> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
>> >> ARISING FROM,
>> >> + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
>> >> DEALINGS
>> >> IN THE
>> >> + * SOFTWARE.
>> >> + */
>> >> +
>> >> +#include "nir.h"
>> >> +#include "nir_builder.h"
>> >> +
>> >> +/* Lower complex (struct/array/mat) input and output vars to primitive
>> >> types
>> >> + * (vec4) for linking.  All indirect input/output access should
>> >> already
>> >> be
>> >> + * lowered (ie. nir_lower_io_to_temporaries).
>> >> + */
>> >> +
>> >> +struct lower_io_types_state {
>> >> +   nir_shader *shader;
>> >> +   struct exec_list new_ins;
>> >> +   struct exec_list new_outs;
>> >> +   int (*type_size)(const struct glsl_type *);
>> >> +};
>> >> +
>> >> +static nir_variable *
>> >> +get_new_var(struct lower_io_types_state *state, nir_variable *var,
>> >> unsigned off)
>> >> +{
>> >> +   struct exec_list *list;
>> >> +
>> >> +   if (var->data.mode == nir_var_shader_in) {
>> >> +      list = &state->new_ins;
>> >> +   } else {
>> >> +      assert(var->data.mode == nir_var_shader_out);
>> >> +      list = &state->new_outs;
>> >> +   }
>> >> +
>> >> +   nir_foreach_variable(nvar, list) {
>> >> +      if (nvar->data.location == (var->data.location + off))
>> >> +         return nvar;
>> >> +   }
>> >
>> >
>> > Doing a linear search here could get expensive.  Why not just have an
>> > array
>> > that maps locations to variables?  I think you have a maximum of 64
>> > possible
>> > locations (if you disregard tess) so it's not that memory-intensive.
>>
>> hmm, list of intputs and outputs is pretty small/bounded..  do you
>> *really* expect it to be a problem?
>
>
> No, it should be fine.  It just bothers me a bit :-P
>
>>
>> >>
>> >> +
>> >> +   /* doesn't already exist, so we need to create a new one: */
>> >> +   /* TODO figure out if scalar vs vec, and if
>> >> float/int/uint/(double?)
>> >> +    * do we need to fixup interpolation mode for int vs float
>> >> components
>> >> +    * of a struct, etc..
>> >> +    */
>> >> +   const struct glsl_type *ntype = glsl_vec4_type();
>> >> +   nir_variable *nvar = nir_variable_create(state->shader,
>> >> var->data.mode,
>> >> +                                            ntype, NULL);
>> >
>> >
>> > Do you want to create a new one or clone?  Cloning seems better because
>> > that
>> > would ensure that interp qualifiers etc. get copied over.
>>
>> I started out that way.. but decided to avoid it for some reason..  I
>> think just to avoid the duplicate ralloc_asprintf() for the name, but
>> there might have been another reason..
>>
>> > Also, I seem to recall it being possible for outputs to have constant
>> > initializers coming out of GLSL IR.  This pass doesn't handle that.
>>
>> hmm, I haven't come across any piglit that triggered this.. if you
>> have some ideas for a .shader_test I could try..
>
>
> I don't know of anything off-hand that triggers it.  I just seem to recall
> coming across it at some point.  I guess one could put in an assert
> somewhere and do a piglit run.

I guess if there was something that was failing before in tgsi case, I
might have missed it.  But set of regressions compared to tgsi case is
pretty small and well understood (and mostly related to
variable-indexing issues in ir3 which would be solved if I lowered
vars to regs)

It could ofc be a gap in piglit coverage (or just some test that is
skip'd on freedreno if it needs a higher gl level or some extension we
don't have)..

>>
>> >>
>> >> +
>> >> +   nvar->name = ralloc_asprintf(nvar, "%s@%u", var->name, off);
>> >> +   nvar->data = var->data;
>> >> +   nvar->data.location += off;
>> >> +
>> >> +   /* nir_variable_create is too clever for it's own good: */
>> >> +   exec_node_remove(&nvar->node);
>> >> +   exec_node_self_link(&nvar->node);      /* no delinit() :-( */
>> >
>> >
>> > This isn't needed.  Re-insertion just stomps whatever was there before.
>>
>> maybe.. tbh I'd kinda prefer if _remove() spent the extra couple
>> instructions to _self_link() rather than leaving booby-traps.  List
>> stuff can fail in spectacular fashion if you get that sort of thing
>> wrong and extra couple instructions is lost in the noise.  (Plus,
>> everything is already in the cache at this point, so you'll have a
>> hard time proving to me that you could measure a perf diff..)
>
>
> Sure... I've made that argument on other linked list implementations in the
> past.  Feel free to send the patch and see what the reaction is :-p

heh, sent.. let's see what happens ;-)

>>
>> > Incidentally, if you had an array, you could just let
>> > nir_variable_create be
>> > clever.
>> >
>> >>
>> >> +
>> >> +   exec_list_push_tail(list, &nvar->node);
>> >> +
>> >> +   /* remove existing var from input/output list: */
>> >> +   exec_node_remove(&var->node);
>> >> +   exec_node_self_link(&var->node);
>> >
>> >
>> > This isn't needed either.
>> >
>> >>
>> >> +
>> >> +   return nvar;
>> >> +}
>> >> +
>> >> +static unsigned
>> >> +get_deref_offset(struct lower_io_types_state *state, nir_deref *tail)
>> >> +{
>> >> +   unsigned offset = 0;
>> >> +
>> >> +   while (tail->child != NULL) {
>> >> +      const struct glsl_type *parent_type = tail->type;
>> >> +      tail = tail->child;
>> >> +
>> >> +      if (tail->deref_type == nir_deref_type_array) {
>> >> +         nir_deref_array *deref_array = nir_deref_as_array(tail);
>> >> +
>> >> +         /* indirect inputs/outputs should already be lowered! */
>> >> +         assert(deref_array->deref_array_type ==
>> >> nir_deref_array_type_direct);
>> >> +
>> >> +         unsigned size = state->type_size(tail->type);
>> >> +
>> >> +         offset += size * deref_array->base_offset;
>> >> +      } else if (tail->deref_type == nir_deref_type_struct) {
>> >> +         nir_deref_struct *deref_struct = nir_deref_as_struct(tail);
>> >> +
>> >> +         for (unsigned i = 0; i < deref_struct->index; i++) {
>> >> +            offset +=
>> >> state->type_size(glsl_get_struct_field(parent_type,
>> >> i));
>> >> +         }
>> >> +      }
>> >> +   }
>> >> +
>> >> +   return offset;
>> >> +}
>> >> +
>> >> +static bool
>> >> +lower_io_types_block(nir_block *block, void *void_state)
>> >> +{
>> >> +   struct lower_io_types_state *state = void_state;
>> >> +
>> >> +   nir_foreach_instr(block, instr) {
>> >> +      if (instr->type != nir_instr_type_intrinsic)
>> >> +         continue;
>> >> +
>> >> +      nir_intrinsic_instr *intr = nir_instr_as_intrinsic(instr);
>> >> +
>> >> +      if ((intr->intrinsic != nir_intrinsic_load_var) &&
>> >> +          (intr->intrinsic != nir_intrinsic_store_var))
>> >> +         continue;
>> >
>> >
>> > You should say somewhere in the top comment that copies need to also be
>> > lowered prior to this pass.
>>
>> ok, I'll add a comment
>>
>> >>
>> >> +
>> >> +      nir_variable *var = intr->variables[0]->var;
>> >> +
>> >> +      if ((var->data.mode != nir_var_shader_in) &&
>> >> +          (var->data.mode != nir_var_shader_out))
>> >> +         continue;
>> >> +
>> >> +      /* no need to split up if already primitive */
>> >> +      if (state->type_size(var->type) == 1)
>> >> +         continue;
>> >> +
>> >> +      unsigned off = get_deref_offset(state,
>> >> &intr->variables[0]->deref);
>> >> +      nir_variable *nvar = get_new_var(state, var, off);
>> >> +
>> >> +      /* and then re-write the load/store_var deref: */
>> >> +      intr->variables[0] = nir_deref_var_create(intr, nvar);
>> >> +   }
>> >> +
>> >> +   return true;
>> >> +}
>> >> +
>> >> +static void
>> >> +lower_io_types_impl(nir_function_impl *impl, struct
>> >> lower_io_types_state
>> >> *state)
>> >> +{
>> >> +   nir_foreach_block(impl, lower_io_types_block, state);
>> >> +
>> >> +   nir_metadata_preserve(impl, nir_metadata_block_index |
>> >> +                               nir_metadata_dominance);
>> >> +}
>> >> +
>> >> +
>> >> +void
>> >> +nir_lower_io_types(nir_shader *shader, int (*type_size)(const struct
>> >> glsl_type *))
>> >
>> >
>> > Just a side-note (not a request) but we really should have a
>> > "glsl_type_size_func" typedef somewhere.
>>
>> not a bad idea..  although maybe I'll punt until gallium and i965
>> type_size fxns are unified (in which case we could drop this arg)..
>>
>> > Also, what happened to just using nir_type_size_vec4?  Incidentally, I
>> > think
>> > you could also use the newly exposed component_slots function from my
>> > series.
>>
>> After the lower_io patch, I didn't need the type_size fxn from
>> freedreno/ir3.  That and I punted on sorting out the small deltas
>> between the gallium and i965 versions.. you seemed to thing some of
>> those diffs might not matter, but I've had this series dragging on for
>> long enough that I figured I'd avoid introducing dependencies on
>> sorting that out.  Not like there are a lot of call-sites for this
>> pass, so I figured no real problem w/ dropping the type_size fxn ptr
>> in a later patchset.
>
>
> Sure.  I don't want to block on that.  I do think component_slots is
> more-or-less the "correct" thing to do since it uses the exact same units as
> var->data.location.

yeah, I'll have a look at component_slots stuff.. requiring
type_size_vec4 seems a bit awkward in the current patch..

BR,
-R

> I'm not too worried about the other comments.
>
> Reviewed-by: Jason Ekstrand <jason at jlekstrand.net>
>
>>
>> BR,
>> -R
>>
>> >>
>> >> +{
>> >> +   struct lower_io_types_state state;
>> >> +
>> >> +   /* NOTE: only operates on units of vec4 slots: */
>> >> +   assert(type_size(glsl_vec4_type()) == 1);
>> >> +
>> >> +   state.shader = shader;
>> >> +   exec_list_make_empty(&state.new_ins);
>> >> +   exec_list_make_empty(&state.new_outs);
>> >> +   state.type_size = type_size;
>> >> +
>> >> +   nir_foreach_function(shader, function) {
>> >> +      if (function->impl)
>> >> +         lower_io_types_impl(function->impl, &state);
>> >> +   }
>> >> +
>> >> +   /* move new in/out vars to shader's lists: */
>> >> +   exec_list_append(&shader->inputs, &state.new_ins);
>> >> +   exec_list_append(&shader->outputs, &state.new_outs);
>> >> +}
>> >> --
>> >> 2.5.5
>> >>
>> >> _______________________________________________
>> >> mesa-dev mailing list
>> >> mesa-dev at lists.freedesktop.org
>> >> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>> >
>> >
>
>


More information about the mesa-dev mailing list