[Mesa-dev] [PATCH 08/16] nir: lower-io-types pass

Rob Clark robdclark at gmail.com
Sat Apr 2 18:50:05 UTC 2016


On Fri, Apr 1, 2016 at 1:46 PM, Jason Ekstrand <jason at jlekstrand.net> wrote:
>
>
> On Sat, Mar 26, 2016 at 2:02 PM, Rob Clark <robdclark at gmail.com> wrote:
>>
>> From: Rob Clark <robclark at freedesktop.org>
>>
>> A pass to lower complex (struct/array/mat) inputs/outputs to primitive
>> types.  This allows, for example, linking that removes unused components
>> of a larger type which is not indirectly accessed.
>>
>> In the near term, it is needed for gallium (mesa/st) support for NIR,
>> since only used components of a type are assigned VBO slots, and we
>> otherwise have no way to represent that to the driver backend.  But it
>> should be useful for doing shader linking in NIR.
>>
>> Signed-off-by: Rob Clark <robclark at freedesktop.org>
>> ---
>>  src/compiler/Makefile.sources         |   1 +
>>  src/compiler/nir/nir.h                |   1 +
>>  src/compiler/nir/nir_lower_io_types.c | 178
>> ++++++++++++++++++++++++++++++++++
>>  3 files changed, 180 insertions(+)
>>  create mode 100644 src/compiler/nir/nir_lower_io_types.c
>>
>> diff --git a/src/compiler/Makefile.sources b/src/compiler/Makefile.sources
>> index ae7efbf..8f1ffdc 100644
>> --- a/src/compiler/Makefile.sources
>> +++ b/src/compiler/Makefile.sources
>> @@ -196,6 +196,7 @@ NIR_FILES = \
>>         nir/nir_lower_idiv.c \
>>         nir/nir_lower_indirect_derefs.c \
>>         nir/nir_lower_io.c \
>> +       nir/nir_lower_io_types.c \
>>         nir/nir_lower_outputs_to_temporaries.c \
>>         nir/nir_lower_passthrough_edgeflags.c \
>>         nir/nir_lower_phis_to_scalar.c \
>> diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
>> index 65f75bd..7f7c459 100644
>> --- a/src/compiler/nir/nir.h
>> +++ b/src/compiler/nir/nir.h
>> @@ -2172,6 +2172,7 @@ void nir_lower_io(nir_shader *shader,
>>  nir_src *nir_get_io_offset_src(nir_intrinsic_instr *instr);
>>  nir_src *nir_get_io_vertex_index_src(nir_intrinsic_instr *instr);
>>
>> +void nir_lower_io_types(nir_shader *shader, int (*type_size)(const struct
>> glsl_type *));
>>  void nir_lower_vars_to_ssa(nir_shader *shader);
>>
>>  bool nir_remove_dead_variables(nir_shader *shader);
>> diff --git a/src/compiler/nir/nir_lower_io_types.c
>> b/src/compiler/nir/nir_lower_io_types.c
>> new file mode 100644
>> index 0000000..b7d9ebe
>> --- /dev/null
>> +++ b/src/compiler/nir/nir_lower_io_types.c
>> @@ -0,0 +1,178 @@
>> +/*
>> + * Copyright © 2016 Red Hat
>> + *
>> + * Permission is hereby granted, free of charge, to any person obtaining
>> a
>> + * copy of this software and associated documentation files (the
>> "Software"),
>> + * to deal in the Software without restriction, including without
>> limitation
>> + * the rights to use, copy, modify, merge, publish, distribute,
>> sublicense,
>> + * and/or sell copies of the Software, and to permit persons to whom the
>> + * Software is furnished to do so, subject to the following conditions:
>> + *
>> + * The above copyright notice and this permission notice (including the
>> next
>> + * paragraph) shall be included in all copies or substantial portions of
>> the
>> + * Software.
>> + *
>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
>> EXPRESS OR
>> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
>> MERCHANTABILITY,
>> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT
>> SHALL
>> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR
>> OTHER
>> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
>> ARISING FROM,
>> + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
>> IN THE
>> + * SOFTWARE.
>> + */
>> +
>> +#include "nir.h"
>> +#include "nir_builder.h"
>> +
>> +/* Lower complex (struct/array/mat) input and output vars to primitive
>> types
>> + * (vec4) for linking.  All indirect input/output access should already
>> be
>> + * lowered (ie. nir_lower_io_to_temporaries).
>> + */
>> +
>> +struct lower_io_types_state {
>> +   nir_shader *shader;
>> +   struct exec_list new_ins;
>> +   struct exec_list new_outs;
>> +   int (*type_size)(const struct glsl_type *);
>> +};
>> +
>> +static nir_variable *
>> +get_new_var(struct lower_io_types_state *state, nir_variable *var,
>> unsigned off)
>> +{
>> +   struct exec_list *list;
>> +
>> +   if (var->data.mode == nir_var_shader_in) {
>> +      list = &state->new_ins;
>> +   } else {
>> +      assert(var->data.mode == nir_var_shader_out);
>> +      list = &state->new_outs;
>> +   }
>> +
>> +   nir_foreach_variable(nvar, list) {
>> +      if (nvar->data.location == (var->data.location + off))
>> +         return nvar;
>> +   }
>
>
> Doing a linear search here could get expensive.  Why not just have an array
> that maps locations to variables?  I think you have a maximum of 64 possible
> locations (if you disregard tess) so it's not that memory-intensive.

hmm, list of intputs and outputs is pretty small/bounded..  do you
*really* expect it to be a problem?

>>
>> +
>> +   /* doesn't already exist, so we need to create a new one: */
>> +   /* TODO figure out if scalar vs vec, and if float/int/uint/(double?)
>> +    * do we need to fixup interpolation mode for int vs float components
>> +    * of a struct, etc..
>> +    */
>> +   const struct glsl_type *ntype = glsl_vec4_type();
>> +   nir_variable *nvar = nir_variable_create(state->shader,
>> var->data.mode,
>> +                                            ntype, NULL);
>
>
> Do you want to create a new one or clone?  Cloning seems better because that
> would ensure that interp qualifiers etc. get copied over.

I started out that way.. but decided to avoid it for some reason..  I
think just to avoid the duplicate ralloc_asprintf() for the name, but
there might have been another reason..

> Also, I seem to recall it being possible for outputs to have constant
> initializers coming out of GLSL IR.  This pass doesn't handle that.

hmm, I haven't come across any piglit that triggered this.. if you
have some ideas for a .shader_test I could try..

>>
>> +
>> +   nvar->name = ralloc_asprintf(nvar, "%s@%u", var->name, off);
>> +   nvar->data = var->data;
>> +   nvar->data.location += off;
>> +
>> +   /* nir_variable_create is too clever for it's own good: */
>> +   exec_node_remove(&nvar->node);
>> +   exec_node_self_link(&nvar->node);      /* no delinit() :-( */
>
>
> This isn't needed.  Re-insertion just stomps whatever was there before.

maybe.. tbh I'd kinda prefer if _remove() spent the extra couple
instructions to _self_link() rather than leaving booby-traps.  List
stuff can fail in spectacular fashion if you get that sort of thing
wrong and extra couple instructions is lost in the noise.  (Plus,
everything is already in the cache at this point, so you'll have a
hard time proving to me that you could measure a perf diff..)

> Incidentally, if you had an array, you could just let nir_variable_create be
> clever.
>
>>
>> +
>> +   exec_list_push_tail(list, &nvar->node);
>> +
>> +   /* remove existing var from input/output list: */
>> +   exec_node_remove(&var->node);
>> +   exec_node_self_link(&var->node);
>
>
> This isn't needed either.
>
>>
>> +
>> +   return nvar;
>> +}
>> +
>> +static unsigned
>> +get_deref_offset(struct lower_io_types_state *state, nir_deref *tail)
>> +{
>> +   unsigned offset = 0;
>> +
>> +   while (tail->child != NULL) {
>> +      const struct glsl_type *parent_type = tail->type;
>> +      tail = tail->child;
>> +
>> +      if (tail->deref_type == nir_deref_type_array) {
>> +         nir_deref_array *deref_array = nir_deref_as_array(tail);
>> +
>> +         /* indirect inputs/outputs should already be lowered! */
>> +         assert(deref_array->deref_array_type ==
>> nir_deref_array_type_direct);
>> +
>> +         unsigned size = state->type_size(tail->type);
>> +
>> +         offset += size * deref_array->base_offset;
>> +      } else if (tail->deref_type == nir_deref_type_struct) {
>> +         nir_deref_struct *deref_struct = nir_deref_as_struct(tail);
>> +
>> +         for (unsigned i = 0; i < deref_struct->index; i++) {
>> +            offset += state->type_size(glsl_get_struct_field(parent_type,
>> i));
>> +         }
>> +      }
>> +   }
>> +
>> +   return offset;
>> +}
>> +
>> +static bool
>> +lower_io_types_block(nir_block *block, void *void_state)
>> +{
>> +   struct lower_io_types_state *state = void_state;
>> +
>> +   nir_foreach_instr(block, instr) {
>> +      if (instr->type != nir_instr_type_intrinsic)
>> +         continue;
>> +
>> +      nir_intrinsic_instr *intr = nir_instr_as_intrinsic(instr);
>> +
>> +      if ((intr->intrinsic != nir_intrinsic_load_var) &&
>> +          (intr->intrinsic != nir_intrinsic_store_var))
>> +         continue;
>
>
> You should say somewhere in the top comment that copies need to also be
> lowered prior to this pass.

ok, I'll add a comment

>>
>> +
>> +      nir_variable *var = intr->variables[0]->var;
>> +
>> +      if ((var->data.mode != nir_var_shader_in) &&
>> +          (var->data.mode != nir_var_shader_out))
>> +         continue;
>> +
>> +      /* no need to split up if already primitive */
>> +      if (state->type_size(var->type) == 1)
>> +         continue;
>> +
>> +      unsigned off = get_deref_offset(state, &intr->variables[0]->deref);
>> +      nir_variable *nvar = get_new_var(state, var, off);
>> +
>> +      /* and then re-write the load/store_var deref: */
>> +      intr->variables[0] = nir_deref_var_create(intr, nvar);
>> +   }
>> +
>> +   return true;
>> +}
>> +
>> +static void
>> +lower_io_types_impl(nir_function_impl *impl, struct lower_io_types_state
>> *state)
>> +{
>> +   nir_foreach_block(impl, lower_io_types_block, state);
>> +
>> +   nir_metadata_preserve(impl, nir_metadata_block_index |
>> +                               nir_metadata_dominance);
>> +}
>> +
>> +
>> +void
>> +nir_lower_io_types(nir_shader *shader, int (*type_size)(const struct
>> glsl_type *))
>
>
> Just a side-note (not a request) but we really should have a
> "glsl_type_size_func" typedef somewhere.

not a bad idea..  although maybe I'll punt until gallium and i965
type_size fxns are unified (in which case we could drop this arg)..

> Also, what happened to just using nir_type_size_vec4?  Incidentally, I think
> you could also use the newly exposed component_slots function from my
> series.

After the lower_io patch, I didn't need the type_size fxn from
freedreno/ir3.  That and I punted on sorting out the small deltas
between the gallium and i965 versions.. you seemed to thing some of
those diffs might not matter, but I've had this series dragging on for
long enough that I figured I'd avoid introducing dependencies on
sorting that out.  Not like there are a lot of call-sites for this
pass, so I figured no real problem w/ dropping the type_size fxn ptr
in a later patchset.

BR,
-R

>>
>> +{
>> +   struct lower_io_types_state state;
>> +
>> +   /* NOTE: only operates on units of vec4 slots: */
>> +   assert(type_size(glsl_vec4_type()) == 1);
>> +
>> +   state.shader = shader;
>> +   exec_list_make_empty(&state.new_ins);
>> +   exec_list_make_empty(&state.new_outs);
>> +   state.type_size = type_size;
>> +
>> +   nir_foreach_function(shader, function) {
>> +      if (function->impl)
>> +         lower_io_types_impl(function->impl, &state);
>> +   }
>> +
>> +   /* move new in/out vars to shader's lists: */
>> +   exec_list_append(&shader->inputs, &state.new_ins);
>> +   exec_list_append(&shader->outputs, &state.new_outs);
>> +}
>> --
>> 2.5.5
>>
>> _______________________________________________
>> mesa-dev mailing list
>> mesa-dev at lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
>


More information about the mesa-dev mailing list