[Mesa-dev] [PATCH 08/16] nir: lower-io-types pass
Jason Ekstrand
jason at jlekstrand.net
Sat Apr 2 20:49:39 UTC 2016
On Sat, Apr 2, 2016 at 11:50 AM, Rob Clark <robdclark at gmail.com> wrote:
> On Fri, Apr 1, 2016 at 1:46 PM, Jason Ekstrand <jason at jlekstrand.net>
> wrote:
> >
> >
> > On Sat, Mar 26, 2016 at 2:02 PM, Rob Clark <robdclark at gmail.com> wrote:
> >>
> >> From: Rob Clark <robclark at freedesktop.org>
> >>
> >> A pass to lower complex (struct/array/mat) inputs/outputs to primitive
> >> types. This allows, for example, linking that removes unused components
> >> of a larger type which is not indirectly accessed.
> >>
> >> In the near term, it is needed for gallium (mesa/st) support for NIR,
> >> since only used components of a type are assigned VBO slots, and we
> >> otherwise have no way to represent that to the driver backend. But it
> >> should be useful for doing shader linking in NIR.
> >>
> >> Signed-off-by: Rob Clark <robclark at freedesktop.org>
> >> ---
> >> src/compiler/Makefile.sources | 1 +
> >> src/compiler/nir/nir.h | 1 +
> >> src/compiler/nir/nir_lower_io_types.c | 178
> >> ++++++++++++++++++++++++++++++++++
> >> 3 files changed, 180 insertions(+)
> >> create mode 100644 src/compiler/nir/nir_lower_io_types.c
> >>
> >> diff --git a/src/compiler/Makefile.sources
> b/src/compiler/Makefile.sources
> >> index ae7efbf..8f1ffdc 100644
> >> --- a/src/compiler/Makefile.sources
> >> +++ b/src/compiler/Makefile.sources
> >> @@ -196,6 +196,7 @@ NIR_FILES = \
> >> nir/nir_lower_idiv.c \
> >> nir/nir_lower_indirect_derefs.c \
> >> nir/nir_lower_io.c \
> >> + nir/nir_lower_io_types.c \
> >> nir/nir_lower_outputs_to_temporaries.c \
> >> nir/nir_lower_passthrough_edgeflags.c \
> >> nir/nir_lower_phis_to_scalar.c \
> >> diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
> >> index 65f75bd..7f7c459 100644
> >> --- a/src/compiler/nir/nir.h
> >> +++ b/src/compiler/nir/nir.h
> >> @@ -2172,6 +2172,7 @@ void nir_lower_io(nir_shader *shader,
> >> nir_src *nir_get_io_offset_src(nir_intrinsic_instr *instr);
> >> nir_src *nir_get_io_vertex_index_src(nir_intrinsic_instr *instr);
> >>
> >> +void nir_lower_io_types(nir_shader *shader, int (*type_size)(const
> struct
> >> glsl_type *));
> >> void nir_lower_vars_to_ssa(nir_shader *shader);
> >>
> >> bool nir_remove_dead_variables(nir_shader *shader);
> >> diff --git a/src/compiler/nir/nir_lower_io_types.c
> >> b/src/compiler/nir/nir_lower_io_types.c
> >> new file mode 100644
> >> index 0000000..b7d9ebe
> >> --- /dev/null
> >> +++ b/src/compiler/nir/nir_lower_io_types.c
> >> @@ -0,0 +1,178 @@
> >> +/*
> >> + * Copyright © 2016 Red Hat
> >> + *
> >> + * Permission is hereby granted, free of charge, to any person
> obtaining
> >> a
> >> + * copy of this software and associated documentation files (the
> >> "Software"),
> >> + * to deal in the Software without restriction, including without
> >> limitation
> >> + * the rights to use, copy, modify, merge, publish, distribute,
> >> sublicense,
> >> + * and/or sell copies of the Software, and to permit persons to whom
> the
> >> + * Software is furnished to do so, subject to the following conditions:
> >> + *
> >> + * The above copyright notice and this permission notice (including the
> >> next
> >> + * paragraph) shall be included in all copies or substantial portions
> of
> >> the
> >> + * Software.
> >> + *
> >> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
> >> EXPRESS OR
> >> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
> >> MERCHANTABILITY,
> >> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT
> >> SHALL
> >> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR
> >> OTHER
> >> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
> >> ARISING FROM,
> >> + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
> DEALINGS
> >> IN THE
> >> + * SOFTWARE.
> >> + */
> >> +
> >> +#include "nir.h"
> >> +#include "nir_builder.h"
> >> +
> >> +/* Lower complex (struct/array/mat) input and output vars to primitive
> >> types
> >> + * (vec4) for linking. All indirect input/output access should already
> >> be
> >> + * lowered (ie. nir_lower_io_to_temporaries).
> >> + */
> >> +
> >> +struct lower_io_types_state {
> >> + nir_shader *shader;
> >> + struct exec_list new_ins;
> >> + struct exec_list new_outs;
> >> + int (*type_size)(const struct glsl_type *);
> >> +};
> >> +
> >> +static nir_variable *
> >> +get_new_var(struct lower_io_types_state *state, nir_variable *var,
> >> unsigned off)
> >> +{
> >> + struct exec_list *list;
> >> +
> >> + if (var->data.mode == nir_var_shader_in) {
> >> + list = &state->new_ins;
> >> + } else {
> >> + assert(var->data.mode == nir_var_shader_out);
> >> + list = &state->new_outs;
> >> + }
> >> +
> >> + nir_foreach_variable(nvar, list) {
> >> + if (nvar->data.location == (var->data.location + off))
> >> + return nvar;
> >> + }
> >
> >
> > Doing a linear search here could get expensive. Why not just have an
> array
> > that maps locations to variables? I think you have a maximum of 64
> possible
> > locations (if you disregard tess) so it's not that memory-intensive.
>
> hmm, list of intputs and outputs is pretty small/bounded.. do you
> *really* expect it to be a problem?
>
No, it should be fine. It just bothers me a bit :-P
> >>
> >> +
> >> + /* doesn't already exist, so we need to create a new one: */
> >> + /* TODO figure out if scalar vs vec, and if float/int/uint/(double?)
> >> + * do we need to fixup interpolation mode for int vs float
> components
> >> + * of a struct, etc..
> >> + */
> >> + const struct glsl_type *ntype = glsl_vec4_type();
> >> + nir_variable *nvar = nir_variable_create(state->shader,
> >> var->data.mode,
> >> + ntype, NULL);
> >
> >
> > Do you want to create a new one or clone? Cloning seems better because
> that
> > would ensure that interp qualifiers etc. get copied over.
>
> I started out that way.. but decided to avoid it for some reason.. I
> think just to avoid the duplicate ralloc_asprintf() for the name, but
> there might have been another reason..
>
> > Also, I seem to recall it being possible for outputs to have constant
> > initializers coming out of GLSL IR. This pass doesn't handle that.
>
> hmm, I haven't come across any piglit that triggered this.. if you
> have some ideas for a .shader_test I could try..
>
I don't know of anything off-hand that triggers it. I just seem to recall
coming across it at some point. I guess one could put in an assert
somewhere and do a piglit run.
> >>
> >> +
> >> + nvar->name = ralloc_asprintf(nvar, "%s@%u", var->name, off);
> >> + nvar->data = var->data;
> >> + nvar->data.location += off;
> >> +
> >> + /* nir_variable_create is too clever for it's own good: */
> >> + exec_node_remove(&nvar->node);
> >> + exec_node_self_link(&nvar->node); /* no delinit() :-( */
> >
> >
> > This isn't needed. Re-insertion just stomps whatever was there before.
>
> maybe.. tbh I'd kinda prefer if _remove() spent the extra couple
> instructions to _self_link() rather than leaving booby-traps. List
> stuff can fail in spectacular fashion if you get that sort of thing
> wrong and extra couple instructions is lost in the noise. (Plus,
> everything is already in the cache at this point, so you'll have a
> hard time proving to me that you could measure a perf diff..)
>
Sure... I've made that argument on other linked list implementations in the
past. Feel free to send the patch and see what the reaction is :-p
> > Incidentally, if you had an array, you could just let
> nir_variable_create be
> > clever.
> >
> >>
> >> +
> >> + exec_list_push_tail(list, &nvar->node);
> >> +
> >> + /* remove existing var from input/output list: */
> >> + exec_node_remove(&var->node);
> >> + exec_node_self_link(&var->node);
> >
> >
> > This isn't needed either.
> >
> >>
> >> +
> >> + return nvar;
> >> +}
> >> +
> >> +static unsigned
> >> +get_deref_offset(struct lower_io_types_state *state, nir_deref *tail)
> >> +{
> >> + unsigned offset = 0;
> >> +
> >> + while (tail->child != NULL) {
> >> + const struct glsl_type *parent_type = tail->type;
> >> + tail = tail->child;
> >> +
> >> + if (tail->deref_type == nir_deref_type_array) {
> >> + nir_deref_array *deref_array = nir_deref_as_array(tail);
> >> +
> >> + /* indirect inputs/outputs should already be lowered! */
> >> + assert(deref_array->deref_array_type ==
> >> nir_deref_array_type_direct);
> >> +
> >> + unsigned size = state->type_size(tail->type);
> >> +
> >> + offset += size * deref_array->base_offset;
> >> + } else if (tail->deref_type == nir_deref_type_struct) {
> >> + nir_deref_struct *deref_struct = nir_deref_as_struct(tail);
> >> +
> >> + for (unsigned i = 0; i < deref_struct->index; i++) {
> >> + offset +=
> state->type_size(glsl_get_struct_field(parent_type,
> >> i));
> >> + }
> >> + }
> >> + }
> >> +
> >> + return offset;
> >> +}
> >> +
> >> +static bool
> >> +lower_io_types_block(nir_block *block, void *void_state)
> >> +{
> >> + struct lower_io_types_state *state = void_state;
> >> +
> >> + nir_foreach_instr(block, instr) {
> >> + if (instr->type != nir_instr_type_intrinsic)
> >> + continue;
> >> +
> >> + nir_intrinsic_instr *intr = nir_instr_as_intrinsic(instr);
> >> +
> >> + if ((intr->intrinsic != nir_intrinsic_load_var) &&
> >> + (intr->intrinsic != nir_intrinsic_store_var))
> >> + continue;
> >
> >
> > You should say somewhere in the top comment that copies need to also be
> > lowered prior to this pass.
>
> ok, I'll add a comment
>
> >>
> >> +
> >> + nir_variable *var = intr->variables[0]->var;
> >> +
> >> + if ((var->data.mode != nir_var_shader_in) &&
> >> + (var->data.mode != nir_var_shader_out))
> >> + continue;
> >> +
> >> + /* no need to split up if already primitive */
> >> + if (state->type_size(var->type) == 1)
> >> + continue;
> >> +
> >> + unsigned off = get_deref_offset(state,
> &intr->variables[0]->deref);
> >> + nir_variable *nvar = get_new_var(state, var, off);
> >> +
> >> + /* and then re-write the load/store_var deref: */
> >> + intr->variables[0] = nir_deref_var_create(intr, nvar);
> >> + }
> >> +
> >> + return true;
> >> +}
> >> +
> >> +static void
> >> +lower_io_types_impl(nir_function_impl *impl, struct
> lower_io_types_state
> >> *state)
> >> +{
> >> + nir_foreach_block(impl, lower_io_types_block, state);
> >> +
> >> + nir_metadata_preserve(impl, nir_metadata_block_index |
> >> + nir_metadata_dominance);
> >> +}
> >> +
> >> +
> >> +void
> >> +nir_lower_io_types(nir_shader *shader, int (*type_size)(const struct
> >> glsl_type *))
> >
> >
> > Just a side-note (not a request) but we really should have a
> > "glsl_type_size_func" typedef somewhere.
>
> not a bad idea.. although maybe I'll punt until gallium and i965
> type_size fxns are unified (in which case we could drop this arg)..
>
> > Also, what happened to just using nir_type_size_vec4? Incidentally, I
> think
> > you could also use the newly exposed component_slots function from my
> > series.
>
> After the lower_io patch, I didn't need the type_size fxn from
> freedreno/ir3. That and I punted on sorting out the small deltas
> between the gallium and i965 versions.. you seemed to thing some of
> those diffs might not matter, but I've had this series dragging on for
> long enough that I figured I'd avoid introducing dependencies on
> sorting that out. Not like there are a lot of call-sites for this
> pass, so I figured no real problem w/ dropping the type_size fxn ptr
> in a later patchset.
>
Sure. I don't want to block on that. I do think component_slots is
more-or-less the "correct" thing to do since it uses the exact same units
as var->data.location.
I'm not too worried about the other comments.
Reviewed-by: Jason Ekstrand <jason at jlekstrand.net>
> BR,
> -R
>
> >>
> >> +{
> >> + struct lower_io_types_state state;
> >> +
> >> + /* NOTE: only operates on units of vec4 slots: */
> >> + assert(type_size(glsl_vec4_type()) == 1);
> >> +
> >> + state.shader = shader;
> >> + exec_list_make_empty(&state.new_ins);
> >> + exec_list_make_empty(&state.new_outs);
> >> + state.type_size = type_size;
> >> +
> >> + nir_foreach_function(shader, function) {
> >> + if (function->impl)
> >> + lower_io_types_impl(function->impl, &state);
> >> + }
> >> +
> >> + /* move new in/out vars to shader's lists: */
> >> + exec_list_append(&shader->inputs, &state.new_ins);
> >> + exec_list_append(&shader->outputs, &state.new_outs);
> >> +}
> >> --
> >> 2.5.5
> >>
> >> _______________________________________________
> >> mesa-dev mailing list
> >> mesa-dev at lists.freedesktop.org
> >> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
> >
> >
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.freedesktop.org/archives/mesa-dev/attachments/20160402/5603d4ab/attachment-0001.html>
More information about the mesa-dev
mailing list