<p dir="ltr"><br>
On Sep 10, 2015 7:39 PM, "Jason Ekstrand" <<a href="mailto:jason@jlekstrand.net">jason@jlekstrand.net</a>> wrote:<br>
><br>
> On Thu, Sep 10, 2015 at 2:39 PM, Rob Clark <<a href="mailto:robdclark@gmail.com">robdclark@gmail.com</a>> wrote:<br>
> > From: Rob Clark <<a href="mailto:robclark@freedesktop.org">robclark@freedesktop.org</a>><br>
> ><br>
> > So this is basically working as a lowering pass for handling user-clip-<br>
> > planes, and frag-shader emulation of CLIPDIST for hardware that needs<br>
> > to handle this in the shader.<br>
> ><br>
> > For user-clip-planes, instructions are inserted to calculate CLIPDIST<br>
> > in the VS.  And in both cases, discard_if's are inserted in FS to<br>
> > handle the clipping.<br>
> ><br>
> > NOTE: This currently requires a bit of a hack in nir_validate, which<br>
> > is unhappy about things like:<br>
> ><br>
> >         decl_var uniform  vec4[10] uniform_0 (0, 0)<br>
> >         decl_var shader_out  vec4 out_0 (0, 0)<br>
> >         decl_var shader_out  vec4 clipdist_1 (17, 1)<br>
> >         decl_var shader_out  vec4 clipdist_2 (18, 2)<br>
> >         decl_overload main returning void<br>
> ><br>
> >         impl main {<br>
> >                 block block_0:<br>
> >                 /* preds: */<br>
> >                 ...<br>
> >                 /* succs: block_1 */<br>
> >                 block block_1:<br>
> >                 /* preds: block_0 */<br>
> >                 ... inserted CLIPDIST calculations ...<br>
> >                 /* succs: block_2 */<br>
> >                 block block_2:<br>
> >         }<br>
> ><br>
> > The thing is, repurposing the end_block and creating a new end_block,<br>
> > without having to consider all the different potential successors,<br>
> > seems like by far the easiest way to insert code at the end of a<br>
> > shader.  So probably useful in other cases.  (I do something similar<br>
> > internally in ir3 for transform-feedback).<br>
> ><br>
> > Seems like the easiest thing would be to relax the restriction in<br>
> > nir_validate, but I'm not sure if this will break other assumptions<br>
> > elsewhere.<br>
><br>
> Yes it does break assumptions.  In particular, it breaks the<br>
> assumption that the end-block is empty :-)  In all seriousness though,<br>
> the idea is that once you jump to the end, you're done.  The end block<br>
> is the place where returns go.  You can't put code after a return<br>
> (Ok, yeah, exception handling and destructors etc...).</p>
<p dir="ltr">No, I insert a new end_block, so that particular assumption should be safe... I was just wondering if there was some less obvious assumption..<br></p>
<p dir="ltr">> That said, you're in TGSI land where function inlining and early<br>
> return lowering has already happened so there are no early returns.<br>
> This means that the end block has exactly one predecessor and you can<br>
> put your stuff there.  If you want to make it obvious, feel free to<br>
> add a nir_end_of_impl constructor for nir_cursor which asserts that<br>
> there is exactly one predecessor and calls nir_after_block_before_jump<br>
> on that block.  That would make it both easy and correct.<br>
><br>
> --Jason<br>
><br>
> > TODO: sort out hack in nir_validate, reindent, etc.<br>
> ><br>
> > ---<br>
> >  src/glsl/Makefile.sources     |   1 +<br>
> >  src/glsl/nir/nir.h            |   3 +<br>
> >  src/glsl/nir/nir_lower_clip.c | 346 ++++++++++++++++++++++++++++++++++++++++++<br>
> >  src/glsl/nir/nir_validate.c   |   3 +-<br>
> >  4 files changed, 351 insertions(+), 2 deletions(-)<br>
> >  create mode 100644 src/glsl/nir/nir_lower_clip.c<br>
> ><br>
> > diff --git a/src/glsl/Makefile.sources b/src/glsl/Makefile.sources<br>
> > index da7fdf9..5d98b7b 100644<br>
> > --- a/src/glsl/Makefile.sources<br>
> > +++ b/src/glsl/Makefile.sources<br>
> > @@ -35,6 +35,7 @@ NIR_FILES = \<br>
> >         nir/nir_live_variables.c \<br>
> >         nir/nir_lower_alu_to_scalar.c \<br>
> >         nir/nir_lower_atomics.c \<br>
> > +       nir/nir_lower_clip.c \<br>
> >         nir/nir_lower_global_vars_to_local.c \<br>
> >         nir/nir_lower_load_const_to_scalar.c \<br>
> >         nir/nir_lower_locals_to_regs.c \<br>
> > diff --git a/src/glsl/nir/nir.h b/src/glsl/nir/nir.h<br>
> > index 2ba7731..8544140 100644<br>
> > --- a/src/glsl/nir/nir.h<br>
> > +++ b/src/glsl/nir/nir.h<br>
> > @@ -1811,6 +1811,9 @@ void nir_lower_system_values(nir_shader *shader);<br>
> >  void nir_lower_tex_projector(nir_shader *shader);<br>
> >  void nir_lower_idiv(nir_shader *shader);<br>
> ><br>
> > +void nir_lower_clip_vs(nir_shader *shader, unsigned ucp_enables);<br>
> > +void nir_lower_clip_fs(nir_shader *shader, unsigned ucp_enables);<br>
> > +<br>
> >  void nir_lower_atomics(nir_shader *shader);<br>
> >  void nir_lower_to_source_mods(nir_shader *shader);<br>
> ><br>
> > diff --git a/src/glsl/nir/nir_lower_clip.c b/src/glsl/nir/nir_lower_clip.c<br>
> > new file mode 100644<br>
> > index 0000000..54ebe7b<br>
> > --- /dev/null<br>
> > +++ b/src/glsl/nir/nir_lower_clip.c<br>
> > @@ -0,0 +1,346 @@<br>
> > +/*<br>
> > + * Copyright © 2015 Red Hat<br>
> > + *<br>
> > + * Permission is hereby granted, free of charge, to any person obtaining a<br>
> > + * copy of this software and associated documentation files (the "Software"),<br>
> > + * to deal in the Software without restriction, including without limitation<br>
> > + * the rights to use, copy, modify, merge, publish, distribute, sublicense,<br>
> > + * and/or sell copies of the Software, and to permit persons to whom the<br>
> > + * Software is furnished to do so, subject to the following conditions:<br>
> > + *<br>
> > + * The above copyright notice and this permission notice (including the next<br>
> > + * paragraph) shall be included in all copies or substantial portions of the<br>
> > + * Software.<br>
> > + *<br>
> > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR<br>
> > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,<br>
> > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL<br>
> > + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER<br>
> > + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING<br>
> > + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS<br>
> > + * IN THE SOFTWARE.<br>
> > + *<br>
> > + * Authors:<br>
> > + *    Rob Clark <<a href="mailto:robclark@freedesktop.org">robclark@freedesktop.org</a>><br>
> > + */<br>
> > +<br>
> > +#include "nir.h"<br>
> > +#include "nir_builder.h"<br>
> > +<br>
> > +#define MAX_CLIP_PLANES 8<br>
> > +<br>
> > +/* Generates the lowering code for user-clip-planes, generating CLIPDIST<br>
> > + * from UCP[n] + CLIPVERTEX or POSITION.  Additionally, an optional pass<br>
> > + * for fragment shaders to insert conditional kill's based on the inter-<br>
> > + * polated CLIPDIST<br>
> > + *<br>
> > + * NOTE: should be run after nir_lower_outputs_to_temporaries() (or at<br>
> > + * least in scenarios where you can count on each output written once<br>
> > + * and only once).<br>
> > + */<br>
> > +<br>
> > +<br>
> > +static nir_variable *<br>
> > +create_clipdist_var(nir_shader *shader, unsigned drvloc,<br>
> > +               bool output, gl_varying_slot slot)<br>
> > +{<br>
> > +       nir_variable *var = rzalloc(shader, nir_variable);<br>
> > +<br>
> > +       var->data.driver_location = drvloc;<br>
> > +       var->type = glsl_vec4_type();<br>
> > +       var->data.mode = output ? nir_var_shader_out : nir_var_shader_in;<br>
> > +       var->name = ralloc_asprintf(var, "clipdist_%d", drvloc);<br>
> > +       var->data.index = 0;<br>
> > +       var->data.location = slot;<br>
> > +<br>
> > +       if (output) {<br>
> > +               exec_list_push_tail(&shader->outputs, &var->node);<br>
> > +       } else {<br>
> > +               exec_list_push_tail(&shader->inputs, &var->node);<br>
> > +       }<br>
> > +       return var;<br>
> > +}<br>
> > +<br>
> > +static void<br>
> > +store_clipdist_output(nir_builder *b, nir_variable *out, nir_ssa_def **val)<br>
> > +{<br>
> > +       nir_intrinsic_instr *store;<br>
> > +<br>
> > +       store = nir_intrinsic_instr_create(b->shader, nir_intrinsic_store_output);<br>
> > +       store->num_components = 4;<br>
> > +       store->const_index[0] = out->data.driver_location;<br>
> > +       store->src[0].ssa = nir_vec4(b, val[0], val[1], val[2], val[3]);<br>
> > +       store->src[0].is_ssa = true;<br>
> > +       nir_builder_instr_insert(b, &store->instr);<br>
> > +}<br>
> > +<br>
> > +static void<br>
> > +load_clipdist_input(nir_builder *b, nir_variable *in, nir_ssa_def **val)<br>
> > +{<br>
> > +       nir_intrinsic_instr *load;<br>
> > +<br>
> > +       load = nir_intrinsic_instr_create(b->shader, nir_intrinsic_load_input);<br>
> > +       load->num_components = 4;<br>
> > +       load->const_index[0] = in->data.driver_location;<br>
> > +       nir_ssa_dest_init(&load->instr, &load->dest, 4, NULL);<br>
> > +       nir_builder_instr_insert(b, &load->instr);<br>
> > +<br>
> > +       val[0] = nir_channel(b, &load->dest.ssa, 0);<br>
> > +       val[1] = nir_channel(b, &load->dest.ssa, 1);<br>
> > +       val[2] = nir_channel(b, &load->dest.ssa, 2);<br>
> > +       val[3] = nir_channel(b, &load->dest.ssa, 3);<br>
> > +}<br>
> > +<br>
> > +struct find_output_state {<br>
> > +       unsigned drvloc;<br>
> > +       nir_ssa_def *def;<br>
> > +};<br>
> > +<br>
> > +static bool<br>
> > +find_output_in_block(nir_block *block, void *void_state)<br>
> > +{<br>
> > +       struct find_output_state *state = void_state;<br>
> > +       nir_foreach_instr(block, instr) {<br>
> > +<br>
> > +               if (instr->type == nir_instr_type_intrinsic) {<br>
> > +                       nir_intrinsic_instr *intr = nir_instr_as_intrinsic(instr);<br>
> > +                       if ((intr->intrinsic == nir_intrinsic_store_output) &&<br>
> > +                                       intr->const_index[0] == state->drvloc) {<br>
> > +                               assert(state->def == NULL);<br>
> > +                               assert(intr->src[0].is_ssa);<br>
> > +                               state->def = intr->src[0].ssa;<br>
> > +                       }<br>
> > +               }<br>
> > +<br>
> > +#if !defined(DEBUG)<br>
> > +               /* for debug builds, scan entire shader to assert<br>
> > +                * if output is written multiple times.  For release<br>
> > +                * builds just assume all is well and bail when we<br>
> > +                * find first:<br>
> > +                */<br>
> > +               if (state->def)<br>
> > +                       return false;<br>
> > +#endif<br>
> > +       }<br>
> > +<br>
> > +       return true;<br>
> > +}<br>
> > +<br>
> > +/* TODO: maybe this would be a useful helper?<br>
> > + * NOTE: assumes each output is written exactly once (and unconditionally)<br>
> > + * so if needed nir_lower_outputs_to_temporaries()<br>
> > + */<br>
> > +static nir_ssa_def *<br>
> > +find_output(nir_shader *shader, unsigned drvloc)<br>
> > +{<br>
> > +       struct find_output_state state = {<br>
> > +               .drvloc = drvloc,<br>
> > +       };<br>
> > +<br>
> > +       nir_foreach_overload(shader, overload) {<br>
> > +               if (overload->impl) {<br>
> > +                       nir_foreach_block_reverse(overload->impl,<br>
> > +                                       find_output_in_block, &state);<br>
> > +               }<br>
> > +       }<br>
> > +<br>
> > +       return state.def;<br>
> > +}<br>
> > +<br>
> > +/*<br>
> > + * VS lowering<br>
> > + */<br>
> > +<br>
> > +static void<br>
> > +lower_clip_vs(nir_function_impl *impl, unsigned ucp_enables,<br>
> > +               nir_ssa_def *cv, nir_variable **out)<br>
> > +{<br>
> > +       nir_ssa_def *clipdist[MAX_CLIP_PLANES];<br>
> > +       nir_builder b;<br>
> > +<br>
> > +       nir_builder_init(&b, impl);<br>
> > +<br>
> > +       /* Push original end_block to end of body.  Since all other<br>
> > +        * blocks flow into end_block, it is a convenient place to<br>
> > +        * insert our new clipdist calculations.  A new empty<br>
> > +        * end_block is created:<br>
> > +        *<br>
> > +        * NOTE: by default nir_validate is unhappy about formations<br>
> > +        * such as<br>
> > +        *<br>
> > +        *     block block_0:<br>
> > +        *     // preds:<br>
> > +        *     ...<br>
> > +        *     // succs: block_1<br>
> > +        *     block block_1:<br>
> > +        *     // preds: block_0<br>
> > +        *     ...<br>
> > +        *     // succs: block_2<br>
> > +        *     block block_2:<br>
> > +        *     ...<br>
> > +        */<br>
> > +       exec_list_push_tail(&impl->body, &impl->end_block->cf_node.node);<br>
> > +       nir_block *new_end_block = nir_block_create(b.shader);<br>
> > +       impl->end_block->successors[0] = new_end_block;<br>
> > +       _mesa_set_add(new_end_block->predecessors, impl->end_block);<br>
> > +       impl->end_block = new_end_block;<br>
> > +       b.cursor = nir_after_cf_list(&impl->body);<br>
> > +<br>
> > +       for (int plane = 0; plane < MAX_CLIP_PLANES; plane++) {<br>
> > +               if (ucp_enables & (1 << plane)) {<br>
> > +                       nir_intrinsic_instr *ucp;<br>
> > +<br>
> > +                       /* insert intrinsic to fetch ucp[plane]: */<br>
> > +                       ucp = nir_intrinsic_instr_create(b.shader,<br>
> > +                                       nir_intrinsic_load_user_clip_plane);<br>
> > +                       ucp->num_components = 4;<br>
> > +                       ucp->const_index[0] = plane;<br>
> > +                       nir_ssa_dest_init(&ucp->instr, &ucp->dest, 4, NULL);<br>
> > +                       nir_builder_instr_insert(&b, &ucp->instr);<br>
> > +<br>
> > +                       /* calculate clipdist[plane] - dot(ucp, cv): */<br>
> > +                       clipdist[plane] = nir_fdot3(&b, &ucp->dest.ssa, cv);<br>
> > +               } else {<br>
> > +                       /* 0.0 == don't-clip == disabled: */<br>
> > +                       clipdist[plane] = nir_imm_float(&b, 0.0);<br>
> > +               }<br>
> > +       }<br>
> > +<br>
> > +       if (ucp_enables & 0x0f)<br>
> > +               store_clipdist_output(&b, out[0], &clipdist[0]);<br>
> > +       if (ucp_enables & 0xf0)<br>
> > +               store_clipdist_output(&b, out[1], &clipdist[4]);<br>
> > +<br>
> > +       nir_metadata_preserve(impl, nir_metadata_dominance);<br>
> > +}<br>
> > +<br>
> > +/* ucp_enables is bitmask of enabled ucp's.  Actual ucp values are<br>
> > + * passed in to shader via user_clip_plane system-values<br>
> > + */<br>
> > +void<br>
> > +nir_lower_clip_vs(nir_shader *shader, unsigned ucp_enables)<br>
> > +{<br>
> > +       int clipvertex = -1;<br>
> > +       int position = -1;<br>
> > +       int maxloc = -1;<br>
> > +       nir_ssa_def *cv;<br>
> > +       nir_variable *out[2];<br>
> > +<br>
> > +       if (!ucp_enables)<br>
> > +               return;<br>
> > +<br>
> > +       /* find clipvertex/position outputs: */<br>
> > +       foreach_list_typed(nir_variable, var, node, &shader->outputs) {<br>
> > +               int loc = var->data.driver_location;<br>
> > +<br>
> > +               /* keep track of last used driver-location.. we'll be<br>
> > +                * appending CLIP_DIST0/CLIP_DIST1 after last existing<br>
> > +                * output:<br>
> > +                */<br>
> > +               maxloc = MAX2(maxloc, loc);<br>
> > +<br>
> > +               switch (var->data.location) {<br>
> > +               case VARYING_SLOT_POS:<br>
> > +                       position = loc;<br>
> > +                       break;<br>
> > +               case VARYING_SLOT_CLIP_VERTEX:<br>
> > +                       clipvertex = loc;<br>
> > +                       break;<br>
> > +               case VARYING_SLOT_CLIP_DIST0:<br>
> > +               case VARYING_SLOT_CLIP_DIST1:<br>
> > +                       /* if shader is already writing CLIPDIST, then<br>
> > +                        * there should be no user-clip-planes to deal<br>
> > +                        * with.<br>
> > +                        */<br>
> > +                       return;<br>
> > +               }<br>
> > +       }<br>
> > +<br>
> > +       if (clipvertex != -1)<br>
> > +               cv = find_output(shader, clipvertex);<br>
> > +       else if (position != -1)<br>
> > +               cv = find_output(shader, position);<br>
> > +       else<br>
> > +               return;<br>
> > +<br>
> > +       /* insert CLIPDIST outputs: */<br>
> > +       if (ucp_enables & 0x0f)<br>
> > +               out[0] = create_clipdist_var(shader, ++maxloc, true, VARYING_SLOT_CLIP_DIST0);<br>
> > +       if (ucp_enables & 0xf0)<br>
> > +               out[1] = create_clipdist_var(shader, ++maxloc, true, VARYING_SLOT_CLIP_DIST1);<br>
> > +<br>
> > +       nir_foreach_overload(shader, overload) {<br>
> > +               if (!strcmp(overload->function->name, "main"))<br>
> > +                       lower_clip_vs(overload->impl, ucp_enables, cv, out);<br>
> > +       }<br>
> > +}<br>
> > +<br>
> > +/*<br>
> > + * FS lowering<br>
> > + */<br>
> > +<br>
> > +static void<br>
> > +lower_clip_fs(nir_function_impl *impl, unsigned ucp_enables,<br>
> > +               nir_variable **in)<br>
> > +{<br>
> > +       nir_ssa_def *clipdist[MAX_CLIP_PLANES];<br>
> > +       nir_builder b;<br>
> > +<br>
> > +       nir_builder_init(&b, impl);<br>
> > +       b.cursor = nir_before_cf_list(&impl->body);<br>
> > +<br>
> > +       if (ucp_enables & 0x0f)<br>
> > +               load_clipdist_input(&b, in[0], &clipdist[0]);<br>
> > +       if (ucp_enables & 0xf0)<br>
> > +               load_clipdist_input(&b, in[1], &clipdist[4]);<br>
> > +<br>
> > +       for (int plane = 0; plane < MAX_CLIP_PLANES; plane++) {<br>
> > +               if (ucp_enables & (1 << plane)) {<br>
> > +                       nir_intrinsic_instr *discard;<br>
> > +                       nir_ssa_def *cond;<br>
> > +<br>
> > +                       cond = nir_flt(&b, clipdist[plane], nir_imm_float(&b, 0.0));<br>
> > +<br>
> > +                       discard = nir_intrinsic_instr_create(b.shader,<br>
> > +                                       nir_intrinsic_discard_if);<br>
> > +                       discard->src[0] = nir_src_for_ssa(cond);<br>
> > +                       nir_builder_instr_insert(&b, &discard->instr);<br>
> > +               }<br>
> > +       }<br>
> > +}<br>
> > +<br>
> > +/* insert conditional kill based on interpolated CLIPDIST<br>
> > + */<br>
> > +void<br>
> > +nir_lower_clip_fs(nir_shader *shader, unsigned ucp_enables)<br>
> > +{<br>
> > +       nir_variable *in[2];<br>
> > +       int maxloc = -1;<br>
> > +<br>
> > +       if (!ucp_enables)<br>
> > +               return;<br>
> > +<br>
> > +       foreach_list_typed(nir_variable, var, node, &shader->inputs) {<br>
> > +               int loc = var->data.driver_location;<br>
> > +<br>
> > +               /* keep track of last used driver-location.. we'll be<br>
> > +                * appending CLIP_DIST0/CLIP_DIST1 after last existing<br>
> > +                * input:<br>
> > +                */<br>
> > +               maxloc = MAX2(maxloc, loc);<br>
> > +       }<br>
> > +<br>
> > +       /* The shader won't normally have CLIPDIST inputs, so we<br>
> > +        * must add our own:<br>
> > +        */<br>
> > +       /* insert CLIPDIST outputs: */<br>
> > +       if (ucp_enables & 0x0f)<br>
> > +               in[0] = create_clipdist_var(shader, ++maxloc, false, VARYING_SLOT_CLIP_DIST0);<br>
> > +       if (ucp_enables & 0xf0)<br>
> > +               in[1] = create_clipdist_var(shader, ++maxloc, false, VARYING_SLOT_CLIP_DIST1);<br>
> > +<br>
> > +       nir_foreach_overload(shader, overload) {<br>
> > +               if (!strcmp(overload->function->name, "main"))<br>
> > +                       lower_clip_fs(overload->impl, ucp_enables, in);<br>
> > +       }<br>
> > +}<br>
> > diff --git a/src/glsl/nir/nir_validate.c b/src/glsl/nir/nir_validate.c<br>
> > index 9938c0e..f682575 100644<br>
> > --- a/src/glsl/nir/nir_validate.c<br>
> > +++ b/src/glsl/nir/nir_validate.c<br>
> > @@ -667,8 +667,7 @@ validate_block(nir_block *block, validate_state *state)<br>
> >                     nir_if_first_then_node(if_stmt));<br>
> >              assert(&block->successors[1]->cf_node ==<br>
> >                     nir_if_first_else_node(if_stmt));<br>
> > -         } else {<br>
> > -            assert(next->type == nir_cf_node_loop);<br>
> > +         } else if (next->type == nir_cf_node_loop) {<br>
> >              nir_loop *loop = nir_cf_node_as_loop(next);<br>
> >              assert(&block->successors[0]->cf_node ==<br>
> >                     nir_loop_first_cf_node(loop));<br>
> > --<br>
> > 2.4.3<br>
> ><br>
> > _______________________________________________<br>
> > mesa-dev mailing list<br>
> > <a href="mailto:mesa-dev@lists.freedesktop.org">mesa-dev@lists.freedesktop.org</a><br>
> > <a href="http://lists.freedesktop.org/mailman/listinfo/mesa-dev">http://lists.freedesktop.org/mailman/listinfo/mesa-dev</a><br>
</p>