[Mesa-dev] [PATCH 3/5] nir: Add new GS intrinsics that maintain a count of emitted vertices.

Jason Ekstrand jason at jlekstrand.net
Tue Sep 8 11:12:39 PDT 2015


On Tue, Sep 8, 2015 at 2:24 AM, Kenneth Graunke <kenneth at whitecape.org> wrote:
> On Monday, September 07, 2015 10:51:37 AM Jason Ekstrand wrote:
>> On Sep 3, 2015 1:49 AM, "Kenneth Graunke" <kenneth at whitecape.org> wrote:
>> >
>> > This patch also introduces a lowering pass to convert the simple GS
>> > intrinsics to the new ones.  See the comments above that for the
>> > rationale behind the new intrinsics.
>> >
>> > This should be useful for i965; it's a generic enough mechanism that I
>> > could see other drivers potentially using it as well, so I don't feel
>> > too bad about putting it in the generic code.
>> >
>> > Signed-off-by: Kenneth Graunke <kenneth at whitecape.org>
>> > ---
>> >  src/glsl/Makefile.sources              |   1 +
>> >  src/glsl/nir/nir.h                     |   2 +
>> >  src/glsl/nir/nir_intrinsics.h          |  21 ++++
>> >  src/glsl/nir/nir_lower_gs_intrinsics.c | 214
>> +++++++++++++++++++++++++++++++++
>> >  4 files changed, 238 insertions(+)
>> >  create mode 100644 src/glsl/nir/nir_lower_gs_intrinsics.c
>> >
>> > diff --git a/src/glsl/Makefile.sources b/src/glsl/Makefile.sources
>> > index c422303..ae2e3e1 100644
>> > --- a/src/glsl/Makefile.sources
>> > +++ b/src/glsl/Makefile.sources
>> > @@ -36,6 +36,7 @@ NIR_FILES = \
>> >         nir/nir_lower_alu_to_scalar.c \
>> >         nir/nir_lower_atomics.c \
>> >         nir/nir_lower_global_vars_to_local.c \
>> > +       nir/nir_lower_gs_intrinsics.c \
>> >         nir/nir_lower_load_const_to_scalar.c \
>> >         nir/nir_lower_locals_to_regs.c \
>> >         nir/nir_lower_idiv.c \
>> > diff --git a/src/glsl/nir/nir.h b/src/glsl/nir/nir.h
>> > index 570a33c..13f123f 100644
>> > --- a/src/glsl/nir/nir.h
>> > +++ b/src/glsl/nir/nir.h
>> > @@ -1809,6 +1809,8 @@ void nir_lower_idiv(nir_shader *shader);
>> >  void nir_lower_atomics(nir_shader *shader);
>> >  void nir_lower_to_source_mods(nir_shader *shader);
>> >
>> > +void nir_lower_gs_intrinsics(nir_shader *shader);
>> > +
>> >  void nir_normalize_cubemap_coords(nir_shader *shader);
>> >
>> >  void nir_live_variables_impl(nir_function_impl *impl);
>> > diff --git a/src/glsl/nir/nir_intrinsics.h b/src/glsl/nir/nir_intrinsics.h
>> > index ed309b6..bcfe70d 100644
>> > --- a/src/glsl/nir/nir_intrinsics.h
>> > +++ b/src/glsl/nir/nir_intrinsics.h
>> > @@ -79,9 +79,30 @@ BARRIER(memory_barrier)
>> >  /** A conditional discard, with a single boolean source. */
>> >  INTRINSIC(discard_if, 1, ARR(1), false, 0, 0, 0, 0)
>> >
>> > +/**
>> > + * Basic Geometry Shader intrinsics.
>> > + *
>> > + * emit_vertex implements GLSL's EmitStreamVertex() built-in.  It takes
>> a single
>> > + * index, which is the stream ID to write to.
>> > + *
>> > + * end_primitive implements GLSL's EndPrimitive() built-in.
>> > + */
>> >  INTRINSIC(emit_vertex,   0, ARR(), false, 0, 0, 1, 0)
>> >  INTRINSIC(end_primitive, 0, ARR(), false, 0, 0, 1, 0)
>> >
>> > +/**
>> > + * Geometry Shader intrinsics with a vertex count.
>> > + *
>> > + * Alternatively, drivers may implement these intrinsics, and use
>> > + * nir_lower_gs_intrinsics() to convert from the basic intrinsics.
>> > + *
>> > + * These maintain a count of the number of vertices emitted, as an
>> additional
>> > + * unsigned integer source.
>> > + */
>> > +INTRINSIC(emit_vertex_with_counter, 1, ARR(1), false, 0, 0, 1, 0)
>> > +INTRINSIC(end_primitive_with_counter, 1, ARR(1), false, 0, 0, 1, 0)
>> > +INTRINSIC(set_vertex_count, 1, ARR(1), false, 0, 0, 0, 0)
>> > +
>> >  /*
>> >   * Atomic counters
>> >   *
>> > diff --git a/src/glsl/nir/nir_lower_gs_intrinsics.c
>> b/src/glsl/nir/nir_lower_gs_intrinsics.c
>> > new file mode 100644
>> > index 0000000..b866d87
>> > --- /dev/null
>> > +++ b/src/glsl/nir/nir_lower_gs_intrinsics.c
>> > @@ -0,0 +1,214 @@
>> > +/*
>> > + * Copyright © 2015 Intel Corporation
>> > + *
>> > + * Permission is hereby granted, free of charge, to any person obtaining
>> a
>> > + * copy of this software and associated documentation files (the
>> "Software"),
>> > + * to deal in the Software without restriction, including without
>> limitation
>> > + * the rights to use, copy, modify, merge, publish, distribute,
>> sublicense,
>> > + * and/or sell copies of the Software, and to permit persons to whom the
>> > + * Software is furnished to do so, subject to the following conditions:
>> > + *
>> > + * The above copyright notice and this permission notice (including the
>> next
>> > + * paragraph) shall be included in all copies or substantial portions of
>> the
>> > + * Software.
>> > + *
>> > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
>> EXPRESS OR
>> > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
>> MERCHANTABILITY,
>> > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT
>> SHALL
>> > + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR
>> OTHER
>> > + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
>> ARISING
>> > + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
>> DEALINGS
>> > + * IN THE SOFTWARE.
>> > + *
>> > + * Authors:
>> > + *    Kenneth Graunke <kenneth at whitecape.org>
>> > + */
>> > +
>> > +#include "nir.h"
>> > +#include "nir_builder.h"
>> > +
>> > +/**
>> > + * \file nir_lower_gs_intrinsics.c
>> > + *
>> > + * Geometry Shaders can call EmitVertex()/EmitStreamVertex() to output an
>> > + * arbitrary number of vertices.  However, the shader must declare the
>> maximum
>> > + * number of vertices that it will ever output - further attempts to emit
>> > + * vertices result in undefined behavior according to the GLSL
>> specification.
>> > + *
>> > + * Drivers might use this maximum number of vertices to allocate enough
>> space
>> > + * to hold the geometry shader's output.  Some drivers (such as i965)
>> need to
>> > + * implement "safety checks" which ensure that the shader hasn't emitted
>> too
>> > + * many vertices, to avoid overflowing that space and trashing other
>> memory.
>> > + *
>> > + * The count of emitted vertices can also be useful in buffer offset
>> > + * calculations, so drivers know where to write the GS output.
>> > + *
>> > + * However, for simple geometry shaders that emit a statically
>> determinable
>> > + * number of vertices, this extra bookkeeping is unnecessary and
>> inefficient.
>> > + * By tracking the vertex count in NIR, we allow constant
>> folding/propagation
>> > + * and dead control flow optimizations to eliminate most of it where
>> possible.
>> > + *
>> > + * This pass introduces a new global variable which stores the current
>> vertex
>> > + * count (initialized to 0), and converts emit_vertex/end_primitive
>> intrinsics
>> > + * to their *_with_counter variants.  emit_vertex is also wrapped in a
>> safety
>> > + * check to avoid buffer overflows.  Finally, it adds a set_vertex_count
>> > + * intrinsic at the end of the program, informing the driver of the final
>> > + * vertex count.
>> > + */
>> > +
>> > +struct state {
>> > +   nir_builder *builder;
>> > +   nir_variable *vertex_count;
>> > +};
>> > +
>> > +/**
>> > + * Replace emit_vertex intrinsics with:
>> > + *
>> > + * if (vertex_count < max_vertices) {
>> > + *    emit_vertex_with_counter vertex_count ...
>> > + *    vertex_count += 1
>> > + * }
>> > + */
>> > +static void
>> > +rewrite_emit_vertex(nir_intrinsic_instr *intrin, struct state *state)
>>
>> I'm inclined to rename "intrin" to "emit". It may be easier to track with
>> the lowering code.
>
> Why "emit"?  Using a verb doesn't make much sense to me...and it's not
> even something we emit.  orig_intrin might be clearer...

Just because it's the emit instruction as opposed to an add or
something.  Given that we're lowering that instruction, instr/intrin
and lowered works fine too.  I don't care too much as long as the
aliases are gone.

>>
>> > +{
>> > +   nir_builder *b = state->builder;
>> > +   nir_variable *vertex_count_var = state->vertex_count;
>> > +   nir_instr *instr = &intrin->instr;
>>
>> Please get rid of these last two aliases ("b" is fine). They don't make
>> things much shorter and, IMHO, they actually make readability worse because
>> there are multiple names for things.
>
> Sure.
>
>> > +
>> > +   /* Load the vertex count */
>> > +   b->cursor = nir_before_instr(instr);
>> > +   nir_ssa_def *count = nir_load_var(b, vertex_count_var);
>> > +
>> > +   nir_ssa_def *max_vertices = nir_imm_int(b,
>> b->shader->gs.vertices_out);
>> > +
>> > +   /* Create: if (vertex_count < max_vertices) */
>> > +   nir_if *if_stmt = nir_if_create(b->shader);
>> > +   if_stmt->condition = nir_src_for_ssa(nir_ilt(b, count, max_vertices));
>> > +
>> > +   /* Replace the instruction.  The new if statement needs to be hooked
>> up
>> > +    * to the control flow graph before we start inserting instructions
>> in it.
>> > +    */
>> > +   nir_builder_cf_insert(b, &if_stmt->cf_node);
>> > +   nir_instr_remove(instr);
>>
>> I think we usually delete the instruction at the end.  Not that it matters
>> but it does make it more clear that you don't have dangling cursor problems.
>
> Sure, that seems reasonable.
>
>> > +
>> > +   /* Fill out the new then-block */
>> > +   b->cursor = nir_after_cf_list(&if_stmt->then_list);
>> > +
>> > +   nir_intrinsic_instr *lowered =
>> > +      nir_intrinsic_instr_create(b->shader,
>> > +                                 nir_intrinsic_emit_vertex_with_counter);
>> > +   lowered->const_index[0] = intrin->const_index[0];
>> > +   lowered->src[0] = nir_src_for_ssa(count);
>> > +   nir_builder_instr_insert(b, &lowered->instr);
>> > +
>> > +   /* Increment the vertex count by 1 */
>> > +   nir_store_var(b, vertex_count_var, nir_iadd(b, count, nir_imm_int(b,
>> 1)));
>> > +}
>> > +
>> > +/**
>> > + * Replace end_primitive with end_primitive_with_counter.
>> > + */
>> > +static void
>> > +rewrite_end_primitive(nir_intrinsic_instr *intrin, struct state *state)
>> > +{
>> > +   nir_builder *b = state->builder;
>> > +
>> > +   b->cursor = nir_before_instr(&intrin->instr);
>> > +   nir_ssa_def *count = nir_load_var(b, state->vertex_count);
>> > +
>> > +   nir_intrinsic_instr *lowered =
>> > +      nir_intrinsic_instr_create(b->shader,
>> > +
>>  nir_intrinsic_end_primitive_with_counter);
>> > +   lowered->const_index[0] = intrin->const_index[0];
>> > +   lowered->src[0] = nir_src_for_ssa(count);
>> > +   nir_builder_instr_insert(b, &lowered->instr);
>> > +
>> > +   nir_instr_remove(&intrin->instr);
>> > +}
>> > +
>> > +static bool
>> > +rewrite_intrinsics(nir_block *block, void *closure)
>> > +{
>> > +   struct state *state = closure;
>> > +
>> > +   nir_foreach_instr_safe(block, instr) {
>> > +      if (instr->type != nir_instr_type_intrinsic)
>> > +         continue;
>> > +
>> > +      nir_intrinsic_instr *intrin = nir_instr_as_intrinsic(instr);
>> > +      switch (intrin->intrinsic) {
>> > +      case nir_intrinsic_emit_vertex:
>> > +         rewrite_emit_vertex(intrin, state);
>> > +         break;
>> > +      case nir_intrinsic_end_primitive:
>> > +         rewrite_end_primitive(intrin, state);
>> > +         break;
>> > +      default:
>> > +         /* not interesting; skip this */
>> > +         break;
>> > +      }
>> > +   }
>> > +
>> > +   return true;
>> > +}
>> > +
>> > +/**
>> > + * Add a set_vertex_count intrinsic at the end of the program
>> > + * (representing the final vertex count).
>> > + */
>> > +static void
>> > +append_set_vertex_count(nir_block *end_block, struct state *state)
>> > +{
>> > +   nir_variable *vertex_count_var = state->vertex_count;
>> > +   nir_builder *b = state->builder;
>> > +   nir_shader *shader = state->builder->shader;
>> > +
>> > +   /* Insert the new intrinsic in all of the predecessors of the end
>> block,
>> > +    * but before any jump instructions (return).
>> > +    */
>> > +   struct set_entry *entry;
>> > +   set_foreach(end_block->predecessors, entry) {
>> > +      nir_block *pred = (nir_block *) entry->key;
>> > +      b->cursor = nir_after_block(pred);
>>
>> Technically, this should be after_block_before_jump().  We should probably
>> just land the patch that adds that.  I'll send them out.
>
> Ah, thanks!  Long ago I was using a builder function which accomplished
> that, but in the new cursor era I messed that up.  I'll be happy to use
> your new function.
>
>>
>> > +
>> > +      nir_ssa_def *count = nir_load_var(b, vertex_count_var);
>> > +
>> > +      nir_intrinsic_instr *set_vertex_count =
>> > +         nir_intrinsic_instr_create(shader,
>> nir_intrinsic_set_vertex_count);
>> > +      set_vertex_count->src[0] = nir_src_for_ssa(count);
>> > +
>> > +      nir_builder_instr_insert(b, &set_vertex_count->instr);
>> > +   }
>> > +}
>> > +
>> > +void
>> > +nir_lower_gs_intrinsics(nir_shader *shader)
>> > +{
>> > +   struct state state;
>> > +
>> > +   /* Create the counter variable */
>> > +   nir_variable *var = rzalloc(shader, nir_variable);
>> > +   var->data.mode = nir_var_global;
>> > +   var->type = glsl_uint_type();
>> > +   var->name = "vertex_count";
>> > +   var->constant_initializer = rzalloc(shader, nir_constant); /*
>> initialize to 0 */
>> > +
>> > +   exec_list_push_tail(&shader->globals, &var->node);
>> > +   state.vertex_count = var;
>> > +
>> > +   nir_foreach_overload(shader, overload) {
>> > +      if (overload->impl) {
>> > +         nir_builder b;
>> > +         nir_builder_init(&b, overload->impl);
>> > +         state.builder = &b;
>> > +
>> > +         nir_foreach_block(overload->impl, rewrite_intrinsics, &state);
>> > +
>> > +         /* This only works because we have a single main() function. */
>>
>> Then why don't you predicate it on name == "main" or something?  Really, we
>> should probably just add an is_entrypoint bool to nir_function_impl and
>> switch on that.  Another option would be to make this take a function impl
>> instead of a shader and trust in function inlining (maybe even assert no
>> function calls).
>
> I don't know.  We really only have the pretense of functions today, but
> they're not an actual thing.  There will only ever be one, so it seems
> a bit pointless to check for main or add flags indicating the one and
> only thing is the thing we want...
>
> Honestly, I've wondered if we should just delete functions from NIR
> altogether, though I suppose if we had code that arrived through some
> means other than GLSL IR, or moved toward NIR being a thing before final
> linking is finished...then they could be useful.

Yeah, we should keep them.  However, there are a lot of passes that
should probably take a nir_function_impl rather than a nir_shader and
the backend should probably work more-or-less in terms of a
nir_function_impl.  This is going to take a bit more thinking to be
honest.


More information about the mesa-dev mailing list