[Mesa-dev] [PATCH] glsl: Lower constant arrays to uniform arrays.

Ian Romanick idr at freedesktop.org
Wed Oct 15 18:50:49 PDT 2014


I like the idea (when you suggested it in April), but I think this
implementation has some issues.  I believe making the constant array
into a uniform will cause it to be shown by the GL API (e.g., returned
by glGetActiveUniform) and counted against uniform limits.  Since I
doubt we have any piglit tests that explicitly check this, it's hard to
say with 100% certainty.

We already have some mechanism that prevents built-in uniforms from
being advertised by glGetActiveUniform, so that part shouldn't be too
hard to fix.  I'm not sure whether this should be counted against
uniform limits or not.

Also... if a shader has a mix of constant and non-constant indexing of
the array, will the constant indexed accesses still get constant propagated?

On 10/15/2014 05:32 PM, Kenneth Graunke wrote:
> Consider GLSL code such as:
> 
>    const ivec2 offsets[] =
>       ivec2[](ivec2(-1, -1), ivec2(-1, 0), ivec2(-1, 1),
>               ivec2(0, -1),  ivec2(0, 0),  ivec2(0, 1),
>               ivec2(1, -1),  ivec2(1, 0),  ivec2(1, 1));
> 
>    ivec2 offset = offsets[<non-constant expression>];
> 
> Both i965 and nv50 currently handle this very poorly.  On i965, this
> becomes a pile of MOVs to load the immediate constants into registers,
> a pile of scratch writes to move the whole array to memory, and one
> scratch read to actually access the value - effectively the same as if
> it were a non-constant array.
> 
> We'd much rather upload large blocks of constant data as uniform data,
> so drivers can simply upload the data via constbufs, and not have to
> populate it via shader instructions.
> 
> This is currently non-optional because both i965 and nouveau benefit
> from it - we can revisit that if another driver actually benefits.
> 
> Improves performance in a terrain rendering microbenchmark by about 2x,
> and cuts the number of instructions in about half.  Helps a lot of
> "Natural Selection 2" shaders, as well as one "HOARD" shader.
> 
> total instructions in shared programs: 5473459 -> 5471765 (-0.03%)
> instructions in affected programs:     5880 -> 4186 (-28.81%)
> 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77957
> Signed-off-by: Kenneth Graunke <kenneth at whitecape.org>
> 
> Reviewers: suggestions for better max_array_access handling are welcome.
> Reviewers: This changes the const-ness of the expression.  Is that a problem?

Maybe... but I think this happens late enough that it doesn't matter.
We'd need some tests like:

const int a[] = int[](-1, -1, 2, -1);
uniform vec4 b[a[2]];  // size must be constant expr.
const int c = a[3];    // initializer must be constant expr.

and other cases where only constant expressions can be used.

> ---
>  src/glsl/Makefile.sources                   |   1 +
>  src/glsl/ir_optimization.h                  |   1 +
>  src/glsl/linker.cpp                         |   2 +
>  src/glsl/lower_const_arrays_to_uniforms.cpp | 101 ++++++++++++++++++++++++++++
>  4 files changed, 105 insertions(+)
>  create mode 100644 src/glsl/lower_const_arrays_to_uniforms.cpp
> 
> I've had this patch sitting around since April, and been pondering whether
> we should improve it somehow.  But...it helps certain shaders a ton, and I
> haven't seen anything hurt by it.  So I'm wondering if we should just land
> it; we can always improve things later.
> 
> diff --git a/src/glsl/Makefile.sources b/src/glsl/Makefile.sources
> index 0c55327..6aed52d 100644
> --- a/src/glsl/Makefile.sources
> +++ b/src/glsl/Makefile.sources
> @@ -58,6 +58,7 @@ LIBGLSL_FILES = \
>  	$(GLSL_SRCDIR)/loop_analysis.cpp \
>  	$(GLSL_SRCDIR)/loop_controls.cpp \
>  	$(GLSL_SRCDIR)/loop_unroll.cpp \
> +	$(GLSL_SRCDIR)/lower_const_arrays_to_uniforms.cpp \
>  	$(GLSL_SRCDIR)/lower_clip_distance.cpp \
>  	$(GLSL_SRCDIR)/lower_discard.cpp \
>  	$(GLSL_SRCDIR)/lower_discard_flow.cpp \
> diff --git a/src/glsl/ir_optimization.h b/src/glsl/ir_optimization.h
> index e25857a..34e0b4b 100644
> --- a/src/glsl/ir_optimization.h
> +++ b/src/glsl/ir_optimization.h
> @@ -114,6 +114,7 @@ bool lower_noise(exec_list *instructions);
>  bool lower_variable_index_to_cond_assign(exec_list *instructions,
>      bool lower_input, bool lower_output, bool lower_temp, bool lower_uniform);
>  bool lower_quadop_vector(exec_list *instructions, bool dont_lower_swz);
> +bool lower_const_arrays_to_uniforms(exec_list *instructions);
>  bool lower_clip_distance(gl_shader *shader);
>  void lower_output_reads(exec_list *instructions);
>  bool lower_packing_builtins(exec_list *instructions, int op_mask);
> diff --git a/src/glsl/linker.cpp b/src/glsl/linker.cpp
> index 47a722d..2a69b78 100644
> --- a/src/glsl/linker.cpp
> +++ b/src/glsl/linker.cpp
> @@ -2692,6 +2692,8 @@ link_shaders(struct gl_context *ctx, struct gl_shader_program *prog)
>                                      &ctx->Const.ShaderCompilerOptions[i],
>                                      ctx->Const.NativeIntegers))
>  	 ;
> +
> +      lower_const_arrays_to_uniforms(prog->_LinkedShaders[i]->ir);
>     }
>  
>     /* Check and validate stream emissions in geometry shaders */
> diff --git a/src/glsl/lower_const_arrays_to_uniforms.cpp b/src/glsl/lower_const_arrays_to_uniforms.cpp
> new file mode 100644
> index 0000000..2085086
> --- /dev/null
> +++ b/src/glsl/lower_const_arrays_to_uniforms.cpp
> @@ -0,0 +1,101 @@
> +/*
> + * Copyright © 2014 Intel Corporation
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the "Software"),
> + * to deal in the Software without restriction, including without limitation
> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice (including the next
> + * paragraph) shall be included in all copies or substantial portions of the
> + * Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
> + * DEALINGS IN THE SOFTWARE.
> + */
> +
> +/**
> + * \file lower_const_arrays_to_uniforms.cpp
> + *
> + * Lower constant arrays to uniform arrays.
> + *
> + * Some driver backends (such as i965 and nouveau) don't handle constant arrays
> + * gracefully, instead treating them as ordinary writable temporary arrays.
> + * Since arrays can be large, this often means spilling them to scratch memory,
> + * which usually involves a large number of instructions.
> + *
> + * This must be called prior to link_set_uniform_initializers(); we need the
> + * linker to process our new uniform's constant initializer.
> + *
> + * This should be called after optimizations, since those can result in
> + * splitting and removing arrays that are indexed by constant expressions.
> + */
> +#include "ir.h"
> +#include "ir_visitor.h"
> +#include "ir_rvalue_visitor.h"
> +#include "glsl_types.h"
> +
> +namespace {
> +class lower_const_array_visitor : public ir_rvalue_visitor {
> +public:
> +   lower_const_array_visitor(exec_list *insts)
> +   {
> +      instructions = insts;
> +      progress = false;
> +   }
> +
> +   bool run()
> +   {
> +      visit_list_elements(this, instructions);
> +      return progress;
> +   }
> +
> +   void handle_rvalue(ir_rvalue **rvalue);
> +
> +private:
> +   exec_list *instructions;
> +   bool progress;
> +};
> +
> +void
> +lower_const_array_visitor::handle_rvalue(ir_rvalue **rvalue)
> +{
> +   if (!*rvalue)
> +      return;
> +
> +   ir_constant *con = (*rvalue)->as_constant();
> +   if (!con || !con->type->is_array())
> +      return;
> +
> +   void *mem_ctx = ralloc_parent(con);
> +
> +   ir_variable *uni =
> +      new(mem_ctx) ir_variable(con->type, "constarray", ir_var_uniform);
> +   uni->constant_initializer = con;
> +   uni->constant_value = con;
> +   uni->data.has_initializer = true;
> +   uni->data.read_only = true;
> +   /* Assume the whole thing is accessed. */
> +   uni->data.max_array_access = uni->type->length - 1;

I'm not sure this is necessary.  If the whole thing isn't accessed, we
should trim it.  If it's accessed with a non-constant index,
max_array_access should get properly set anyway.  Since the length
method has already been processed, we shouldn't have to worry about that.

> +   instructions->push_head(uni);
> +
> +   *rvalue = new(mem_ctx) ir_dereference_variable(uni);
> +
> +   progress = true;
> +}
> +
> +} /* anonymous namespace */
> +
> +bool
> +lower_const_arrays_to_uniforms(exec_list *instructions)
> +{
> +   lower_const_array_visitor v(instructions);
> +   return v.run();
> +}
> 



More information about the mesa-dev mailing list