[Mesa-dev] [PATCH] glsl: Lower constant arrays to uniform arrays.

Marek Olšák maraeo at gmail.com
Thu Oct 30 05:47:42 PDT 2014


I would think that the LLVM backend should be able recognize that it's
spilling an array of literals.

Marek

On Thu, Oct 30, 2014 at 1:43 PM, Marek Olšák <maraeo at gmail.com> wrote:
> Hi Tom,
>
> Could you please elaborate on how we should change the TGSI->LLVM translation?
>
> Marek
>
> On Fri, Oct 17, 2014 at 4:49 PM, Tom Stellard <tom at stellard.net> wrote:
>> On Wed, Oct 15, 2014 at 05:32:11PM -0700, Kenneth Graunke wrote:
>>> Consider GLSL code such as:
>>>
>>>    const ivec2 offsets[] =
>>>       ivec2[](ivec2(-1, -1), ivec2(-1, 0), ivec2(-1, 1),
>>>               ivec2(0, -1),  ivec2(0, 0),  ivec2(0, 1),
>>>               ivec2(1, -1),  ivec2(1, 0),  ivec2(1, 1));
>>>
>>>    ivec2 offset = offsets[<non-constant expression>];
>>>
>>
>> I don't think we necessarily want this pass for radeonsi.  We produce
>> efficient code for these kinds of arrays in OpenCL, I think we just need to
>> change how we translate TGSI to LLVM.
>>
>> -Tom
>>
>>> Both i965 and nv50 currently handle this very poorly.  On i965, this
>>> becomes a pile of MOVs to load the immediate constants into registers,
>>> a pile of scratch writes to move the whole array to memory, and one
>>> scratch read to actually access the value - effectively the same as if
>>> it were a non-constant array.
>>>
>>> We'd much rather upload large blocks of constant data as uniform data,
>>> so drivers can simply upload the data via constbufs, and not have to
>>> populate it via shader instructions.
>>>
>>> This is currently non-optional because both i965 and nouveau benefit
>>> from it - we can revisit that if another driver actually benefits.
>>>
>>> Improves performance in a terrain rendering microbenchmark by about 2x,
>>> and cuts the number of instructions in about half.  Helps a lot of
>>> "Natural Selection 2" shaders, as well as one "HOARD" shader.
>>>
>>> total instructions in shared programs: 5473459 -> 5471765 (-0.03%)
>>> instructions in affected programs:     5880 -> 4186 (-28.81%)
>>>
>>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77957
>>> Signed-off-by: Kenneth Graunke <kenneth at whitecape.org>
>>>
>>> Reviewers: suggestions for better max_array_access handling are welcome.
>>> Reviewers: This changes the const-ness of the expression.  Is that a problem?
>>> ---
>>>  src/glsl/Makefile.sources                   |   1 +
>>>  src/glsl/ir_optimization.h                  |   1 +
>>>  src/glsl/linker.cpp                         |   2 +
>>>  src/glsl/lower_const_arrays_to_uniforms.cpp | 101 ++++++++++++++++++++++++++++
>>>  4 files changed, 105 insertions(+)
>>>  create mode 100644 src/glsl/lower_const_arrays_to_uniforms.cpp
>>>
>>> I've had this patch sitting around since April, and been pondering whether
>>> we should improve it somehow.  But...it helps certain shaders a ton, and I
>>> haven't seen anything hurt by it.  So I'm wondering if we should just land
>>> it; we can always improve things later.
>>>
>>> diff --git a/src/glsl/Makefile.sources b/src/glsl/Makefile.sources
>>> index 0c55327..6aed52d 100644
>>> --- a/src/glsl/Makefile.sources
>>> +++ b/src/glsl/Makefile.sources
>>> @@ -58,6 +58,7 @@ LIBGLSL_FILES = \
>>>       $(GLSL_SRCDIR)/loop_analysis.cpp \
>>>       $(GLSL_SRCDIR)/loop_controls.cpp \
>>>       $(GLSL_SRCDIR)/loop_unroll.cpp \
>>> +     $(GLSL_SRCDIR)/lower_const_arrays_to_uniforms.cpp \
>>>       $(GLSL_SRCDIR)/lower_clip_distance.cpp \
>>>       $(GLSL_SRCDIR)/lower_discard.cpp \
>>>       $(GLSL_SRCDIR)/lower_discard_flow.cpp \
>>> diff --git a/src/glsl/ir_optimization.h b/src/glsl/ir_optimization.h
>>> index e25857a..34e0b4b 100644
>>> --- a/src/glsl/ir_optimization.h
>>> +++ b/src/glsl/ir_optimization.h
>>> @@ -114,6 +114,7 @@ bool lower_noise(exec_list *instructions);
>>>  bool lower_variable_index_to_cond_assign(exec_list *instructions,
>>>      bool lower_input, bool lower_output, bool lower_temp, bool lower_uniform);
>>>  bool lower_quadop_vector(exec_list *instructions, bool dont_lower_swz);
>>> +bool lower_const_arrays_to_uniforms(exec_list *instructions);
>>>  bool lower_clip_distance(gl_shader *shader);
>>>  void lower_output_reads(exec_list *instructions);
>>>  bool lower_packing_builtins(exec_list *instructions, int op_mask);
>>> diff --git a/src/glsl/linker.cpp b/src/glsl/linker.cpp
>>> index 47a722d..2a69b78 100644
>>> --- a/src/glsl/linker.cpp
>>> +++ b/src/glsl/linker.cpp
>>> @@ -2692,6 +2692,8 @@ link_shaders(struct gl_context *ctx, struct gl_shader_program *prog)
>>>                                      &ctx->Const.ShaderCompilerOptions[i],
>>>                                      ctx->Const.NativeIntegers))
>>>        ;
>>> +
>>> +      lower_const_arrays_to_uniforms(prog->_LinkedShaders[i]->ir);
>>>     }
>>>
>>>     /* Check and validate stream emissions in geometry shaders */
>>> diff --git a/src/glsl/lower_const_arrays_to_uniforms.cpp b/src/glsl/lower_const_arrays_to_uniforms.cpp
>>> new file mode 100644
>>> index 0000000..2085086
>>> --- /dev/null
>>> +++ b/src/glsl/lower_const_arrays_to_uniforms.cpp
>>> @@ -0,0 +1,101 @@
>>> +/*
>>> + * Copyright © 2014 Intel Corporation
>>> + *
>>> + * Permission is hereby granted, free of charge, to any person obtaining a
>>> + * copy of this software and associated documentation files (the "Software"),
>>> + * to deal in the Software without restriction, including without limitation
>>> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
>>> + * and/or sell copies of the Software, and to permit persons to whom the
>>> + * Software is furnished to do so, subject to the following conditions:
>>> + *
>>> + * The above copyright notice and this permission notice (including the next
>>> + * paragraph) shall be included in all copies or substantial portions of the
>>> + * Software.
>>> + *
>>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
>>> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
>>> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
>>> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
>>> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
>>> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
>>> + * DEALINGS IN THE SOFTWARE.
>>> + */
>>> +
>>> +/**
>>> + * \file lower_const_arrays_to_uniforms.cpp
>>> + *
>>> + * Lower constant arrays to uniform arrays.
>>> + *
>>> + * Some driver backends (such as i965 and nouveau) don't handle constant arrays
>>> + * gracefully, instead treating them as ordinary writable temporary arrays.
>>> + * Since arrays can be large, this often means spilling them to scratch memory,
>>> + * which usually involves a large number of instructions.
>>> + *
>>> + * This must be called prior to link_set_uniform_initializers(); we need the
>>> + * linker to process our new uniform's constant initializer.
>>> + *
>>> + * This should be called after optimizations, since those can result in
>>> + * splitting and removing arrays that are indexed by constant expressions.
>>> + */
>>> +#include "ir.h"
>>> +#include "ir_visitor.h"
>>> +#include "ir_rvalue_visitor.h"
>>> +#include "glsl_types.h"
>>> +
>>> +namespace {
>>> +class lower_const_array_visitor : public ir_rvalue_visitor {
>>> +public:
>>> +   lower_const_array_visitor(exec_list *insts)
>>> +   {
>>> +      instructions = insts;
>>> +      progress = false;
>>> +   }
>>> +
>>> +   bool run()
>>> +   {
>>> +      visit_list_elements(this, instructions);
>>> +      return progress;
>>> +   }
>>> +
>>> +   void handle_rvalue(ir_rvalue **rvalue);
>>> +
>>> +private:
>>> +   exec_list *instructions;
>>> +   bool progress;
>>> +};
>>> +
>>> +void
>>> +lower_const_array_visitor::handle_rvalue(ir_rvalue **rvalue)
>>> +{
>>> +   if (!*rvalue)
>>> +      return;
>>> +
>>> +   ir_constant *con = (*rvalue)->as_constant();
>>> +   if (!con || !con->type->is_array())
>>> +      return;
>>> +
>>> +   void *mem_ctx = ralloc_parent(con);
>>> +
>>> +   ir_variable *uni =
>>> +      new(mem_ctx) ir_variable(con->type, "constarray", ir_var_uniform);
>>> +   uni->constant_initializer = con;
>>> +   uni->constant_value = con;
>>> +   uni->data.has_initializer = true;
>>> +   uni->data.read_only = true;
>>> +   /* Assume the whole thing is accessed. */
>>> +   uni->data.max_array_access = uni->type->length - 1;
>>> +   instructions->push_head(uni);
>>> +
>>> +   *rvalue = new(mem_ctx) ir_dereference_variable(uni);
>>> +
>>> +   progress = true;
>>> +}
>>> +
>>> +} /* anonymous namespace */
>>> +
>>> +bool
>>> +lower_const_arrays_to_uniforms(exec_list *instructions)
>>> +{
>>> +   lower_const_array_visitor v(instructions);
>>> +   return v.run();
>>> +}
>>> --
>>> 2.1.2
>>>
>>> _______________________________________________
>>> mesa-dev mailing list
>>> mesa-dev at lists.freedesktop.org
>>> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>> _______________________________________________
>> mesa-dev mailing list
>> mesa-dev at lists.freedesktop.org
>> http://lists.freedesktop.org/mailman/listinfo/mesa-dev


More information about the mesa-dev mailing list