[Mesa-dev] [PATCH v2 6/6] glsl: Ignore loop-too-large heuristic if there's bad variable indexing.
Kenneth Graunke
kenneth at whitecape.org
Wed Apr 9 08:31:51 PDT 2014
On 04/08/2014 09:20 PM, Kenneth Graunke wrote:
> Many shaders use a pattern such as:
>
> for (int i = 0; i < NUM_LIGHTS; i++) {
> ...access a uniform array, or shader input/output array...
> }
>
> where NUM_LIGHTS is a small constant (such as 2, 4, or 8).
>
> The expectation is that the compiler will unroll those loops, turning
> the array access into constant indexing, which is more efficient, and
> which may enable array splitting and other optimizations.
>
> In many cases, our heuristic fails - either there's another tiny nested
> loop inside, or the estimated number of instructions is just barely
> beyond the threshold. So, we fail to unroll the loop, leaving the
> variable indexing in place.
>
> Drivers which don't support the particular flavor of variable indexing
> will call lower_variable_index_to_cond_assign(), which generates piles
> and piles of immensely inefficient code. We'd like to avoid generating
> that.
>
> This patch detects unsupported forms of variable-indexing in loops, where
> the array index is a loop induction variable. In that case, it bypasses
> the loop-too-large heuristic and forces unrolling.
>
> Improves performance in a PCF soft-shadow microbenchmark by 2x.
Sorry...this number is incorrect. It improves performance by 21%.
The 2x figure was due to a bug in the older version where I literally
unrolled everything, which also got rid of variable-indexing of uniform
arrays. (The program still worked even with the buggy patch...we just
made different unrolling decisions. So, we should still be able to
attain 2x...just not with this patch alone.)
> No changes in shader-db.
>
> v2: Check ir->array for being an array or matrix, rather than the
> ir_dereference_array itself.
>
> Signed-off-by: Kenneth Graunke <kenneth at whitecape.org>
> ---
> src/glsl/loop_unroll.cpp | 61 +++++++++++++++++++++++++++++++++++++++++++++---
> 1 file changed, 58 insertions(+), 3 deletions(-)
>
> v1 of 6/6 had several bugs, which ended up cancelling out in many cases
> and making things look like they were working. I think this one is
> actually good. Sorry for the noise...
>
> diff --git a/src/glsl/loop_unroll.cpp b/src/glsl/loop_unroll.cpp
> index 1ce4d58..da53280 100644
> --- a/src/glsl/loop_unroll.cpp
> +++ b/src/glsl/loop_unroll.cpp
> @@ -63,13 +63,17 @@ is_break(ir_instruction *ir)
> class loop_unroll_count : public ir_hierarchical_visitor {
> public:
> int nodes;
> + bool unsupported_variable_indexing;
> /* If there are nested loops, the node count will be inaccurate. */
> bool nested_loop;
>
> - loop_unroll_count(exec_list *list)
> + loop_unroll_count(exec_list *list, loop_variable_state *ls,
> + const struct gl_shader_compiler_options *options)
> + : ls(ls), options(options)
> {
> nodes = 0;
> nested_loop = false;
> + unsupported_variable_indexing = false;
>
> run(list);
> }
> @@ -91,6 +95,54 @@ public:
> nested_loop = true;
> return visit_continue;
> }
> +
> + virtual ir_visitor_status visit_enter(ir_dereference_array *ir)
> + {
> + /* Check for arrays variably-indexed by a loop induction variable.
> + * Unrolling the loop may convert that access into constant-indexing.
> + *
> + * Many drivers don't support particular kinds of variable indexing,
> + * and have to resort to using lower_variable_index_to_cond_assign to
> + * handle it. This results in huge amounts of horrible code, so we'd
> + * like to avoid that if possible. Here, we just note that it will
> + * happen.
> + */
> + if ((ir->array->type->is_array() || ir->array->type->is_matrix()) &&
> + !ir->array_index->as_constant()) {
> + ir_variable *array = ir->array->variable_referenced();
> + loop_variable *lv = ls->get(ir->array_index->variable_referenced());
> + if (array && lv && lv->is_induction_var()) {
> + switch (array->data.mode) {
> + case ir_var_auto:
> + case ir_var_temporary:
> + case ir_var_const_in:
> + case ir_var_function_in:
> + case ir_var_function_out:
> + case ir_var_function_inout:
> + if (options->EmitNoIndirectTemp)
> + unsupported_variable_indexing = true;
> + break;
> + case ir_var_uniform:
> + if (options->EmitNoIndirectUniform)
> + unsupported_variable_indexing = true;
> + break;
> + case ir_var_shader_in:
> + if (options->EmitNoIndirectInput)
> + unsupported_variable_indexing = true;
> + break;
> + case ir_var_shader_out:
> + if (options->EmitNoIndirectOutput)
> + unsupported_variable_indexing = true;
> + break;
> + }
> + }
> + }
> + return visit_continue;
> + }
> +
> +private:
> + loop_variable_state *ls;
> + const struct gl_shader_compiler_options *options;
> };
>
>
> @@ -257,9 +309,12 @@ loop_unroll_visitor::visit_leave(ir_loop *ir)
>
> /* Don't try to unroll nested loops and loops with a huge body.
> */
> - loop_unroll_count count(&ir->body_instructions);
> + loop_unroll_count count(&ir->body_instructions, ls, options);
> +
> + bool loop_too_large =
> + count.nested_loop || count.nodes * iterations > max_iterations * 5;
>
> - if (count.nested_loop || count.nodes * iterations > max_iterations * 5)
> + if (loop_too_large && !count.unsupported_variable_indexing)
> return visit_continue;
>
> /* Note: the limiting terminator contributes 1 to ls->num_loop_jumps.
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 836 bytes
Desc: OpenPGP digital signature
URL: <http://lists.freedesktop.org/archives/mesa-dev/attachments/20140409/83c2ac8d/attachment.sig>
More information about the mesa-dev
mailing list