[Mesa-dev] [PATCH] glsl: compute lvalues of [in]out parameters before inlined function body

Wed Oct 19 21:18:47 UTC 2016

On 10/19/2016 11:45 AM, Nicolai Hähnle wrote:
> From: Nicolai Hähnle <nicolai.haehnle at amd.com>
> 
> This is required when an out argument involves an array index that is either
> a global variable modified by the function or another out argument in the
> same function call.
> 
> Fixes the shaders/out-parameter-indexing/vs-inout-index-inout-* tests.
> --
> I'd appreciate if somebody could clarify how much the IR has to be a tree,
> and to what extent it is allowed to be a DAG. In particular, for inout
> parameters we have the same dereference sub-expression twice. As is, the
> patch clones this sub-expression (where the code for passing an inout
> parameter into the function is added). That may be unnecessary, but I'm
> not sure and I wanted to stay on the safe side.

The only things that can ever be reachable, through any means, more than
once in the IR is an ir_variable (via some sort of ir_dereference) or an
ir_function_signature (via ir_call).  This is one of the most important
invariants that ir_validate checks.

This is why lots of places (lowering passes, optimizations, etc.) will
pull some expression from a tree, assign it to a new variable, and use
the new variable instead.

> Thanks,
> Nicolai
> ---
>  src/compiler/glsl/opt_function_inlining.cpp | 122 ++++++++++++++++++++++++----
>  1 file changed, 108 insertions(+), 14 deletions(-)
> 
> diff --git a/src/compiler/glsl/opt_function_inlining.cpp b/src/compiler/glsl/opt_function_inlining.cpp
> index 83534bf..eee8fd7 100644
> --- a/src/compiler/glsl/opt_function_inlining.cpp
> +++ b/src/compiler/glsl/opt_function_inlining.cpp
> @@ -88,28 +88,87 @@ replace_return_with_assignment(ir_instruction *ir, void *data)
>        } else {
>  	 /* un-valued return has to be the last return, or we shouldn't
>  	  * have reached here. (see can_inline()).
>  	  */
>  	 assert(ret->next->is_tail_sentinel());
>  	 ret->remove();
>        }
>     }
>  }
>  
> +/* Save the given lvalue before the given instruction.
> + *
> + * This is done by adding temporary variables into which the current value
> + * of any array indices are saved, and then reconstructing a dereference chain
> + * that uses those temporary variables.
> + *
> + * This function returns the original lvalue if there were no array indices
> + * to save.
> + */
> +static ir_rvalue *
> +save_lvalue(void *ctx, ir_rvalue *lvalue, ir_instruction *insert_before)
> +{

This should use an ir_hierarchical_visitor.  There's a pretty
significant bug at the bottom, and using the visitor will fix that.

> +   if (ir_dereference_record *deref_record = lvalue->as_dereference_record()) {
> +      ir_rvalue *saved_record;
> +
> +      saved_record = save_lvalue(ctx, deref_record->record, insert_before);
> +      if (saved_record == deref_record->record)
> +         return lvalue;
> +
> +      return new(ctx) ir_dereference_record(saved_record, deref_record->field);

This could be simplified as:

         lvalue->record = save_lvalue(ctx, deref_record->record,
insert_before);
         return lvalue;

> +   }
> +
> +   if (ir_swizzle *swizzle = lvalue->as_swizzle()) {
> +      ir_rvalue *saved_val;
> +
> +      saved_val = save_lvalue(ctx, swizzle->val, insert_before);
> +      if (saved_val == swizzle->val)
> +         return lvalue;
> +
> +      return new(ctx) ir_swizzle(saved_val, swizzle->mask);

This could be simplified as:

         lvalue->val = save_lvalue(ctx, swizzle->val, insert_before);
         return lvalue;

> +   }
> +
> +   if (ir_dereference_array *deref_array = lvalue->as_dereference_array()) {
> +      ir_rvalue *saved_array;
> +      ir_variable *index;
> +      ir_assignment *assignment;
> +
> +      saved_array = save_lvalue(ctx, deref_array->array, insert_before);
> +      if (saved_array == deref_array->array &&
> +          deref_array->array_index->ir_type == ir_type_constant)
> +         return lvalue;
> +
> +      index = new(ctx) ir_variable(deref_array->array_index->type, "saved_idx", ir_var_temporary);
> +      insert_before->insert_before(index);
> +
> +      assignment = new(ctx) ir_assignment(new(ctx) ir_dereference_variable(index),
> +                                          deref_array->array_index, 0);
> +      insert_before->insert_before(assignment);
> +
> +      return new(ctx) ir_dereference_array(saved_array,
> +                                           new(ctx) ir_dereference_variable(index));
> +   }
> +
> +   assert(lvalue->ir_type == ir_type_dereference_variable);

I think the ir_dereference_array and ir_dereference_variable handling is
not quite the right approach.  Basically you want to replace any
expression involving x with an identical expression involving in new
variable y and an assignment of x to y.  Right?  I think your
implementation won't catch cases like

    foo(a[x+y], x, y);

I think the easiest thing to do is make an ir_hierarchical visitor that
replaces the var of every ir_dereference_variable see inside the
array_index of an ir_dereference_array with a new variable and an
assignment.  Just increment / decrement a count in
ir_visit_enter(ir_dereference_array*) and
ir_visit_leave(ir_dereference_array*).  If count is non-zero in
ir_visit(ir_dereference_variable*), do the replacement.

> +
> +   return lvalue;
> +}
> +
>  void
>  ir_call::generate_inline(ir_instruction *next_ir)
>  {
>     void *ctx = ralloc_parent(this);
>     ir_variable **parameters;
>     unsigned num_parameters;
>     int i;
>     struct hash_table *ht;
> +   exec_list out_lvalues;
>  
>     ht = _mesa_hash_table_create(NULL, _mesa_hash_pointer, _mesa_key_pointer_equal);
>  
>     num_parameters = this->callee->parameters.length();
>     parameters = new ir_variable *[num_parameters];
>  
>     /* Generate the declarations for the parameters to our inlined code,
>      * and set up the mapping of real function body variables to ours.
>      */
>     i = 0;
> @@ -132,29 +191,66 @@ ir_call::generate_inline(ir_instruction *next_ir)
>  
>  	 /* Remove the read-only decoration because we're going to write
>  	  * directly to this variable.  If the cloned variable is left
>  	  * read-only and the inlined function is inside a loop, the loop
>  	  * analysis code will get confused.
>  	  */
>  	 parameters[i]->data.read_only = false;
>  	 next_ir->insert_before(parameters[i]);
>        }
>  
> -      /* Move the actual param into our param variable if it's an 'in' type. */
> -      if (parameters[i] && (sig_param->data.mode == ir_var_function_in ||
> -			    sig_param->data.mode == ir_var_const_in ||
> -			    sig_param->data.mode == ir_var_function_inout)) {
> -	 ir_assignment *assign;
> -
> -	 assign = new(ctx) ir_assignment(new(ctx) ir_dereference_variable(parameters[i]),
> -					 param, NULL);
> -	 next_ir->insert_before(assign);
> +      /* Section 6.1.1 (Function Calling Conventions) of the OpenGL Shading
> +       * Language 4.5 spec says:
> +       *
> +       *    "All arguments are evaluated at call time, exactly once, in order,
> +       *     from left to right. [...] Evaluation of an out parameter results
> +       *     in an l-value that is used to copy out a value when the function
> +       *     returns."
> +       *
> +       * I.e., we have to take temporary copies of any relevant array indices
> +       * before the function body is executed.
> +       *
> +       * This ensures that
> +       * (a) if an array index expressions refers to a variable that is
> +       *     modified by the execution of the function body, we use the
> +       *     original value as intended, and
> +       * (b) if an array index expression has side effects, those side effects
> +       *     are only executed once and at the right time.
> +       */
> +      if (parameters[i]) {
> +         if (sig_param->data.mode == ir_var_function_in ||
> +             sig_param->data.mode == ir_var_const_in) {
> +            ir_assignment *assign;
> +
> +            assign = new(ctx) ir_assignment(new(ctx) ir_dereference_variable(parameters[i]),
> +                                            param, NULL);
> +            next_ir->insert_before(assign);
> +         } else {
> +            assert(sig_param->data.mode == ir_var_function_out ||
> +                   sig_param->data.mode == ir_var_function_inout);
> +            assert(param->is_lvalue());
> +
> +            ir_rvalue *lvalue;
> +
> +            lvalue = save_lvalue(ctx, param, next_ir);
> +
> +            if (sig_param->data.mode == ir_var_function_inout) {
> +               ir_assignment *assign;
> +
> +               assign = new(ctx) ir_assignment(new(ctx) ir_dereference_variable(parameters[i]),
> +                                               lvalue->clone(ctx, NULL)->as_rvalue(), NULL);
> +
> +               next_ir->insert_before(assign);
> +            }
> +
> +            out_lvalues.push_tail(lvalue);
> +         }
>        }
>  
>        ++i;
>     }
>  
>     exec_list new_instructions;
>  
>     /* Generate the inlined body of the function to a new list */
>     foreach_in_list(ir_instruction, ir, &callee->body) {
>        ir_instruction *new_ir = ir->clone(ctx, ht);
> @@ -179,31 +275,29 @@ ir_call::generate_inline(ir_instruction *next_ir)
>        }
>     }
>  
>     /* Now push those new instructions in. */
>     next_ir->insert_before(&new_instructions);
>  
>     /* Copy back the value of any 'out' parameters from the function body
>      * variables to our own.
>      */
>     i = 0;
> -   foreach_two_lists(formal_node, &this->callee->parameters,
> -                     actual_node, &this->actual_parameters) {
> -      ir_rvalue *const param = (ir_rvalue *) actual_node;
> -      const ir_variable *const sig_param = (ir_variable *) formal_node;
> +   foreach_in_list(ir_variable, sig_param, &this->callee->parameters) {
>  
>        /* Move our param variable into the actual param if it's an 'out' type. */
>        if (parameters[i] && (sig_param->data.mode == ir_var_function_out ||
>  			    sig_param->data.mode == ir_var_function_inout)) {
>  	 ir_assignment *assign;
>  
> -	 assign = new(ctx) ir_assignment(param->clone(ctx, NULL)->as_rvalue(),
> +         ir_rvalue *lvalue = (ir_rvalue *)out_lvalues.pop_head();
> +         assign = new(ctx) ir_assignment(lvalue,
>  					 new(ctx) ir_dereference_variable(parameters[i]),
>  					 NULL);
>  	 next_ir->insert_before(assign);
>        }
>  
>        ++i;
>     }
>  
>     delete [] parameters;
>  
>