[Mesa-dev] [PATCH] glsl: compute lvalues of [in]out parameters before inlined function body

Nicolai Hähnle nhaehnle at gmail.com
Thu Oct 20 07:58:24 UTC 2016


On 19.10.2016 23:18, Ian Romanick wrote:
> On 10/19/2016 11:45 AM, Nicolai Hähnle wrote:
>> From: Nicolai Hähnle <nicolai.haehnle at amd.com>
>>
>> This is required when an out argument involves an array index that is either
>> a global variable modified by the function or another out argument in the
>> same function call.
>>
>> Fixes the shaders/out-parameter-indexing/vs-inout-index-inout-* tests.
>> --
>> I'd appreciate if somebody could clarify how much the IR has to be a tree,
>> and to what extent it is allowed to be a DAG. In particular, for inout
>> parameters we have the same dereference sub-expression twice. As is, the
>> patch clones this sub-expression (where the code for passing an inout
>> parameter into the function is added). That may be unnecessary, but I'm
>> not sure and I wanted to stay on the safe side.
>
> The only things that can ever be reachable, through any means, more than
> once in the IR is an ir_variable (via some sort of ir_dereference) or an
> ir_function_signature (via ir_call).  This is one of the most important
> invariants that ir_validate checks.
>
> This is why lots of places (lowering passes, optimizations, etc.) will
> pull some expression from a tree, assign it to a new variable, and use
> the new variable instead.

Okay, thanks for the clarification.


>
>> Thanks,
>> Nicolai
>> ---
>>  src/compiler/glsl/opt_function_inlining.cpp | 122 ++++++++++++++++++++++++----
>>  1 file changed, 108 insertions(+), 14 deletions(-)
>>
>> diff --git a/src/compiler/glsl/opt_function_inlining.cpp b/src/compiler/glsl/opt_function_inlining.cpp
>> index 83534bf..eee8fd7 100644
>> --- a/src/compiler/glsl/opt_function_inlining.cpp
>> +++ b/src/compiler/glsl/opt_function_inlining.cpp
>> @@ -88,28 +88,87 @@ replace_return_with_assignment(ir_instruction *ir, void *data)
>>        } else {
>>  	 /* un-valued return has to be the last return, or we shouldn't
>>  	  * have reached here. (see can_inline()).
>>  	  */
>>  	 assert(ret->next->is_tail_sentinel());
>>  	 ret->remove();
>>        }
>>     }
>>  }
>>
>> +/* Save the given lvalue before the given instruction.
>> + *
>> + * This is done by adding temporary variables into which the current value
>> + * of any array indices are saved, and then reconstructing a dereference chain
>> + * that uses those temporary variables.
>> + *
>> + * This function returns the original lvalue if there were no array indices
>> + * to save.
>> + */
>> +static ir_rvalue *
>> +save_lvalue(void *ctx, ir_rvalue *lvalue, ir_instruction *insert_before)
>> +{
>
> This should use an ir_hierarchical_visitor.  There's a pretty
> significant bug at the bottom, and using the visitor will fix that.

I considered that, but I don't think the hierarchical visitor makes 
sense. An ir_visitor perhaps, but anyway this should be discussed below...


>
>> +   if (ir_dereference_record *deref_record = lvalue->as_dereference_record()) {
>> +      ir_rvalue *saved_record;
>> +
>> +      saved_record = save_lvalue(ctx, deref_record->record, insert_before);
>> +      if (saved_record == deref_record->record)
>> +         return lvalue;
>> +
>> +      return new(ctx) ir_dereference_record(saved_record, deref_record->field);
>
> This could be simplified as:
>
>          lvalue->record = save_lvalue(ctx, deref_record->record,
> insert_before);
>          return lvalue;

Will do.


>> +   }
>> +
>> +   if (ir_swizzle *swizzle = lvalue->as_swizzle()) {
>> +      ir_rvalue *saved_val;
>> +
>> +      saved_val = save_lvalue(ctx, swizzle->val, insert_before);
>> +      if (saved_val == swizzle->val)
>> +         return lvalue;
>> +
>> +      return new(ctx) ir_swizzle(saved_val, swizzle->mask);
>
> This could be simplified as:
>
>          lvalue->val = save_lvalue(ctx, swizzle->val, insert_before);
>          return lvalue;

Will do.


>> +   }
>> +
>> +   if (ir_dereference_array *deref_array = lvalue->as_dereference_array()) {
>> +      ir_rvalue *saved_array;
>> +      ir_variable *index;
>> +      ir_assignment *assignment;
>> +
>> +      saved_array = save_lvalue(ctx, deref_array->array, insert_before);
>> +      if (saved_array == deref_array->array &&
>> +          deref_array->array_index->ir_type == ir_type_constant)
>> +         return lvalue;
>> +
>> +      index = new(ctx) ir_variable(deref_array->array_index->type, "saved_idx", ir_var_temporary);
>> +      insert_before->insert_before(index);
>> +
>> +      assignment = new(ctx) ir_assignment(new(ctx) ir_dereference_variable(index),
>> +                                          deref_array->array_index, 0);
>> +      insert_before->insert_before(assignment);
>> +
>> +      return new(ctx) ir_dereference_array(saved_array,
>> +                                           new(ctx) ir_dereference_variable(index));
>> +   }
>> +
>> +   assert(lvalue->ir_type == ir_type_dereference_variable);
>
> I think the ir_dereference_array and ir_dereference_variable handling is
> not quite the right approach.  Basically you want to replace any
> expression involving x with an identical expression involving in new
> variable y and an assignment of x to y.  Right?  I think your
> implementation won't catch cases like
>
>     foo(a[x+y], x, y);

Well, that should be semantically equivalent to

     tmp = x+y;
     foo(a[tmp], x, y);

which is precisely what this code does, isn't it?

Also, consider more complex examples like:

     foo(a[bar(...)]);

which should behave like

     tmp = bar(...);
     foo(a[tmp]);

Replacing variables isn't sufficient there. Part of what this patch does 
is ensure that bar is called (a) before the call to foo instead of 
afterwards, and (b) only once even when it's an inout-parameter.

Maybe there's something else I missed, but I still believe the current 
approach is right. About using an ir_hierarchical_visitor, I personally 
feel that it would obfuscate what's going on because I actually _don't_ 
want to visit everything in the tree recursively.

Thanks,
Nicolai

> I think the easiest thing to do is make an ir_hierarchical visitor that
> replaces the var of every ir_dereference_variable see inside the
> array_index of an ir_dereference_array with a new variable and an
> assignment.  Just increment / decrement a count in
> ir_visit_enter(ir_dereference_array*) and
> ir_visit_leave(ir_dereference_array*).  If count is non-zero in
> ir_visit(ir_dereference_variable*), do the replacement.



More information about the mesa-dev mailing list