[Mesa-dev] [PATCH] glsl: compute lvalues of [in]out parameters before inlined function body

Thu Oct 20 09:25:42 UTC 2016

On 20.10.2016 09:58, Nicolai Hähnle wrote:
> On 19.10.2016 23:18, Ian Romanick wrote:
>> On 10/19/2016 11:45 AM, Nicolai Hähnle wrote:
>>> From: Nicolai Hähnle <nicolai.haehnle at amd.com>
>>>
>>> This is required when an out argument involves an array index that is
>>> either
>>> a global variable modified by the function or another out argument in
>>> the
>>> same function call.
>>>
>>> Fixes the shaders/out-parameter-indexing/vs-inout-index-inout-* tests.
>>> --
>>> I'd appreciate if somebody could clarify how much the IR has to be a
>>> tree,
>>> and to what extent it is allowed to be a DAG. In particular, for inout
>>> parameters we have the same dereference sub-expression twice. As is, the
>>> patch clones this sub-expression (where the code for passing an inout
>>> parameter into the function is added). That may be unnecessary, but I'm
>>> not sure and I wanted to stay on the safe side.
>>
>> The only things that can ever be reachable, through any means, more than
>> once in the IR is an ir_variable (via some sort of ir_dereference) or an
>> ir_function_signature (via ir_call).  This is one of the most important
>> invariants that ir_validate checks.
>>
>> This is why lots of places (lowering passes, optimizations, etc.) will
>> pull some expression from a tree, assign it to a new variable, and use
>> the new variable instead.
>
> Okay, thanks for the clarification.
>
>
>>
>>> Thanks,
>>> Nicolai
>>> ---
>>>  src/compiler/glsl/opt_function_inlining.cpp | 122
>>> ++++++++++++++++++++++++----
>>>  1 file changed, 108 insertions(+), 14 deletions(-)
>>>
>>> diff --git a/src/compiler/glsl/opt_function_inlining.cpp
>>> b/src/compiler/glsl/opt_function_inlining.cpp
>>> index 83534bf..eee8fd7 100644
>>> --- a/src/compiler/glsl/opt_function_inlining.cpp
>>> +++ b/src/compiler/glsl/opt_function_inlining.cpp
>>> @@ -88,28 +88,87 @@ replace_return_with_assignment(ir_instruction
>>> *ir, void *data)
>>>        } else {
>>>       /* un-valued return has to be the last return, or we shouldn't
>>>        * have reached here. (see can_inline()).
>>>        */
>>>       assert(ret->next->is_tail_sentinel());
>>>       ret->remove();
>>>        }
>>>     }
>>>  }
>>>
>>> +/* Save the given lvalue before the given instruction.
>>> + *
>>> + * This is done by adding temporary variables into which the current
>>> value
>>> + * of any array indices are saved, and then reconstructing a
>>> dereference chain
>>> + * that uses those temporary variables.
>>> + *
>>> + * This function returns the original lvalue if there were no array
>>> indices
>>> + * to save.
>>> + */
>>> +static ir_rvalue *
>>> +save_lvalue(void *ctx, ir_rvalue *lvalue, ir_instruction
>>> *insert_before)
>>> +{
>>
>> This should use an ir_hierarchical_visitor.  There's a pretty
>> significant bug at the bottom, and using the visitor will fix that.
>
> I considered that, but I don't think the hierarchical visitor makes
> sense. An ir_visitor perhaps, but anyway this should be discussed below...
>
>
>>
>>> +   if (ir_dereference_record *deref_record =
>>> lvalue->as_dereference_record()) {
>>> +      ir_rvalue *saved_record;
>>> +
>>> +      saved_record = save_lvalue(ctx, deref_record->record,
>>> insert_before);
>>> +      if (saved_record == deref_record->record)
>>> +         return lvalue;
>>> +
>>> +      return new(ctx) ir_dereference_record(saved_record,
>>> deref_record->field);
>>
>> This could be simplified as:
>>
>>          lvalue->record = save_lvalue(ctx, deref_record->record,
>> insert_before);
>>          return lvalue;
>
> Will do.
>
>
>>> +   }
>>> +
>>> +   if (ir_swizzle *swizzle = lvalue->as_swizzle()) {
>>> +      ir_rvalue *saved_val;
>>> +
>>> +      saved_val = save_lvalue(ctx, swizzle->val, insert_before);
>>> +      if (saved_val == swizzle->val)
>>> +         return lvalue;
>>> +
>>> +      return new(ctx) ir_swizzle(saved_val, swizzle->mask);
>>
>> This could be simplified as:
>>
>>          lvalue->val = save_lvalue(ctx, swizzle->val, insert_before);
>>          return lvalue;
>
> Will do.
>
>
>>> +   }
>>> +
>>> +   if (ir_dereference_array *deref_array =
>>> lvalue->as_dereference_array()) {
>>> +      ir_rvalue *saved_array;
>>> +      ir_variable *index;
>>> +      ir_assignment *assignment;
>>> +
>>> +      saved_array = save_lvalue(ctx, deref_array->array,
>>> insert_before);
>>> +      if (saved_array == deref_array->array &&
>>> +          deref_array->array_index->ir_type == ir_type_constant)
>>> +         return lvalue;
>>> +
>>> +      index = new(ctx) ir_variable(deref_array->array_index->type,
>>> "saved_idx", ir_var_temporary);
>>> +      insert_before->insert_before(index);
>>> +
>>> +      assignment = new(ctx) ir_assignment(new(ctx)
>>> ir_dereference_variable(index),
>>> +                                          deref_array->array_index, 0);
>>> +      insert_before->insert_before(assignment);
>>> +
>>> +      return new(ctx) ir_dereference_array(saved_array,
>>> +                                           new(ctx)
>>> ir_dereference_variable(index));
>>> +   }
>>> +
>>> +   assert(lvalue->ir_type == ir_type_dereference_variable);
>>
>> I think the ir_dereference_array and ir_dereference_variable handling is
>> not quite the right approach.  Basically you want to replace any
>> expression involving x with an identical expression involving in new
>> variable y and an assignment of x to y.  Right?  I think your
>> implementation won't catch cases like
>>
>>     foo(a[x+y], x, y);
>
> Well, that should be semantically equivalent to
>
>     tmp = x+y;
>     foo(a[tmp], x, y);
>
> which is precisely what this code does, isn't it?
>
> Also, consider more complex examples like:
>
>     foo(a[bar(...)]);
>
> which should behave like
>
>     tmp = bar(...);
>     foo(a[tmp]);
>
> Replacing variables isn't sufficient there. Part of what this patch does
> is ensure that bar is called (a) before the call to foo instead of
> afterwards, and (b) only once even when it's an inout-parameter.

Okay, I realize now that ir_expressions apparently cannot have side 
effects, so that particular part was okay before. So my approach differs 
from what you said simply by taking a copy of the entire array_index 
expression in one go, instead of taking a copy of all variables that 
appear inside it.

Nicolai

>
> Maybe there's something else I missed, but I still believe the current
> approach is right. About using an ir_hierarchical_visitor, I personally
> feel that it would obfuscate what's going on because I actually _don't_
> want to visit everything in the tree recursively.

>
> Thanks,
> Nicolai
>
>> I think the easiest thing to do is make an ir_hierarchical visitor that
>> replaces the var of every ir_dereference_variable see inside the
>> array_index of an ir_dereference_array with a new variable and an
>> assignment.  Just increment / decrement a count in
>> ir_visit_enter(ir_dereference_array*) and
>> ir_visit_leave(ir_dereference_array*).  If count is non-zero in
>> ir_visit(ir_dereference_variable*), do the replacement.
>