[Mesa-dev] [PATCH] glsl: compute lvalues of [in]out parameters before inlined function body
Nicolai Hähnle
nicolai.haehnle at amd.com
Thu Oct 20 09:25:42 UTC 2016
On 20.10.2016 09:58, Nicolai Hähnle wrote:
> On 19.10.2016 23:18, Ian Romanick wrote:
>> On 10/19/2016 11:45 AM, Nicolai Hähnle wrote:
>>> From: Nicolai Hähnle <nicolai.haehnle at amd.com>
>>>
>>> This is required when an out argument involves an array index that is
>>> either
>>> a global variable modified by the function or another out argument in
>>> the
>>> same function call.
>>>
>>> Fixes the shaders/out-parameter-indexing/vs-inout-index-inout-* tests.
>>> --
>>> I'd appreciate if somebody could clarify how much the IR has to be a
>>> tree,
>>> and to what extent it is allowed to be a DAG. In particular, for inout
>>> parameters we have the same dereference sub-expression twice. As is, the
>>> patch clones this sub-expression (where the code for passing an inout
>>> parameter into the function is added). That may be unnecessary, but I'm
>>> not sure and I wanted to stay on the safe side.
>>
>> The only things that can ever be reachable, through any means, more than
>> once in the IR is an ir_variable (via some sort of ir_dereference) or an
>> ir_function_signature (via ir_call). This is one of the most important
>> invariants that ir_validate checks.
>>
>> This is why lots of places (lowering passes, optimizations, etc.) will
>> pull some expression from a tree, assign it to a new variable, and use
>> the new variable instead.
>
> Okay, thanks for the clarification.
>
>
>>
>>> Thanks,
>>> Nicolai
>>> ---
>>> src/compiler/glsl/opt_function_inlining.cpp | 122
>>> ++++++++++++++++++++++++----
>>> 1 file changed, 108 insertions(+), 14 deletions(-)
>>>
>>> diff --git a/src/compiler/glsl/opt_function_inlining.cpp
>>> b/src/compiler/glsl/opt_function_inlining.cpp
>>> index 83534bf..eee8fd7 100644
>>> --- a/src/compiler/glsl/opt_function_inlining.cpp
>>> +++ b/src/compiler/glsl/opt_function_inlining.cpp
>>> @@ -88,28 +88,87 @@ replace_return_with_assignment(ir_instruction
>>> *ir, void *data)
>>> } else {
>>> /* un-valued return has to be the last return, or we shouldn't
>>> * have reached here. (see can_inline()).
>>> */
>>> assert(ret->next->is_tail_sentinel());
>>> ret->remove();
>>> }
>>> }
>>> }
>>>
>>> +/* Save the given lvalue before the given instruction.
>>> + *
>>> + * This is done by adding temporary variables into which the current
>>> value
>>> + * of any array indices are saved, and then reconstructing a
>>> dereference chain
>>> + * that uses those temporary variables.
>>> + *
>>> + * This function returns the original lvalue if there were no array
>>> indices
>>> + * to save.
>>> + */
>>> +static ir_rvalue *
>>> +save_lvalue(void *ctx, ir_rvalue *lvalue, ir_instruction
>>> *insert_before)
>>> +{
>>
>> This should use an ir_hierarchical_visitor. There's a pretty
>> significant bug at the bottom, and using the visitor will fix that.
>
> I considered that, but I don't think the hierarchical visitor makes
> sense. An ir_visitor perhaps, but anyway this should be discussed below...
>
>
>>
>>> + if (ir_dereference_record *deref_record =
>>> lvalue->as_dereference_record()) {
>>> + ir_rvalue *saved_record;
>>> +
>>> + saved_record = save_lvalue(ctx, deref_record->record,
>>> insert_before);
>>> + if (saved_record == deref_record->record)
>>> + return lvalue;
>>> +
>>> + return new(ctx) ir_dereference_record(saved_record,
>>> deref_record->field);
>>
>> This could be simplified as:
>>
>> lvalue->record = save_lvalue(ctx, deref_record->record,
>> insert_before);
>> return lvalue;
>
> Will do.
>
>
>>> + }
>>> +
>>> + if (ir_swizzle *swizzle = lvalue->as_swizzle()) {
>>> + ir_rvalue *saved_val;
>>> +
>>> + saved_val = save_lvalue(ctx, swizzle->val, insert_before);
>>> + if (saved_val == swizzle->val)
>>> + return lvalue;
>>> +
>>> + return new(ctx) ir_swizzle(saved_val, swizzle->mask);
>>
>> This could be simplified as:
>>
>> lvalue->val = save_lvalue(ctx, swizzle->val, insert_before);
>> return lvalue;
>
> Will do.
>
>
>>> + }
>>> +
>>> + if (ir_dereference_array *deref_array =
>>> lvalue->as_dereference_array()) {
>>> + ir_rvalue *saved_array;
>>> + ir_variable *index;
>>> + ir_assignment *assignment;
>>> +
>>> + saved_array = save_lvalue(ctx, deref_array->array,
>>> insert_before);
>>> + if (saved_array == deref_array->array &&
>>> + deref_array->array_index->ir_type == ir_type_constant)
>>> + return lvalue;
>>> +
>>> + index = new(ctx) ir_variable(deref_array->array_index->type,
>>> "saved_idx", ir_var_temporary);
>>> + insert_before->insert_before(index);
>>> +
>>> + assignment = new(ctx) ir_assignment(new(ctx)
>>> ir_dereference_variable(index),
>>> + deref_array->array_index, 0);
>>> + insert_before->insert_before(assignment);
>>> +
>>> + return new(ctx) ir_dereference_array(saved_array,
>>> + new(ctx)
>>> ir_dereference_variable(index));
>>> + }
>>> +
>>> + assert(lvalue->ir_type == ir_type_dereference_variable);
>>
>> I think the ir_dereference_array and ir_dereference_variable handling is
>> not quite the right approach. Basically you want to replace any
>> expression involving x with an identical expression involving in new
>> variable y and an assignment of x to y. Right? I think your
>> implementation won't catch cases like
>>
>> foo(a[x+y], x, y);
>
> Well, that should be semantically equivalent to
>
> tmp = x+y;
> foo(a[tmp], x, y);
>
> which is precisely what this code does, isn't it?
>
> Also, consider more complex examples like:
>
> foo(a[bar(...)]);
>
> which should behave like
>
> tmp = bar(...);
> foo(a[tmp]);
>
> Replacing variables isn't sufficient there. Part of what this patch does
> is ensure that bar is called (a) before the call to foo instead of
> afterwards, and (b) only once even when it's an inout-parameter.
Okay, I realize now that ir_expressions apparently cannot have side
effects, so that particular part was okay before. So my approach differs
from what you said simply by taking a copy of the entire array_index
expression in one go, instead of taking a copy of all variables that
appear inside it.
Nicolai
>
> Maybe there's something else I missed, but I still believe the current
> approach is right. About using an ir_hierarchical_visitor, I personally
> feel that it would obfuscate what's going on because I actually _don't_
> want to visit everything in the tree recursively.
>
> Thanks,
> Nicolai
>
>> I think the easiest thing to do is make an ir_hierarchical visitor that
>> replaces the var of every ir_dereference_variable see inside the
>> array_index of an ir_dereference_array with a new variable and an
>> assignment. Just increment / decrement a count in
>> ir_visit_enter(ir_dereference_array*) and
>> ir_visit_leave(ir_dereference_array*). If count is non-zero in
>> ir_visit(ir_dereference_variable*), do the replacement.
>
More information about the mesa-dev
mailing list