[Mesa-dev] [PATCH v2 2/3] mesa/st: glsl_to_tgsi Implement a new lifetime tracker for temporaries

Nicolai Hähnle nhaehnle at gmail.com
Mon Jun 19 06:19:22 UTC 2017


On 18.06.2017 14:41, Gert Wollny wrote:
> Hello Nicolai,
> 
> 
> Am Sonntag, den 18.06.2017, 12:05 +0200 schrieb Nicolai Hähnle:
>>    
>>>    if HAVE_XLIB_GLX
>>>    SUBDIRS += drivers/x11
>>> @@ -101,7 +101,7 @@ AM_CFLAGS = \
>>>    	$(VISIBILITY_CFLAGS) \
>>>    	$(MSVC2013_COMPAT_CFLAGS)
>>>    AM_CXXFLAGS = \
>>> -	$(LLVM_CFLAGS) \
>>> +        $(LLVM_CXXFLAGS) \
>>
>> I kind of suspect that this might be a no-no. On the one hand it
>> makes sense, because it makes the use of CXXFLAGS consistent, but
>> it's a pretty significant build system change.
> I understand that, but c++11 is just such an improvement.
> 
>>
>> At least separate it out into its own patch.
> Okay,
> 
> 
>>
>>
>>
>>>    	$(VISIBILITY_CXXFLAGS) \
>>>    	$(MSVC2013_COMPAT_CXXFLAGS)
>>>    
>>> diff --git a/src/mesa/Makefile.sources b/src/mesa/Makefile.sources
>>> index 21f9167bda..a68e9d2afe 100644
>>> --- a/src/mesa/Makefile.sources
>>> +++ b/src/mesa/Makefile.sources
>>> @@ -509,6 +509,8 @@ STATETRACKER_FILES = \
>>>    	state_tracker/st_glsl_to_tgsi.h \
>>>    	state_tracker/st_glsl_to_tgsi_private.cpp \
>>>    	state_tracker/st_glsl_to_tgsi_private.h \
>>> +        state_tracker/st_glsl_to_tgsi_temprename.cpp \
>>> +	state_tracker/st_glsl_to_tgsi_temprename.h \
>>
>> Inconsistent use of whitespace.
>>
>> Then the whole patch needs to be re-arranged, i.e. the
>> st_glsl_to_tgsi_private stuff,
> This was a mistake when I was re-basing the patch.
> 
>> and I agree with Emil that the tests
>> should be separated out into their own patch.
> I'm currently working on this.
> 
>>
>> BTW, you can and should use
>>
>> git rebase -x "cd $builddir && make" $basecommit
>>
>> to verify that you don't break the build in an intermediate patch.
> Nice trick, thanks.
> 
> 
>>> +typedef int scope_idx;
>>
>> Please remove this, it's an unnecessary and distracting abstraction
>> that  doesn't gain you anything.
> Actually, with the refactoring it did the last two days it helped a
> lot.

How? Perhaps your variable names stand to be improved. We try to avoid 
ineffective abstractions in Mesa.


>> I consider the scopes back-reference here and in temp_access to be
>> bad  style.
> I already got rid of them.
> 
>> Think less object- and more data-oriented, then you won't need them.
>> Most of those helper functions should be members of
>> tgsi_temp_lifetime -- and the getters/setters can be removed
>> entirely.  YAGNI.
> I think with the new design that uses pointers into a pre-allocated
> array the methods must stay where they are.

No, they really don't. You're thinking of scopes etc. as objects. If you 
just thought of them as data, as IMO you should, it would feel natural 
to move the methods into the main class and eliminate those back-pointers.


>> To first approximation, you should never use mutable. Those methods
>> should not be const to begin with.
> This is something else I actually didn't like and corrected already.
> 
>>> +scope_idx
>>> +tgsi_temp_lifetime::make_scope(e_scope_type type, int id, int lvl,
>>> +                               int s_begin)const
>>> +{
>>> +   int idx = scopes.size();
>>> +   scopes.push_back(prog_scope(type, idx, id, lvl, s_begin,
>>> scopes));
>>> +   return idx;
>>> +}
>>
>> AFAICS this overload is only used once. Please remove it.
> Okay.
> 
>> Use C-style comments.
> Okay,
> 
>> Also, this should probably have an assert, or can this really happen?
> I don't think it can happen, an assert it is.
> 
> 
>> +      case TGSI_OPCODE_IF:
>>> +      case TGSI_OPCODE_UIF:{
>>> +         if (inst->src[0].file == PROGRAM_TEMPORARY) {
>>> +               acc[inst->src[0].index].append(line, acc_read,
>>> current);
>>> +         }
>>> +         scope_idx scope = make_scope(current, sct_if, if_id,
>>> nesting_lvl, line);
>>
>> It's probably a good idea to have scopes start at line + 1.
>> Otherwise,
>> it may be tempting to think that the read access of the IF condition
>> happens inside the IF scope, at least based on the line numbers.
> 
> I don't think that's a problem. At least I can't think of a test case
> that would trigger a problem with that.
> 
> 
>>
>>> +      case TGSI_OPCODE_DEFAULT: {
>>> +         auto scope_type = (inst->op == TGSI_OPCODE_CASE) ?
>>> +                              sct_switch_case :
>>> sct_switch_default;
>>> +         if ( scopes[current].type() == sct_switch ) {
>>
>> No spaces inside parenthesis (also elsewhere).
> Okay.
> 
>>> +            scope_stack.push(current);
>>> +            current = make_scope(current, scope_type,
>>> scopes[current].id(),
>>> +                                 nesting_lvl, line);
>>> +         }else{
>>
>> Spaces around else (also elsewhere).
> Okay.
> 
>>
>> No comment between if- and else-block.
> Okay.
> 
>>
>>
>>> +            scopes[current].set_continue(current, line);
>>
>> You do need to distinguish between break and continue (at least for
>> accuracy), because break allows you to skip over a piece of code
>> indefinitely while continue doesn't. I.e.:
> 
> I risk to differ, in the worst case a CONT can act like it would be a
> BRK, and because I have to handle the worst case it doesn't make sense
> to distinguish between the two.

Do you have an example?


>>
>>      BGNLOOP
>>         ...
>>            BRK/CONT
>>         ...
>>         MOV TEMP[i], ...
>>         ...
>>      ENDLOOP
>>
>> If it's a CONT, then i is written unconditionally inside the loop.
> While one would normally expect this, it is not guarantied. Think of a
> shader like this:
> 
> ...
> varying int a;
> varying float b;
> 
> int f(int a, float b) {...}
> for (int i = 0; i< n; ++i ) {
>     if (a && f(i, b))
>        continue;
>     ...
>     x =
> }
> 
> It may not be the best style. but it is possible.

TGSI loops are always infinite-loops. So actually, every loop should 
have a BRK somewhere. But the point remains that CONT itself cannot skip 
code indefinitely, because there's no breaking out of the loop in BGNLOOP.


>> +
>>> +         for (unsigned j = 0; j < inst->tex_offset_num_offset;
>>> j++) {
>>> +            if (inst->tex_offsets[j].file == PROGRAM_TEMPORARY) {
>>> +               acc[inst->tex_offsets[j].index].append(line,
>>> acc_read, current);
>>> +            }
>>> +         }
>>
>> You need to handle the reads before the writes. After all, you might
>> have a
>>
>>      ADD TEMP[i], TEMP[i], ...
>>
>> which may have to register as an undefined read.
> I will add a test case, and correct the code accordingly.
> 
> 
>>
>>
>>> +
>>> +      } // end default
>>> +      } // end switch (op)
>>
>> No end-of-scope comments.
> okay,
> 
>>>
>>> +scope_idx
>>> +prog_scope::get_parent_loop() const
>>
>> Based on what it does, this should be renamed to something like
>> get_innermost_loop.
> Okay.
> 
>>
>>> +void temp_access::append(int line, e_acc_type acc, scope_idx idx)
>>> +{
>>> +   last_write = line;
>>
>> Looks like last_write should be called last_access.
> 
> Initially I thought the same, but a last read is also (most of the
> time) the last access, and it is handled differently then a write past
> the last read, so i think that last_write catches it better.

So I read the register merging part only afterwards, and it seems to me 
you're allowing to merge two ranges with lifetimes [a,b] and [b,c].

This makes sense if you think of instructions as having two "phases", 
the read-phase which comes before the write-phase, and you think of the 
start of the lifetime as pointing to the write-phase, while the end 
points to the read-phase. With that interpretation, [a,b] and [b,c] are 
genuinely disjoint, and it allows merging TEMP[1] and TEMP[2] in

   ADD TEMP[1], ..., ....
   ADD TEMP[2], TEMP[1], ...

which is good.

I think this all makes sense, but you should be more explicit about it 
and make sure all the variable names are consistent with that convention.

Cheers,
Nicolai


> 
> 
>>> +   if (acc == acc_read) {
>>> +      last_read = line;
>>> +      last_read_scope = idx;
>>> +      if (undefined_read > line) {
>>> +         undefined_read = line;
>>> +         undefined_read_scope = idx;
>>> +      }
>>
>> The way you're using it this is effectively just first_read, not
>> undefined_read.
> 
> Indeed.
> 
>>
>>
>>> +   } else {
>>> +      if (first_write == -1) {
>>> +         first_write = line;
>>> +         first_write_scope = idx;
>>> +
>>> +         // we write in an if-branch
>>> +         auto fw_ifthen_scope = scopes[idx].in_ifelse();
>>> +         if ((fw_ifthen_scope >= 0) &&
>>> scopes[fw_ifthen_scope].in_loop()) {
>>> +            // value not always written, in loops we must keep it
>>> +            keep_for_full_loop = true;
>>> +         } else {
>>> +            // same thing for switch-case
>>> +            auto fw_switch_scope = scopes[idx].in_switchcase();
>>> +            if (fw_switch_scope >= 0 &&
>>> scopes[fw_switch_scope].in_loop()) {
>>> +               keep_for_full_loop = true;
>>> +            }
>>> +         }
>>
>> Simplify this by using an in_conditional() instead of in_ifelse +
>> in_switchcase().
> I was thinking about this, but I also thought that when I want to track
> all code path later then I will have to distinguish the two again, but,
> on the other hand, adding a method that combines the two is a no-
> brainer.
> 
> 
>>
>>> +
>>> +   /* Only written to, just make sure that renaming
>>> +    * doesn't reuse this register too early (corner
>>> +    * case is the one opcode with two destinations) */
>>> +   if (last_read_scope < 0) {
>>> +      return make_pair(first_write, first_write + 1);
>>
>> What if there are multiple writes to the temporary?
> Good catch! Next text case :)
> 
> 
>>> +   /* propagate lifetime also if there was a continue/break
>>> +    * in a loop and the write was after it (so it constitutes
>>> +    * a conditional write */
>>
>> It's only conditional if there was a break. Continue doesn't make it
>> conditional.
> I think I made my argument.
> 
>> +   /* propagate lifetimes before moving to upper scopes */
>>> +   if ((scopes[first_write_scope].type() == sct_loop) &&
>>> +       (keep_for_full_loop || (undefined_read < first_write))) {
>>> +      first_write = scopes[first_write_scope].begin();
>>> +      int lr = scopes[first_write_scope].end();
>>> +      if (last_read < lr)
>>> +         last_read = lr;
>>> +   }
>>
>> What about:
>>
>>      BGNLOOP
>>         ...
>>         BGNLOOP
>>            IF ...
>>               read TEMP[i]
>>            ENDIF
>>            write TEMP[i]
>>         ENDLOOP
>>         ...
>>      ENDLOOP
>>
>> In this case, you don't set keep_for_full_loop, yet the lifetime
>> must extend for the entire _outer_ loop.
> Another test case that needs to be added :)
> 
>>
>> I see two related problems in the code:
>>
>> 1. In this case, keep_for_full_loop needs to be set to true (because
>> overwriting first_write in the code above means that the related
>> check below won't fire).
> 
>>
>> 2. target_level is set to the inner loop even though it really needs
>> to be set to the outer loop.
> Indeed.
> 
> Thanks a lot,
> Gert
> 


-- 
Lerne, wie die Welt wirklich ist,
Aber vergiss niemals, wie sie sein sollte.


More information about the mesa-dev mailing list