[Mesa-dev] [PATCH 0/3] [RFC] mesa/st: glsl_to_tgsi: improved temp-reg lifetime estimation
Gert Wollny
gw.fossdev at gmail.com
Tue Jun 13 08:01:28 UTC 2017
Am Montag, den 12.06.2017, 21:00 +0200 schrieb Nicolai Hähnle:
Thanks for you comments, although I do not agree with most of them.
>
> > Okay. I think you should seriously re-think your algorithm in a way
> > that makes it a more natural evolution from the algorithm that's
> > already there.
Well, when I look at the current algorithm than I don't really see much
to evolve from: The tracking of loops is minimal and it actually only
uses the outer loop to assign life times.
In any case, 80% of this code I already re-used (i.e. the loop over the
instructions and iterating over the registers).
> > Basically, the current algorithm tracks (first_write, last_read),
> > so think about what you need to track in order to obtain a single-
> > pass algorithm that computes lifetime (first, last) for every
> > temporary. I
IMHO a single pass algorithm is not better then what I do now.
Especially with nested loops a single pass algorithm will be more
complicated.
Think e.g.
...
BEGIN_LOOP
...
BEGIN_LOOP
a = ...
b = ...
c = a OP b;
END_LOOP
... /* more processing that doesn't access a, b, or c */
END_LOOP
out = f(a, c, ...)
END
Here, a and c must be kept alive for both loops, but b only is
needed for a few instructions. However, in a single pass state tracker
I have to keep all the information for a, b, and c until the end of the
program, because only then I can discard the loop information for b,
and resolve everything else for a and c.
With the approach I am using, i.e. only collecting all the access
information in the pass over the program, and resolving the life-times
at the end by re-playing the temp-access timeline, the above can be
achieved with less hassle, because I don't need to track per temporary
information for all the temporaries all the time, instead, I only need
to resolve loops and scopes when it is really needed.
[snip]
To me the algorithm you lined out looks quite complicated, but not too
different from what I'm doing when I replay the access time-line of a
register. However, with your approach one has to track the state of
each variable all the time, information that could be shared otherwise
and might not even needed.
[snip]
> >
> > I hope I didn't miss anything, because after all this is
> > admittedly subtle stuff.
It is, and this is why I think it is better to separate the steps into
manageable chunks that can be put under test. (I admit, I'm a fan of
test driven development, and for that I also think it is more important
to write test cases instead of sketching out algorithms).
> > Still, I think this kind of state-machine approach should
> > work and allow you to avoid *lots* of allocations and pointer-
> > chasing.
The allocations I can (and will) get rid of, but I don't see some
pointer-chasing as a problem, since it is encapsulated within class
methods.
I thank you for your comments, but given that my code is working I
don't see that re-doing it from scratch is such a good idea. I think
refactoring it to eliminate the allocations and covering additional
test cases is a better approach. If this makes it possible to move the
implementation to be single pass, then I might consider it, but I think
tracking all the information for all temporaries all the time is not
such a good idea, especially for large shaders that might have 2000+
temporaries before register merging.
Best,
Gert
More information about the mesa-dev
mailing list