[Mesa-dev] [PATCH v9 0/7] mesa/st: glsl_to_tgsi: refined register merge algorithm

Dieter Nützel Dieter at nuetzel-hh.de
Wed Sep 6 15:00:39 UTC 2017


Am 06.09.2017 11:53, schrieb Nicolai Hähnle:
> I finally went over this again, and I think it's good enough to go in.
> R-b and pushed!
> 
> Thanks for making all the adjustments and your patience :)
> 
> I did notice some minor things, I'm about to send out patches to clean
> things up afterwards.
> 
> Cheers,
> Nicolai

Wow,

this IS GREAT news!
But you've lost all my T-b ... ;-)))

What I've found and didn't mailed before is, that I see ~50 KB smaller 
size off the *dr[i|v].so files with Gert's nice stuff.

Thank you very much for this, Gert!

I'll swap my Turks XT/6670 back in and test LS2015 on it again, too 
verify if the missing textures (people) appear, now.

Greetings,
Dieter


> On 24.08.2017 19:38, Gert Wollny wrote:
>> Dear all,
>> 
>> I thought I might send out this patch another time with its full 
>> history and
>> freshly rebased. All the changes that I applied were a result of 
>> reviews by
>> Nicolai (mostly) and Emil (thanks again to both of you).
>> 
>> The set is mirroed at
>>     https://github.com/gerddie/mesa/tree/regrename-v9
>> 
>> The patch fixes a series of bugs where shader compilation fails with
>>    "translation from TGSI failed!"
>> Among these are
>>    * https://bugs.freedesktop.org/show_bug.cgi?id=65448 which
>>      I can confirm will be fixed for R600_DEBUG=nosb set (with sb 
>> enabled it will
>>      fail with a failing assertion in the sb code).
>>    * According to a user report against v5, the patch also fixes 
>> #99349
>> 
>> I can also confirm that the patch fixes the Piano and Voloplosion 
>> benchmarks
>> implemented in gputest on BARTS (r600g).
>> 
>> The patch has no significant impact on runtime - not taking Dave's 
>> patch into
>> account that in itself reduces the register renaming run-time for 
>> shaders
>> with a large numbers of temporary registers.
>> 
>> The patch doesn't introduce piglit regression (I tested the shader 
>> subset).
>> spec at glsl-1.50@execution at variable-indexing@gs-input-array-vec2-index-rd 
>> is
>> fixed though.
>> 
>> The algorithm works like follows:
>> - first the program is scanned, the loops, switch and if/else scopes 
>> are
>>    collected and for each temporary first and last reads and writes 
>> and the
>>    according scopes are collected, and it is recorded whether a 
>> variable is
>>    written conditionally, and whether loops have continue or break 
>> statements.
>> - then after the whole program has been scanned, the life times are 
>> estimated
>>    by merging the read and write scopes for each temporary on a per 
>> component
>>    bases,
>> - the life-times of the cmponents are merged,
>> - the register mapping is evaluated, and
>> - the mapping is applied with the rename_temp_registers method 
>> implemented
>>    by Dave.
>> 
>> I've used the patches for quite some time now and so far I didn't 
>> encounter
>> any problems, many thanks for any comments,
>>   Gert
>> 
>> Patch history:
>> 
>> v2:* significantly cut down on the memory allocations,
>>     * expose only a minimal interface to register lifetime estimation 
>> and
>>       calculating the rename table,
>>     v3: was broken and v4 restarted from v2
>> 
>> v4:* split the changes into more patches
>>     * correct formatting errors,
>>     * remove the use of the STL with one exception though:
>>       since in st_glsl_to_tgsi.cpp std::sort is already used and its 
>> run-time
>>       performance is significantly better than qsort. It is used in 
>> the register
>>       rename mapping evaluation. It can be disabled by commenting out 
>> the define
>>       USE_STL_SORT in st_glsl_to_tgsi_temprename.cpp.
>>     * add more tests and improve the life-time evaluation accordingly,
>>     * further reduce memory allocations,
>>     * rename functions and methods to better clarify what they are 
>> used for,
>>     * remove unused methods and variables in prog_scope,
>>     * eliminate the class tgsi_temp_lifetime,
>>     * no longer require C++11 for the core library code, however, the 
>> tests
>>       make use of C++11 and the STL
>> 
>> v5: * correct formatting following Emil's suggetions
>>      * remove un-needed libraries for the tests
>> 
>> v6:* the components are now tracked individually and the life time of 
>> a temporary
>>       is evaluated by merging the life-times of their components,
>>     * BRK/CONT are now handled separately,
>>     * the final algorithm to evaluate the life-times was simplified,
>>     * read and write in the same instruction is now considered to be 
>> always
>>       well defined,
>>     * adherence to the coding stile was improved,
>>     * the case scope level is now below the according switch scope 
>> level,
>>     * the new register merge method replaces the old version, i.e. no 
>> environment
>>       variables to switch between implementations. In theory, one 
>> could also
>>       remove the function get_last_temp_read_first_temp_write, but is 
>> is still
>>       used in some code in a #define 0 block, so I didn't touch it.
>>     * when compiled in debug mode and with the environment variable
>>       GLSL_TO_TGSI_RENAME_DEBUG specified the TGSI and resulting 
>> register
>>       lifetimes will be dumped to stderr.Here Nicolai suggested to use
>>       _mesa_register_file_name instead of my hand-backed array of 
>> strings,
>>       but currently that function doesn't write out all the needed 
>> names,
>>       so I thought it might be better to address this in another patch 
>> that
>>       also extends _mesa_register_file_name,
>>     * unused registers are now ignored in the rename mapping 
>> evaluation,
>>     * registers that are only read get a life-time {x,x}, with x the 
>> instruction
>>       line were the register is last read, so they can be merged,
>>     * the patch has been rebased against 7d7bcd65d
>> 
>> v7:* Correct documentation of GLSL_TO_TGSI_RENAME_DEBUG in commit 
>> message.
>>     * fix typos, include files, formatting, and some variable names,
>>     * add anonymous namespace around classes,
>>     * replace debug_log singleton by a function,
>>     * cleanup the switch-case scope creation,
>>     * track switch variable only for SWITCH statement,
>>     * add test case with switch in loop that has a case where the 
>> register is
>>       written, then falls through where it is read and correct code 
>> accordingly,
>>       specifically, this showed that Nicolai was right that the scope 
>> of the
>>       case that falls through must not be extended,
>>     * add test case for more then one break in loop and make sure the 
>> first break
>>       is used for life-time tracking,
>>     * simplify access flag tracking and merging,
>>     * drop special handling of registers that are only read, i.e. 
>> treat them like
>>       unused registers since renumber_registers will do this anyway,
>>     * simplify the enclosing scope resolution,
>>     * handle TEMP[0] just like the others; somehow I had the idea that 
>> TEMP[0] is
>>       special and should not be touched, and the debug output running 
>> various
>>       programs and also from piglit shaders doesn't show TEMP[0] as 
>> used,
>>     * optimize renaming evaluation by only copying used registers to 
>> the
>>       reg_access array that is used for the evaluation,
>> 
>> 
>> v8:* added Dave Airlie's patch for register renaming slightly changed 
>> to
>>       account for the inst->resource_index since the reminder of the 
>> patch set
>>       makes use of his renaming methods. Since my mapping evaluator 
>> doesn't
>>       require  a recursive renaming strategy, the problem that made 
>> Dave
>>       revert this patch does not occure after applying the series,
>> 
>> v9:* add check to handle op-codes TGSI_OPCODE_CAL and TGSI_OPCODE_RET, 
>> i.e.
>>       if these opcondes are encountered that the register lifetime 
>> estimation
>>       will bailout and signal that no renaming can take place (because 
>>  subroutines
>>       are not tracked),
>>     * add array-id to debug output for PROGRAM_ARRAY,
>>     * rebase to 39f3e2507c6
>> 
>> Dave Airlie (1):
>>    st_glsl_to_tgsi: rewrite rename registers to use array fully.
>> 
>> Gert Wollny (6):
>>    mesa/st: glsl_to_tgsi move some helper classes to extra files
>>    mesa/st: glsl_to_tgsi: implement new temporary register lifetime
>>      tracker
>>    mesa/st: glsl_to_tgsi: add tests for the new temporary lifetime
>>      tracker
>>    mesa/st: glsl_to_tgsi: add register rename mapping evaluator
>>    mesa/st: glsl_to_tgsi: Add test set for evaluation of rename 
>> mapping
>>    mesa/st: glsl_to_tgsi: tie in new temporary register merge approach
>> 
>>   configure.ac                                       |    1 +
>>   src/mesa/Makefile.am                               |    2 +-
>>   src/mesa/Makefile.sources                          |    4 +
>>   src/mesa/state_tracker/st_glsl_to_tgsi.cpp         |  400 +----
>>   src/mesa/state_tracker/st_glsl_to_tgsi_private.cpp |  196 +++
>>   src/mesa/state_tracker/st_glsl_to_tgsi_private.h   |  168 ++
>>   .../state_tracker/st_glsl_to_tgsi_temprename.cpp   | 1008 
>> ++++++++++++
>>   .../state_tracker/st_glsl_to_tgsi_temprename.h     |   71 +
>>   src/mesa/state_tracker/tests/Makefile.am           |   36 +
>>   .../tests/test_glsl_to_tgsi_lifetime.cpp           | 1600 
>> ++++++++++++++++++++
>>   10 files changed, 3124 insertions(+), 362 deletions(-)
>>   create mode 100644 
>> src/mesa/state_tracker/st_glsl_to_tgsi_private.cpp
>>   create mode 100644 src/mesa/state_tracker/st_glsl_to_tgsi_private.h
>>   create mode 100644 
>> src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp
>>   create mode 100644 
>> src/mesa/state_tracker/st_glsl_to_tgsi_temprename.h
>>   create mode 100644 src/mesa/state_tracker/tests/Makefile.am
>>   create mode 100644 
>> src/mesa/state_tracker/tests/test_glsl_to_tgsi_lifetime.cpp
>> 


More information about the mesa-dev mailing list