[Mesa-dev] [PATCH] st/glsl_to_tgsi: drop the merge_registers() pass
Roland Scheidegger
sroland at vmware.com
Tue Apr 25 02:29:44 UTC 2017
On 24.04.2017 at 23:12, Rob Clark wrote:
> so I guess this is likely to hurt pipe drivers that don't (yet?)
> have a real compiler backend. (Ie. etnaviv and freedreno/a2xx.) So
> maybe it should be optional.
I suppose softpipe, too? Though that's fine, no one cares if it gets a
bit slower. Might even be nicer for debugging purposes...
Roland
> Also I wonder about the pre-llvm radeon gen's, since sb uses the
> actual instruction encoding for IR between tgsi->sb and backend opt
> passes.. iirc they have had problems when the tgsi code uses too
> many registers.
>
> BR, -R
>
> On Mon, Apr 24, 2017 at 5:01 PM, Samuel Pitoiset
> <samuel.pitoiset at gmail.com> wrote:
>> The main goal of this pass is to merge temporary registers in order to
>> reduce the total number of registers and also to produce optimal
>> TGSI code.
>>
>> In fact, compilers seem to be confused when temporary variables are
>> already merged, maybe because it's done too early in the process.
>>
>> Removing the pass reduces both the register pressure and the code
>> size (TGSI is no longer optimized, but who cares?). shader-db
>> results with RadeonSI and Nouveau are interesting.
>>
>> Nouveau:
>>
>> total instructions in shared programs : 3931608 -> 3929463 (-0.05%)
>> total gprs used in shared programs    : 481255 -> 479014 (-0.47%)
>> total local used in shared programs   : 27481 -> 27381 (-0.36%)
>> total bytes used in shared programs   : 36031256 -> 36011120 (-0.06%)
>>
>>                 local        gpr       inst      bytes
>>     helped          14       1471       1309       1309
>>       hurt           1         88        384        384
>>
>> RadeonSI:
>>
>> PERCENTAGE DELTAS    Shaders     SGPRs     VGPRs SpillSGPR SpillVGPR  PrivVGPR   Scratch  CodeSize  MaxWaves     Waits
>> ----------------------------------------------------------------------------------------------------------------------
>> All affected            4906   -0.31 %   -0.40 %   -2.93 %  -20.00 %         .  -20.00 %   -0.18 %    0.19 %         .
>> ----------------------------------------------------------------------------------------------------------------------
>> Total                  47109   -0.04 %   -0.05 %   -1.97 %   -7.14 %         .   -0.30 %   -0.03 %    0.02 %         .
>>
>> Found by luck while fixing an issue in the TGSI dead code
>> elimination pass which affects tex instructions with bindless
>> samplers.
>>
>> Signed-off-by: Samuel Pitoiset <samuel.pitoiset at gmail.com>
>> ---
>>  src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 62 ------------------------------
>>  1 file changed, 62 deletions(-)
>>
>> diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
>> index de7fe7837a..d033bdcc5a 100644
>> --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
>> +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
>> @@ -565,7 +565,6 @@ public:
>>     int eliminate_dead_code(void);
>>
>>     void merge_two_dsts(void);
>> -   void merge_registers(void);
>>     void renumber_registers(void);
>>
>>     void emit_block_mov(ir_assignment *ir, const struct glsl_type *type,
>> @@ -5262,66 +5261,6 @@ glsl_to_tgsi_visitor::merge_two_dsts(void)
>>     }
>>  }
>>
>> -/* Merges temporary registers together where possible to reduce the number of
>> - * registers needed to run a program.
>> - *
>> - * Produces optimal code only after copy propagation and dead code elimination
>> - * have been run. */
>> -void
>> -glsl_to_tgsi_visitor::merge_registers(void)
>> -{
>> -   int *last_reads = rzalloc_array(mem_ctx, int, this->next_temp);
>> -   int *first_writes = rzalloc_array(mem_ctx, int, this->next_temp);
>> -   struct rename_reg_pair *renames = rzalloc_array(mem_ctx, struct rename_reg_pair, this->next_temp);
>> -   int i, j;
>> -   int num_renames = 0;
>> -
>> -   /* Read the indices of the last read and first write to each temp register
>> -    * into an array so that we don't have to traverse the instruction list as
>> -    * much. */
>> -   for (i = 0; i < this->next_temp; i++) {
>> -      last_reads[i] = -1;
>> -      first_writes[i] = -1;
>> -   }
>> -   get_last_temp_read_first_temp_write(last_reads, first_writes);
>> -
>> -   /* Start looking for registers with non-overlapping usages that can be
>> -    * merged together. */
>> -   for (i = 0; i < this->next_temp; i++) {
>> -      /* Don't touch unused registers. */
>> -      if (last_reads[i] < 0 || first_writes[i] < 0) continue;
>> -
>> -      for (j = 0; j < this->next_temp; j++) {
>> -         /* Don't touch unused registers. */
>> -         if (last_reads[j] < 0 || first_writes[j] < 0) continue;
>> -
>> -         /* We can merge the two registers if the first write to j is after or
>> -          * in the same instruction as the last read from i.  Note that the
>> -          * register at index i will always be used earlier or at the same time
>> -          * as the register at index j. */
>> -         if (first_writes[i] <= first_writes[j] &&
>> -             last_reads[i] <= first_writes[j]) {
>> -            renames[num_renames].old_reg = j;
>> -            renames[num_renames].new_reg = i;
>> -            num_renames++;
>> -
>> -            /* Update the first_writes and last_reads arrays with the new
>> -             * values for the merged register index, and mark the newly unused
>> -             * register index as such. */
>> -            assert(last_reads[j] >= last_reads[i]);
>> -            last_reads[i] = last_reads[j];
>> -            first_writes[j] = -1;
>> -            last_reads[j] = -1;
>> -         }
>> -      }
>> -   }
>> -
>> -   rename_temp_registers(num_renames, renames);
>> -   ralloc_free(renames);
>> -   ralloc_free(last_reads);
>> -   ralloc_free(first_writes);
>> -}
>> -
>>  /* Reassign indices to temporary registers by reusing unused indices created
>>   * by optimization passes. */
>>  void
>> @@ -6712,7 +6651,6 @@ get_mesa_program_tgsi(struct gl_context *ctx,
>>     while (v->eliminate_dead_code());
>>
>>     v->merge_two_dsts();
>> -   v->merge_registers();
>>     v->renumber_registers();
>>
>>     /* Write the END instruction. */
>> --
>> 2.12.2
>>
>> _______________________________________________
>> mesa-dev mailing list
>> mesa-dev at lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
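
For readers who want to try the merge rule outside Mesa, here is a minimal standalone sketch of the loop in the removed merge_registers(). It is not the Mesa code: RenamePair and merge_live_ranges are hypothetical names, std::vector replaces the ralloc arrays, the caller supplies the first-write and last-read indices directly instead of get_last_temp_read_first_temp_write(), and the i == j guard is an addition the original loop does not have.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for Mesa's rename_reg_pair (illustration only).
struct RenamePair {
   int old_reg;
   int new_reg;
};

// Greedily merges temporaries whose usages do not overlap.
// first_writes[r] and last_reads[r] are instruction indices, or -1 when
// register r is unused. Both vectors are updated in place, like the
// arrays in the removed pass.
std::vector<RenamePair>
merge_live_ranges(std::vector<int> &first_writes, std::vector<int> &last_reads)
{
   assert(first_writes.size() == last_reads.size());
   const int n = static_cast<int>(first_writes.size());
   std::vector<RenamePair> renames;

   for (int i = 0; i < n; i++) {
      /* Don't touch unused registers. */
      if (last_reads[i] < 0 || first_writes[i] < 0)
         continue;

      for (int j = 0; j < n; j++) {
         /* Assumption of this sketch: never merge a register with itself. */
         if (j == i)
            continue;
         if (last_reads[j] < 0 || first_writes[j] < 0)
            continue;

         /* j can reuse i when j is first written at or after the last
          * read of i, and i starts no later than j. */
         if (first_writes[i] <= first_writes[j] &&
             last_reads[i] <= first_writes[j]) {
            renames.push_back({j, i});

            /* i now covers j's usage as well; j becomes unused. */
            assert(last_reads[j] >= last_reads[i]);
            last_reads[i] = last_reads[j];
            first_writes[j] = -1;
            last_reads[j] = -1;
         }
      }
   }
   return renames;
}
```

With four temporaries whose first writes are {0, 1, 2, 5} and last reads are {2, 3, 4, 6}, the sketch folds registers 2 and 3 into register 0 and leaves register 1 alone, so two registers remain live instead of four. Because the merge is greedy and keyed only on first write and last read, it extends register 0 across the whole program, which is the kind of early merging the commit message suspects of confusing the backend compilers.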