[Mesa-dev] [PATCH] st/glsl_to_tgsi: drop the merge_registers() pass
Rob Clark
robdclark at gmail.com
Mon Apr 24 21:12:16 UTC 2017
So I guess this is likely to hurt pipe drivers that don't (yet?) have
a real compiler backend (i.e. etnaviv and freedreno/a2xx), so maybe
it should be optional.
Also, I wonder about the pre-LLVM radeon gens, since sb uses the
actual instruction encoding as the IR between tgsi->sb and the backend
opt passes. IIRC they have had problems when the TGSI code uses too
many registers.
BR,
-R
On Mon, Apr 24, 2017 at 5:01 PM, Samuel Pitoiset
<samuel.pitoiset at gmail.com> wrote:
> The main goal of this pass is to merge temporary registers in order
> to reduce the total number of registers and also to produce
> optimal TGSI code.
>
> In fact, compilers seem to be confused when temporary variables
> are already merged, maybe because it's done too early in the
> process.
>
> Removing the pass reduces both the register pressure and the code
> size (TGSI is no longer optimized, but who cares?).
> shader-db results with RadeonSI and Nouveau are interesting.
>
> Nouveau:
>
> total instructions in shared programs : 3931608 -> 3929463 (-0.05%)
> total gprs used in shared programs : 481255 -> 479014 (-0.47%)
> total local used in shared programs : 27481 -> 27381 (-0.36%)
> total bytes used in shared programs : 36031256 -> 36011120 (-0.06%)
>
>                 local        gpr       inst      bytes
>     helped         14       1471       1309       1309
>       hurt          1         88        384        384
>
> RadeonSI:
>
> PERCENTAGE DELTAS   Shaders     SGPRs     VGPRs SpillSGPR SpillVGPR  PrivVGPR   Scratch  CodeSize  MaxWaves     Waits
> ---------------------------------------------------------------------------------------------------------------------
> All affected           4906   -0.31 %   -0.40 %   -2.93 %  -20.00 %         .  -20.00 %   -0.18 %    0.19 %         .
> ---------------------------------------------------------------------------------------------------------------------
> Total                 47109   -0.04 %   -0.05 %   -1.97 %   -7.14 %         .   -0.30 %   -0.03 %    0.02 %         .
>
> Found by luck while fixing an issue in the TGSI dead code elimination
> pass which affects tex instructions with bindless samplers.
>
> Signed-off-by: Samuel Pitoiset <samuel.pitoiset at gmail.com>
> ---
> src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 62 ------------------------------
> 1 file changed, 62 deletions(-)
>
> diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
> index de7fe7837a..d033bdcc5a 100644
> --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
> +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
> @@ -565,7 +565,6 @@ public:
>     int eliminate_dead_code(void);
>
>     void merge_two_dsts(void);
> -   void merge_registers(void);
>     void renumber_registers(void);
>
>     void emit_block_mov(ir_assignment *ir, const struct glsl_type *type,
> @@ -5262,66 +5261,6 @@ glsl_to_tgsi_visitor::merge_two_dsts(void)
>     }
>  }
>
> -/* Merges temporary registers together where possible to reduce the number of
> - * registers needed to run a program.
> - *
> - * Produces optimal code only after copy propagation and dead code elimination
> - * have been run. */
> -void
> -glsl_to_tgsi_visitor::merge_registers(void)
> -{
> -   int *last_reads = rzalloc_array(mem_ctx, int, this->next_temp);
> -   int *first_writes = rzalloc_array(mem_ctx, int, this->next_temp);
> -   struct rename_reg_pair *renames = rzalloc_array(mem_ctx, struct rename_reg_pair, this->next_temp);
> -   int i, j;
> -   int num_renames = 0;
> -
> -   /* Read the indices of the last read and first write to each temp register
> -    * into an array so that we don't have to traverse the instruction list as
> -    * much. */
> -   for (i = 0; i < this->next_temp; i++) {
> -      last_reads[i] = -1;
> -      first_writes[i] = -1;
> -   }
> -   get_last_temp_read_first_temp_write(last_reads, first_writes);
> -
> -   /* Start looking for registers with non-overlapping usages that can be
> -    * merged together. */
> -   for (i = 0; i < this->next_temp; i++) {
> -      /* Don't touch unused registers. */
> -      if (last_reads[i] < 0 || first_writes[i] < 0) continue;
> -
> -      for (j = 0; j < this->next_temp; j++) {
> -         /* Don't touch unused registers. */
> -         if (last_reads[j] < 0 || first_writes[j] < 0) continue;
> -
> -         /* We can merge the two registers if the first write to j is after or
> -          * in the same instruction as the last read from i. Note that the
> -          * register at index i will always be used earlier or at the same time
> -          * as the register at index j. */
> -         if (first_writes[i] <= first_writes[j] &&
> -             last_reads[i] <= first_writes[j]) {
> -            renames[num_renames].old_reg = j;
> -            renames[num_renames].new_reg = i;
> -            num_renames++;
> -
> -            /* Update the first_writes and last_reads arrays with the new
> -             * values for the merged register index, and mark the newly unused
> -             * register index as such. */
> -            assert(last_reads[j] >= last_reads[i]);
> -            last_reads[i] = last_reads[j];
> -            first_writes[j] = -1;
> -            last_reads[j] = -1;
> -         }
> -      }
> -   }
> -
> -   rename_temp_registers(num_renames, renames);
> -   ralloc_free(renames);
> -   ralloc_free(last_reads);
> -   ralloc_free(first_writes);
> -}
> -
>  /* Reassign indices to temporary registers by reusing unused indices created
>   * by optimization passes. */
>  void
> @@ -6712,7 +6651,6 @@ get_mesa_program_tgsi(struct gl_context *ctx,
>     while (v->eliminate_dead_code());
>
>     v->merge_two_dsts();
> -   v->merge_registers();
>     v->renumber_registers();
>
>     /* Write the END instruction. */
> --
> 2.12.2
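
For readers without a Mesa tree at hand, the greedy merge performed by the
removed merge_registers() pass can be sketched as a standalone function. This
is only an illustration under assumptions, not the Mesa code: LiveRange and
merge_live_ranges are hypothetical names, the ralloc arrays are replaced by
std::vector, and the live ranges are passed in directly instead of being
computed by get_last_temp_read_first_temp_write(). The explicit j == i check
is also an addition; the original relies on the range test alone.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical stand-in for one temporary register's usage: the index of
// the instruction that first writes it and of the one that last reads it.
// A value of -1 marks the register as unused, as in the removed pass.
struct LiveRange {
   int first_write;
   int last_read;
};

// Returns (old_reg, new_reg) rename pairs. Takes the ranges by value
// because the loop updates them as registers are merged.
std::vector<std::pair<int, int>>
merge_live_ranges(std::vector<LiveRange> r)
{
   std::vector<std::pair<int, int>> renames;
   const int n = static_cast<int>(r.size());

   for (int i = 0; i < n; i++) {
      /* Don't touch unused registers. */
      if (r[i].last_read < 0 || r[i].first_write < 0)
         continue;

      for (int j = 0; j < n; j++) {
         if (j == i)
            continue; /* added in this sketch only */
         if (r[j].last_read < 0 || r[j].first_write < 0)
            continue;

         /* j can reuse i's index if j is first written at or after the
          * instruction that last reads i. */
         if (r[i].first_write <= r[j].first_write &&
             r[i].last_read <= r[j].first_write) {
            renames.emplace_back(j, i);

            /* The merged register now lives until j's last read, and
             * index j becomes unused. */
            assert(r[j].last_read >= r[i].last_read);
            r[i].last_read = r[j].last_read;
            r[j].first_write = -1;
            r[j].last_read = -1;
         }
      }
   }
   return renames;
}
```

With four temporaries whose (first write, last read) pairs are (0,2), (1,3),
(2,4) and (5,6), the sketch folds temporaries 2 and 3 into temporary 0 and
leaves temporary 1 alone, because it is first written while temporary 0 is
still live.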