[Mesa-dev] [PATCH] st/glsl_to_tgsi: drop the merge_registers() pass

Ilia Mirkin imirkin at alum.mit.edu
Tue Apr 25 02:53:40 UTC 2017


It would kill nv30, I believe.

On Apr 24, 2017 10:30 PM, "Roland Scheidegger" <sroland at vmware.com> wrote:

> Am 24.04.2017 um 23:12 schrieb Rob Clark:
> > so I guess this is likely to hurt pipe drivers that don't (yet?)
> > have a real compiler backend.  (Ie. etnaviv and freedreno/a2xx.)  So
> > maybe it should be optional.
> I suppose softpipe, too? Though that's fine, noone cares if it gets a
> bit slower. Might even be nicer for debugging purposes...
>
> Roland
>
>
>
> > Also I wonder about the pre-llvm radeon gen's, since sb uses the
> > actual instruction encoding for IR between tgsi->sb and backend opt
> > passes..  iirc they have had problems when the tgsi code uses too
> > many registers.
> >
> > BR, -R
> >
> > On Mon, Apr 24, 2017 at 5:01 PM, Samuel Pitoiset
> > <samuel.pitoiset at gmail.com> wrote:
> >> The main goal of this pass to merge temporary registers in order to
> >> reduce the total number of registers and also to produce optimal
> >> TGSI code.
> >>
> >> In fact, compilers seem to be confused when temporary variables are
> >> already merged, maybe because it's done too early in the process.
> >>
> >> Removing the pass, reduce both the register pressure and the code
> >> size (TGSI is no longer optimized, but who cares?). shader-db
> >> results with RadeonSI and Nouveau are interesting.
> >>
> >> Nouveau:
> >>
> >> total instructions in shared programs : 3931608 -> 3929463
> >> (-0.05%) total gprs used in shared programs    : 481255 -> 479014
> >> (-0.47%) total local used in shared programs   : 27481 -> 27381
> >> (-0.36%) total bytes used in shared programs   : 36031256 ->
> >> 36011120 (-0.06%)
> >>
> >> local        gpr       inst      bytes helped          14
> >> 1471        1309        1309 hurt           1          88
> >> 384         384
> >>
> >> RadeonSI:
> >>
> >> PERCENTAGE DELTAS    Shaders     SGPRs     VGPRs SpillSGPR
> >> SpillVGPR  PrivVGPR   Scratch  CodeSize  MaxWaves     Waits
> >> ------------------------------------------------------------
> ----------------------------------------------------------
> >>
> >>
> All affected            4906   -0.31 %   -0.40 %   -2.93 %  -20.00 %
> .      -20.00 %   -0.18 %    0.19 %     .
> >> ------------------------------------------------------------
> ----------------------------------------------------------
> >>
> >>
> Total                  47109   -0.04 %   -0.05 %   -1.97 %   -7.14 %
> .       -0.30 %   -0.03 %    0.02 %     .
> >>
> >> Found by luck while fixing an issue in the TGSI dead code
> >> elimination pass which affects tex instructions with bindless
> >> samplers.
> >>
> >> Signed-off-by: Samuel Pitoiset <samuel.pitoiset at gmail.com> ---
> >> src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 62
> >> ------------------------------ 1 file changed, 62 deletions(-)
> >>
> >> diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
> >> b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp index
> >> de7fe7837a..d033bdcc5a 100644 ---
> >> a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp +++
> >> b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp @@ -565,7 +565,6 @@
> >> public: int eliminate_dead_code(void);
> >>
> >> void merge_two_dsts(void); -   void merge_registers(void); void
> >> renumber_registers(void);
> >>
> >> void emit_block_mov(ir_assignment *ir, const struct glsl_type
> >> *type, @@ -5262,66 +5261,6 @@
> >> glsl_to_tgsi_visitor::merge_two_dsts(void) } }
> >>
> >> -/* Merges temporary registers together where possible to reduce
> >> the number of - * registers needed to run a program. - * - *
> >> Produces optimal code only after copy propagation and dead code
> >> elimination - * have been run. */ -void
> >> -glsl_to_tgsi_visitor::merge_registers(void) -{ -   int *last_reads
> >> = rzalloc_array(mem_ctx, int, this->next_temp); -   int
> >> *first_writes = rzalloc_array(mem_ctx, int, this->next_temp); -
> >> struct rename_reg_pair *renames = rzalloc_array(mem_ctx, struct
> >> rename_reg_pair, this->next_temp); -   int i, j; -   int
> >> num_renames = 0; - -   /* Read the indices of the last read and
> >> first write to each temp register -    * into an array so that we
> >> don't have to traverse the instruction list as -    * much. */ -
> >> for (i = 0; i < this->next_temp; i++) { -      last_reads[i] = -1;
> >> -      first_writes[i] = -1; -   } -
> >> get_last_temp_read_first_temp_write(last_reads, first_writes); - -
> >> /* Start looking for registers with non-overlapping usages that can
> >> be -    * merged together. */ -   for (i = 0; i < this->next_temp;
> >> i++) { -      /* Don't touch unused registers. */ -      if
> >> (last_reads[i] < 0 || first_writes[i] < 0) continue; - -      for
> >> (j = 0; j < this->next_temp; j++) { -         /* Don't touch unused
> >> registers. */ -         if (last_reads[j] < 0 || first_writes[j] <
> >> 0) continue; - -         /* We can merge the two registers if the
> >> first write to j is after or -          * in the same instruction
> >> as the last read from i.  Note that the -          * register at
> >> index i will always be used earlier or at the same time -
> >> * as the register at index j. */ -         if (first_writes[i] <=
> >> first_writes[j] && -             last_reads[i] <= first_writes[j])
> >> { -            renames[num_renames].old_reg = j; -
> >> renames[num_renames].new_reg = i; -            num_renames++; - -
> >> /* Update the first_writes and last_reads arrays with the new -
> >> * values for the merged register index, and mark the newly unused -
> >> * register index as such. */ -            assert(last_reads[j] >=
> >> last_reads[i]); -            last_reads[i] = last_reads[j]; -
> >> first_writes[j] = -1; -            last_reads[j] = -1; -         }
> >> -      } -   } - -   rename_temp_registers(num_renames, renames); -
> >> ralloc_free(renames); -   ralloc_free(last_reads); -
> >> ralloc_free(first_writes); -} - /* Reassign indices to temporary
> >> registers by reusing unused indices created * by optimization
> >> passes. */ void @@ -6712,7 +6651,6 @@ get_mesa_program_tgsi(struct
> >> gl_context *ctx, while (v->eliminate_dead_code());
> >>
> >> v->merge_two_dsts(); -   v->merge_registers();
> >> v->renumber_registers();
> >>
> >> /* Write the END instruction. */ -- 2.12.2
> >>
> >> _______________________________________________ mesa-dev mailing
> >> list mesa-dev at lists.freedesktop.org
> >> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
> > _______________________________________________ mesa-dev mailing
> > list mesa-dev at lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
> >
>
> _______________________________________________
> mesa-dev mailing list
> mesa-dev at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.freedesktop.org/archives/mesa-dev/attachments/20170424/4d6e9b90/attachment-0001.html>


More information about the mesa-dev mailing list