[Mesa-dev] [PATCH v3 02/13] mesa/st: glsl_to_tgsi: Split arrays who's elements are only accessed directly
Gert Wollny
gw.fossdev at gmail.com
Tue May 1 16:13:22 UTC 2018
Am Dienstag, den 01.05.2018, 11:57 +0200 schrieb Nicolai Hähnle:
> So the GLSL transforms don't already do this? Interesting... anyway,
> seems a nice improvement,
When I first sent this patch stand-alone there were some comments about
this:
https://patchwork.freedesktop.org/patch/189842/[¹
BTW: as you might have seen Benedikt Schemmer did quite some testing of
the series enabling register-merge on radeonsi. One thing that came up
was failing piglits with bindless textures. From what I currently see
all drivers that supports bindless textures actually skip
register_merge and my assumption was that I should enable this code
with the same restrictions - mainly because I assume that the LLVM
back-end takes care of these things anyway.
But of course I would like to make this code correct with respect to
this feature too, so I was wondering whether it is sufficient to apply
the final register renaming that stems from this code
to glsl_to_tgsi_instruction::resource or whether there is something
more I should take care of? (I don't have the hardware to test this).
many thanks,
Gert
>
> On 28.04.2018 21:30, Gert Wollny wrote:
> > Array who's elements are only accessed directly are replaced by the
> > according number of temporary registers. By doing so the otherwise
> > reserved register range becomes subject to further optimizations
> > like
> > copy propagation and register merging.
> >
> > Thanks to the resulting reduced register pressure this patch makes
> > the piglits
> >
> > spec/glsl-1.50/execution -
> > variable-indexing/vs-output-array-vec3-index-wr-before-gs
> > geometry/max-input-components
> >
> > pass on r600 (barts) where they would fail before with a "GPR limit
> > exceeded"
> > error.
> >
> > v3: * enable this optimization only if the driver requests register
> > merge
> >
> > v2: * rename method dissolve_arrays to split_arrays
> > * unify the tracking and remapping methods for src and st
> > registers
> > * also track access to arrays via reladdr*
> >
> > Signed-off-by: Gert Wollny <gw.fossdev at gmail.com>
> > ---
> > src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 117
> > ++++++++++++++++++++++++++++-
> > 1 file changed, 116 insertions(+), 1 deletion(-)
> >
> > diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
> > b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
> > index 1614367a01..e778807b7c 100644
> > --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
> > +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
> > @@ -369,6 +369,7 @@ public:
> > void copy_propagate(void);
> > int eliminate_dead_code(void);
> >
> > + void split_arrays(void);
> > void merge_two_dsts(void);
> > void merge_registers(void);
> > void renumber_registers(void);
> > @@ -5400,6 +5401,110 @@ glsl_to_tgsi_visitor::merge_two_dsts(void)
> > }
> > }
> >
> > +
> > +
> > +/* One-dimensional arrays who's elements are only accessed
> > directly are
>
> *whose
>
> Also, the placement of this comment is odd, floating like this in
> empty
> space. It looks like it should be a comment on split_arrays.
>
>
> > + * replaced by an according set of temporary registers that then
> > can become
> > + * subject to further optimization steps like copy propagation and
> > + * register merging.
> > + */
> > +
> > +template <typename st_reg>
> > +void test_indirect_access(const st_reg& reg, bool
> > *has_indirect_access)
> > +{
> > + if (reg.file == PROGRAM_ARRAY) {
> > + if (reg.reladdr || reg.reladdr2 || reg.has_index2) {
> > + has_indirect_access[reg.array_id] = true;
> > + if (reg.reladdr)
> > + test_indirect_access(*reg.reladdr,
> > has_indirect_access);
> > + if (reg.reladdr2)
> > + test_indirect_access(*reg.reladdr,
> > has_indirect_access);
> > + }
> > + }
> > +}
> > +
> > +template <typename st_reg>
> > +void remap_array(st_reg& reg, const int *array_remap_info,
> > + const bool *has_indirect_access)
> > +{
> > + if (reg.file == PROGRAM_ARRAY) {
> > + if (!has_indirect_access[reg.array_id]) {
> > + reg.file = PROGRAM_TEMPORARY;
> > + reg.index = reg.index + array_remap_info[reg.array_id];
> > + reg.array_id = 0;
> > + } else {
> > + reg.array_id = array_remap_info[reg.array_id];
> > + }
> > +
> > + if (reg.reladdr)
> > + remap_array(*reg.reladdr, array_remap_info,
> > has_indirect_access);
> > +
> > + if (reg.reladdr2)
> > + remap_array(*reg.reladdr2, array_remap_info,
> > has_indirect_access);
> > + }
> > +}
> > +
> > +void
> > +glsl_to_tgsi_visitor::split_arrays(void)
> > +{
> > + if (!next_array)
> > + return;
> > +
> > + bool *has_indirect_access = rzalloc_array(mem_ctx, bool,
> > next_array + 1);
> > +
> > + foreach_in_list(glsl_to_tgsi_instruction, inst, &this-
> > >instructions) {
> > + for (unsigned j = 0; j < num_inst_src_regs(inst); j++)
> > + test_indirect_access(inst->src[j], has_indirect_access);
> > +
> > + for (unsigned j = 0; j < inst->tex_offset_num_offset; j++)
> > + test_indirect_access(inst->tex_offsets[j],
> > has_indirect_access);
> > +
> > + for (unsigned j = 0; j < num_inst_dst_regs(inst); j++)
> > + test_indirect_access(inst->dst[j], has_indirect_access);
> > + }
> > +
> > + unsigned array_offset = 0;
> > + unsigned n_remaining_arrays = 0;
> > +
> > + /* Double use: For arrays that get disolved this value will
> > contain
>
> *dissolved
>
>
> > + * the base index of the temporary registers this array is
> > replaced
> > + * with. For arrays that remain it contains the new array ID.
> > + */
> > + int *array_remap_info = rzalloc_array(has_indirect_access, int,
> > + next_array + 1);
> > +
> > + for (unsigned i = 1; i <= next_array; ++i) {
> > + if (!has_indirect_access[i]) {
> > + array_remap_info[i] = this->next_temp + array_offset;
> > + array_offset += array_sizes[i-1];
>
> Spaces around operators are generally preferred.
>
>
> > + } else {
> > + array_sizes[n_remaining_arrays] = array_sizes[i-1];
> > + array_remap_info[i] = ++n_remaining_arrays;
> > + }
> > + }
> > +
> > + if (next_array != n_remaining_arrays) {
> > +
>
> Excessive whitespace: both the empty line and after the !=. Also
> twice
> below.
>
> Cheers,
> Nicolai
>
>
> > + foreach_in_list(glsl_to_tgsi_instruction, inst, &this-
> > >instructions) {
> > +
> > + for (unsigned j = 0; j < num_inst_src_regs(inst); j++)
> > + remap_array(inst->src[j], array_remap_info,
> > has_indirect_access);
> > +
> > + for (unsigned j = 0; j < inst->tex_offset_num_offset;
> > j++)
> > + remap_array(inst->tex_offsets[j], array_remap_info,
> > has_indirect_access);
> > +
> > + for (unsigned j = 0; j < num_inst_dst_regs(inst); j++) {
> > + remap_array(inst->dst[j], array_remap_info,
> > has_indirect_access);
> > + }
> > + }
> > + }
> > +
> > + ralloc_free(has_indirect_access);
> > +
> > + this->next_temp += array_offset;
> > + next_array = n_remaining_arrays;
> > +}
> > +
> > /* Merges temporary registers together where possible to reduce
> > the number of
> > * registers needed to run a program.
> > *
> > @@ -6892,8 +6997,18 @@ get_mesa_program_tgsi(struct gl_context
> > *ctx,
> > while (v->eliminate_dead_code());
> >
> > v->merge_two_dsts();
> > - if (!skip_merge_registers)
> > +
> > + if (!skip_merge_registers) {
> > +
> > + v->split_arrays();
> > + v->copy_propagate();
> > + while (v->eliminate_dead_code());
> > +
> > v->merge_registers();
> > + v->copy_propagate();
> > + while (v->eliminate_dead_code());
> > + }
> > +
> > v->renumber_registers();
> >
> > /* Write the END instruction. */
> >
>
>
More information about the mesa-dev
mailing list