[Mesa-dev] [PATCH v2] st/glsl_to_tgsi: fix renumber_registers() in presence of dead code
Nicolai Hähnle
nhaehnle at gmail.com
Thu May 4 09:40:27 UTC 2017
On 04.05.2017 11:04, Samuel Pitoiset wrote:
> The TGSI DCE pass doesn't eliminate dead assignments like
> MOV TEMP[0], TEMP[1] in presence of loops because it assumes
> that the visitor doesn't emit dead code. This assumption is
> actually wrong and this situation happens.
>
> However, it appears that the merge_registers() pass accidentally
> takes care of this for some weird reasons. But since this pass has
> been disabled for RadeonSI and Nouveau, the renumber_registers()
> pass which is called *after*, can't do its job correctly.
>
> This is because it assumes that no dead code is present. But if
> there is still a dead assignment, it might re-use the TEMP
> register id incorrectly and emits wrong code.
>
> This patches fixes the issue by recording writes instead of reads,
> and this has the advantage to be faster.
>
> This should fix Unigine Heaven on RadeonSI and Nouveau.
>
> shader-db results with RadeonSI:
>
> 47109 shaders in 29632 tests
> Totals:
> SGPRS: 1923308 -> 1923316 (0.00 %)
> VGPRS: 1133843 -> 1133847 (0.00 %)
> Spilled SGPRs: 2516 -> 2518 (0.08 %)
> Spilled VGPRs: 65 -> 65 (0.00 %)
> Private memory VGPRs: 1184 -> 1184 (0.00 %)
> Scratch size: 1308 -> 1308 (0.00 %) dwords per thread
> Code Size: 60095968 -> 60096256 (0.00 %) bytes
> LDS: 1077 -> 1077 (0.00 %) blocks
> Max Waves: 431889 -> 431889 (0.00 %)
> Wait states: 0 -> 0 (0.00 %)
>
> It's still interesting to disable the merge_registers() pass.
>
> Signed-off-by: Samuel Pitoiset <samuel.pitoiset at gmail.com>
> ---
> src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 39 ++++++++++++++++++++++++++----
> 1 file changed, 34 insertions(+), 5 deletions(-)
>
> diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
> index 249584d75e..c7ed599a67 100644
> --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
> +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
> @@ -559,6 +559,7 @@ public:
>
> void rename_temp_registers(int num_renames, struct rename_reg_pair *renames);
> void get_first_temp_read(int *first_reads);
> + void get_first_temp_write(int *first_writes);
> void get_last_temp_read_first_temp_write(int *last_reads, int *first_writes);
> void get_last_temp_write(int *last_writes);
>
> @@ -4759,6 +4760,33 @@ glsl_to_tgsi_visitor::rename_temp_registers(int num_renames, struct rename_reg_p
> }
>
> void
> +glsl_to_tgsi_visitor::get_first_temp_write(int *first_writes)
> +{
> + int depth = 0; /* loop depth */
> + int loop_start = -1; /* index of the first active BGNLOOP (if any) */
> + unsigned i = 0, j;
> +
> + foreach_in_list(glsl_to_tgsi_instruction, inst, &this->instructions) {
> + for (j = 0; j < num_inst_dst_regs(inst); j++) {
> + if (inst->dst[j].file == PROGRAM_TEMPORARY) {
> + if (first_writes[inst->dst[j].index] == -1)
> + first_writes[inst->dst[j].index] = (depth == 0) ? i : loop_start;
> + }
> + }
> +
> + if (inst->op == TGSI_OPCODE_BGNLOOP) {
> + if(depth++ == 0)
> + loop_start = i;
> + } else if (inst->op == TGSI_OPCODE_ENDLOOP) {
> + if (--depth == 0)
> + loop_start = -1;
> + }
> + assert(depth >= 0);
> + i++;
> + }
> +}
> +
> +void
> glsl_to_tgsi_visitor::get_first_temp_read(int *first_reads)
> {
> int depth = 0; /* loop depth */
> @@ -5347,16 +5375,17 @@ glsl_to_tgsi_visitor::renumber_registers(void)
> {
> int i = 0;
> int new_index = 0;
> - int *first_reads = rzalloc_array(mem_ctx, int, this->next_temp);
> + int *first_writes = rzalloc_array(mem_ctx, int, this->next_temp);
Since the array is initialized later anyway, you don't need rzalloc here.
The only possibility I see for something going wrong here is when we
have a temporary that is read from but not written to. That would
already lead to undefined behavior today, so... with the ralloc change:
Reviewed-by: Nicolai Hähnle <nicolai.haehnle at amd.com>
> struct rename_reg_pair *renames = rzalloc_array(mem_ctx, struct rename_reg_pair, this->next_temp);
> int num_renames = 0;
> +
> for (i = 0; i < this->next_temp; i++) {
> - first_reads[i] = -1;
> + first_writes[i] = -1;
> }
> - get_first_temp_read(first_reads);
> + get_first_temp_write(first_writes);
>
> for (i = 0; i < this->next_temp; i++) {
> - if (first_reads[i] < 0) continue;
> + if (first_writes[i] < 0) continue;
> if (i != new_index) {
> renames[num_renames].old_reg = i;
> renames[num_renames].new_reg = new_index;
> @@ -5368,7 +5397,7 @@ glsl_to_tgsi_visitor::renumber_registers(void)
> rename_temp_registers(num_renames, renames);
> this->next_temp = new_index;
> ralloc_free(renames);
> - ralloc_free(first_reads);
> + ralloc_free(first_writes);
> }
>
> /* ------------------------- TGSI conversion stuff -------------------------- */
>
--
Lerne, wie die Welt wirklich ist,
Aber vergiss niemals, wie sie sein sollte.
More information about the mesa-dev
mailing list