Mesa (master): aco: fix max_waves_per_simd on Polaris, VegaM and GFX10.3

GitLab Mirror gitlab-mirror at kemper.freedesktop.org
Tue Aug 4 22:04:28 UTC 2020


Module: Mesa
Branch: master
Commit: 37988b5b8ed05a7425d615d38dcc6243cf47036e
URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=37988b5b8ed05a7425d615d38dcc6243cf47036e

Author: Rhys Perry <pendingchaos02 at gmail.com>
Date:   Thu Jun 18 14:30:51 2020 +0100

aco: fix max_waves_per_simd on Polaris, VegaM and GFX10.3

fossil-db (Polaris):
Totals from 20263 (14.75% of 137414) affected shaders:
SGPRs: 871407 -> 871679 (+0.03%); split: -0.00%, +0.03%
VGPRs: 513828 -> 550028 (+7.05%); split: -1.68%, +8.72%
CodeSize: 18869680 -> 18828148 (-0.22%); split: -0.23%, +0.01%
MaxWaves: 162012 -> 162030 (+0.01%); split: +0.01%, -0.00%
Instrs: 3629172 -> 3618817 (-0.29%); split: -0.30%, +0.02%
Cycles: 15682244 -> 15638244 (-0.28%); split: -0.30%, +0.02%
VMEM: 10675942 -> 10673344 (-0.02%); split: +0.18%, -0.21%
SMEM: 1209717 -> 1206088 (-0.30%); split: +0.03%, -0.33%
VClause: 81780 -> 81227 (-0.68%); split: -0.73%, +0.06%
SClause: 231724 -> 231561 (-0.07%); split: -0.07%, +0.00%
Copies: 187126 -> 180831 (-3.36%); split: -3.62%, +0.26%
Branches: 26841 -> 26837 (-0.01%); split: -0.03%, +0.01%

Signed-off-by: Rhys Perry <pendingchaos02 at gmail.com>
Reviewed-by: Daniel Schürmann <daniel at schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5546>

---
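
Note: the occupancy cap added in the aco_live_var_analysis.cpp hunk can be read in isolation as the small function sketched here. This is only an illustration. The `Family` and `ChipClass` enums are simplified stand-ins invented for this sketch, not the real Mesa definitions, and the helper name `max_waves_per_simd_for` is hypothetical. The one assumption carried over from the patch is that the Polaris and VegaM families are contiguous in the family enum, which is what the range check relies on.

```cpp
// Stand-in enums for illustration only; the real Mesa enums have many more
// entries. Only the relative ordering matters for the range check below.
enum Family { FAM_FIJI, FAM_POLARIS10, FAM_POLARIS11, FAM_POLARIS12, FAM_VEGAM, FAM_VEGA10 };
enum ChipClass { CLS_GFX8, CLS_GFX9, CLS_GFX10, CLS_GFX10_3 };

// Hypothetical helper mirroring the condition added by the patch:
// the default is 10 waves per SIMD, lowered to 8 on Polaris, VegaM and GFX10.3+.
unsigned max_waves_per_simd_for(Family family, ChipClass chip_class)
{
   unsigned max_waves_per_simd = 10;
   if ((family >= FAM_POLARIS10 && family <= FAM_VEGAM) || chip_class >= CLS_GFX10_3)
      max_waves_per_simd = 8;
   return max_waves_per_simd;
}
```

With these stand-in values, a Polaris10 part on GFX8 yields 8, while Fiji on GFX8 or Vega10 on GFX9 keeps the default of 10.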

 src/amd/compiler/aco_live_var_analysis.cpp |  2 ++
 src/amd/compiler/aco_scheduler.cpp         | 14 +++++++++-----
 2 files changed, 11 insertions(+), 5 deletions(-)

diff --git a/src/amd/compiler/aco_live_var_analysis.cpp b/src/amd/compiler/aco_live_var_analysis.cpp
index 0378dbaf335..08f2d68c80a 100644
--- a/src/amd/compiler/aco_live_var_analysis.cpp
+++ b/src/amd/compiler/aco_live_var_analysis.cpp
@@ -337,6 +337,8 @@ void update_vgpr_sgpr_demand(Program* program, const RegisterDemand new_demand)
 {
    /* TODO: max_waves_per_simd, simd_per_cu and the number of physical vgprs for Navi */
    unsigned max_waves_per_simd = 10;
+   if ((program->family >= CHIP_POLARIS10 && program->family <= CHIP_VEGAM) || program->chip_class >= GFX10_3)
+      max_waves_per_simd = 8;
    unsigned simd_per_cu = 4;
 
    bool wgp = program->chip_class >= GFX10; /* assume WGP is used on Navi */
diff --git a/src/amd/compiler/aco_scheduler.cpp b/src/amd/compiler/aco_scheduler.cpp
index 102f0bf3ee6..40941e4c539 100644
--- a/src/amd/compiler/aco_scheduler.cpp
+++ b/src/amd/compiler/aco_scheduler.cpp
@@ -882,6 +882,11 @@ void schedule_block(sched_ctx& ctx, Program *program, Block* block, live& live_v
 
 void schedule_program(Program *program, live& live_vars)
 {
+   /* don't use program->max_reg_demand because that is affected by max_waves_per_simd */
+   RegisterDemand demand;
+   for (Block& block : program->blocks)
+      demand.update(block.register_demand);
+
    sched_ctx ctx;
    ctx.mv.depends_on.resize(program->peekAllocationId());
    ctx.mv.RAR_dependencies.resize(program->peekAllocationId());
@@ -891,15 +896,14 @@ void schedule_program(Program *program, live& live_vars)
     * seem to hurt anything else. */
    if (program->num_waves <= 5)
       ctx.num_waves = program->num_waves;
-   else if (program->max_reg_demand.vgpr >= 32)
+   else if (demand.vgpr >= 29)
       ctx.num_waves = 5;
-   else if (program->max_reg_demand.vgpr >= 28)
+   else if (demand.vgpr >= 25)
       ctx.num_waves = 6;
-   else if (program->max_reg_demand.vgpr >= 24)
-      ctx.num_waves = 7;
    else
-      ctx.num_waves = 8;
+      ctx.num_waves = 7;
    ctx.num_waves = std::max<uint16_t>(ctx.num_waves, program->min_waves);
+   ctx.num_waves = std::min<uint16_t>(ctx.num_waves, program->max_waves);
 
    assert(ctx.num_waves > 0 && ctx.num_waves <= program->num_waves);
    ctx.mv.max_registers = { int16_t(get_addr_vgpr_from_waves(program, ctx.num_waves) - 2),
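
For readers skimming the scheduler hunk, the post-patch choice of the scheduler's target wave count can be sketched as a standalone function. This is a rough reading of the diff, not the actual ACO code: `Demand` is a stand-in for `RegisterDemand`, its `update` is assumed to take a component-wise maximum, and the function parameters stand in for the `program->num_waves`, `program->min_waves` and `program->max_waves` fields.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Stand-in for RegisterDemand; update() is ASSUMED to be a component-wise max.
struct Demand {
   int16_t vgpr = 0;
   int16_t sgpr = 0;
   void update(const Demand& other)
   {
      vgpr = std::max(vgpr, other.vgpr);
      sgpr = std::max(sgpr, other.sgpr);
   }
};

// Peak demand over all blocks, computed directly from per-block demand
// instead of reusing a program-wide value tied to max_waves_per_simd.
Demand peak_demand(const std::vector<Demand>& block_demands)
{
   Demand demand;
   for (const Demand& d : block_demands)
      demand.update(d);
   return demand;
}

// Hypothetical helper following the thresholds in the patched code.
uint16_t scheduler_target_waves(uint16_t num_waves, int16_t vgpr_demand,
                                uint16_t min_waves, uint16_t max_waves)
{
   uint16_t target;
   if (num_waves <= 5)
      target = num_waves;
   else if (vgpr_demand >= 29)
      target = 5;
   else if (vgpr_demand >= 25)
      target = 6;
   else
      target = 7;
   target = std::max<uint16_t>(target, min_waves); /* never below the minimum */
   target = std::min<uint16_t>(target, max_waves); /* new upper clamp from the patch */
   return target;
}
```

The order of the two clamps matters when `min_waves` exceeds `max_waves`: as in the patch, the upper clamp is applied last and therefore wins.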


