Mesa (staging/20.2): radeonsi: don't count unusable vertices to the NGG LDS size
GitLab Mirror
gitlab-mirror at kemper.freedesktop.org
Fri Aug 7 17:46:08 UTC 2020
Module: Mesa
Branch: staging/20.2
Commit: 3bf0368f9ea2e7764a22c75f1944cbbd07076681
URL: http://cgit.freedesktop.org/mesa/mesa/commit/?id=3bf0368f9ea2e7764a22c75f1944cbbd07076681
Author: Marek Olšák <marek.olsak at amd.com>
Date: Thu Jul 30 08:19:48 2020 -0400
radeonsi: don't count unusable vertices to the NGG LDS size
Now we get optimal LDS usage.
Fixes: a23802bcb9a - ac,radeonsi: start adding support for gfx10.3
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer at amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6137>
(cherry picked from commit 68b3e92fef09330ac880e713a744dc7a57e78f05)
---
.pick_status.json | 2 +-
src/gallium/drivers/radeonsi/gfx10_shader_ngg.c | 14 +++++++++++---
2 files changed, 12 insertions(+), 4 deletions(-)
diff --git a/.pick_status.json b/.pick_status.json
index e8f58d05a6d..169e3aef2f5 100644
--- a/.pick_status.json
+++ b/.pick_status.json
@@ -40,7 +40,7 @@
         "description": "radeonsi: don't count unusable vertices to the NGG LDS size",
         "nominated": true,
         "nomination_type": 1,
-        "resolution": 0,
+        "resolution": 1,
         "master_sha": null,
         "because_sha": "a23802bcb9a42a02d34a5a36d6e66d6532813a0d"
     },
diff --git a/src/gallium/drivers/radeonsi/gfx10_shader_ngg.c b/src/gallium/drivers/radeonsi/gfx10_shader_ngg.c
index 0797f9cdb3a..efeb9e8838c 100644
--- a/src/gallium/drivers/radeonsi/gfx10_shader_ngg.c
+++ b/src/gallium/drivers/radeonsi/gfx10_shader_ngg.c
@@ -2027,9 +2027,15 @@ retry_select_mode:
 
          max_gsprims = align(max_gsprims, wavesize);
          max_gsprims = MIN2(max_gsprims, max_gsprims_base);
-         if (gsprim_lds_size)
+         if (gsprim_lds_size) {
+            /* Don't count unusable vertices to the LDS size. Those are vertices above
+             * the maximum number of vertices that can occur in the workgroup,
+             * which is e.g. max_gsprims * 3 for triangles.
+             */
+            unsigned usable_esverts = MIN2(max_esverts, max_gsprims * max_verts_per_prim);
             max_gsprims =
-               MIN2(max_gsprims, (max_lds_size - max_esverts * esvert_lds_size) / gsprim_lds_size);
+               MIN2(max_gsprims, (max_lds_size - usable_esverts * esvert_lds_size) / gsprim_lds_size);
+         }
          clamp_gsprims_to_esverts(&max_gsprims, max_esverts, min_verts_per_prim, use_adjacency);
          assert(max_esverts >= max_verts_per_prim && max_gsprims >= 1);
       } while (orig_max_esverts != max_esverts || orig_max_gsprims != max_gsprims);
@@ -2067,7 +2073,9 @@ retry_select_mode:
    shader->ngg.prim_amp_factor = prim_amp_factor;
    shader->ngg.max_vert_out_per_gs_instance = max_vert_out_per_gs_instance;
 
-   shader->gs_info.esgs_ring_size = max_esverts * esvert_lds_size;
+   /* Don't count unusable vertices. */
+   shader->gs_info.esgs_ring_size = MIN2(max_esverts, max_gsprims * max_verts_per_prim) *
+                                    esvert_lds_size;
    shader->ngg.ngg_emit_size = max_gsprims * gsprim_lds_size;
 
    assert(shader->ngg.hw_max_esverts >= min_esverts); /* HW limitation */
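
The arithmetic in the two hunks above can be tried in isolation. The following is a minimal standalone sketch, not code from Mesa: the helper names usable_esverts(), esgs_ring_size() and clamp_gsprims_to_lds() are hypothetical, MIN2 is redefined locally, and the variable names simply mirror those in the diff.

```c
#include <assert.h>

#define MIN2(a, b) ((a) < (b) ? (a) : (b))

/* Hypothetical helper (not a Mesa function): ES vertices that the workgroup's
 * primitives can actually reference, e.g. max_gsprims * 3 for triangles. */
unsigned usable_esverts(unsigned max_esverts, unsigned max_gsprims,
                        unsigned max_verts_per_prim)
{
   assert(max_gsprims >= 1 && max_esverts >= max_verts_per_prim);
   return MIN2(max_esverts, max_gsprims * max_verts_per_prim);
}

/* Hypothetical helper: ESGS ring size counting only the usable vertices. */
unsigned esgs_ring_size(unsigned max_esverts, unsigned max_gsprims,
                        unsigned max_verts_per_prim, unsigned esvert_lds_size)
{
   return usable_esverts(max_esverts, max_gsprims, max_verts_per_prim) *
          esvert_lds_size;
}

/* Hypothetical helper: clamp max_gsprims to the LDS space that remains after
 * the usable ES vertices have been accounted for. */
unsigned clamp_gsprims_to_lds(unsigned max_gsprims, unsigned max_esverts,
                              unsigned max_verts_per_prim, unsigned max_lds_size,
                              unsigned esvert_lds_size, unsigned gsprim_lds_size)
{
   if (!gsprim_lds_size)
      return max_gsprims; /* primitives use no LDS, nothing to clamp */

   unsigned es_lds = usable_esverts(max_esverts, max_gsprims, max_verts_per_prim) *
                     esvert_lds_size;
   assert(es_lds <= max_lds_size); /* avoid unsigned underflow in the sketch */
   return MIN2(max_gsprims, (max_lds_size - es_lds) / gsprim_lds_size);
}
```

With made-up inputs (max_esverts = 256, max_gsprims = 64, triangles, esvert_lds_size = 4), the old expression max_esverts * esvert_lds_size gives 1024, while esgs_ring_size() gives 768, because only 64 * 3 = 192 vertices can be referenced.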