Mesa (main): aco: Use Navi 10 empty NGG output workaround on NGG culling shaders.

GitLab Mirror gitlab-mirror at kemper.freedesktop.org
Wed Aug 4 12:47:30 UTC 2021


Module: Mesa
Branch: main
Commit: 448592b9aeb471772bd696fd44e4f952b8f492b6
URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=448592b9aeb471772bd696fd44e4f952b8f492b6

Author: Timur Kristóf <timur.kristof at gmail.com>
Date:   Mon Aug  2 16:48:41 2021 +0200

aco: Use Navi 10 empty NGG output workaround on NGG culling shaders.

Navi 10 can hang when an NGG workgroup has no output,
so we work around that by always exporting a single zero-area
triangle with a single vertex that has all-NaN coordinates.

Thus far, we only employed this for NGG GS, because on all
other stages, the output can't be empty.

However, with NGG culling, the output can be empty, so let's
apply the same workaround there too.
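
As a rough illustration of the count adjustment described above, here is a
minimal sketch in plain C++ rather than ACO builder code. The struct, the
function name and the needs_workaround parameter are hypothetical and not
part of Mesa; needs_workaround stands in for the patched condition (GFX10
and either a GS stage or NGG culling).

```cpp
#include <cstdint>

// Hypothetical model of the empty-output workaround; not actual ACO code.
struct NggAllocCounts {
   uint32_t vtx_cnt;
   uint32_t prm_cnt;
   bool was_empty; // true when the workgroup originally had no primitives
};

// needs_workaround is an assumed stand-in for:
//   chip_class == GFX10 && (stage has GS || has_ngg_culling)
inline NggAllocCounts
apply_empty_output_workaround(bool needs_workaround, uint32_t vtx_cnt, uint32_t prm_cnt)
{
   // Mirrors the s_cmp_eq_u32 against zero in the patch.
   const bool was_empty = needs_workaround && prm_cnt == 0;
   if (was_empty) {
      // Mirrors the selects: request exactly one vertex and one primitive,
      // so that the dummy NaN triangle can be exported afterwards.
      vtx_cnt = 1;
      prm_cnt = 1;
   }
   return {vtx_cnt, prm_cnt, was_empty};
}
```

In this sketch, was_empty plays the role of prm_cnt_0 in the diff: when it is
set, the shader goes on to export the dummy triangle.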

Cc: mesa-stable
Signed-off-by: Timur Kristóf <timur.kristof at gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02 at gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12169>

---

 src/amd/compiler/aco_instruction_selection.cpp | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/src/amd/compiler/aco_instruction_selection.cpp b/src/amd/compiler/aco_instruction_selection.cpp
index 0f1090e13e2..4b3a913b860 100644
--- a/src/amd/compiler/aco_instruction_selection.cpp
+++ b/src/amd/compiler/aco_instruction_selection.cpp
@@ -11558,8 +11558,11 @@ ngg_emit_sendmsg_gs_alloc_req(isel_context* ctx, Temp vtx_cnt, Temp prm_cnt)
    Builder bld(ctx->program, ctx->block);
    Temp prm_cnt_0;
 
-   if (ctx->program->chip_class == GFX10 && ctx->stage.has(SWStage::GS)) {
-      /* Navi 1x workaround: make sure to always export at least 1 vertex and triangle */
+   if (ctx->program->chip_class == GFX10 &&
+       (ctx->stage.has(SWStage::GS) || ctx->program->info->has_ngg_culling)) {
+      /* Navi 1x workaround: check whether the workgroup has no output.
+       * If so, change the number of exported vertices and primitives to 1.
+       */
       prm_cnt_0 = bld.sopc(aco_opcode::s_cmp_eq_u32, bld.def(s1, scc), prm_cnt, Operand::zero());
       prm_cnt = bld.sop2(aco_opcode::s_cselect_b32, bld.def(s1), Operand::c32(1u), prm_cnt,
                          bld.scc(prm_cnt_0));
@@ -11573,11 +11576,12 @@ ngg_emit_sendmsg_gs_alloc_req(isel_context* ctx, Temp vtx_cnt, Temp prm_cnt)
    tmp = bld.sop2(aco_opcode::s_or_b32, bld.m0(bld.def(s1)), bld.def(s1, scc), tmp, vtx_cnt);
 
    /* Request the SPI to allocate space for the primitives and vertices
-    * that will be exported by the threadgroup. */
+    * that will be exported by the threadgroup.
+    */
    bld.sopp(aco_opcode::s_sendmsg, bld.m0(tmp), -1, sendmsg_gs_alloc_req);
 
    if (prm_cnt_0.id()) {
-      /* Navi 1x workaround: export a triangle with NaN coordinates when GS has no output.
+      /* Navi 1x workaround: export a triangle with NaN coordinates when NGG has no output.
        * It can't have all-zero positions because that would render an undesired pixel with
        * conservative rasterization.
        */


