[PATCH] drm/amdgpu: don't invalidate caches in RELEASE_MEM, only do the writeback

Deucher, Alexander Alexander.Deucher at amd.com
Mon Jul 8 17:51:53 UTC 2019

Acked-by: Alex Deucher <alexander.deucher at amd.com>
From: amd-gfx <amd-gfx-bounces at lists.freedesktop.org> on behalf of Marek Olšák <maraeo at gmail.com>
Sent: Monday, July 8, 2019 1:31 PM
To: amd-gfx mailing list
Subject: Re: [PATCH] drm/amdgpu: don't invalidate caches in RELEASE_MEM, only do the writeback


On Tue, Jul 2, 2019 at 2:29 PM Marek Olšák <maraeo at gmail.com> wrote:
From: Marek Olšák <marek.olsak at amd.com>

This RELEASE_MEM use has the Release semantic, which means we should write
back but not invalidate. Invalidations only make sense with the Acquire
semantic (ACQUIRE_MEM), or when RELEASE_MEM is used to do the combined
Acquire-Release semantic, which is a barrier, not a fence.

The undesirable side effect of doing invalidations for the Release semantic
is that it invalidates caches while shaders are running, because the Release
can execute in the middle of the next IB.

UMDs should use ACQUIRE_MEM at the beginning of IBs. Doing cache
invalidations for a fence (like in this case) doesn't do anything
for correctness.
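
As a rough illustration of the split described above, the sketch below
builds one cache-control mask for a Release (writeback only) and one for
an Acquire (invalidate only). The GCR_* names and bit positions are
hypothetical stand-ins chosen for this example; they are not the real
PACKET3_RELEASE_MEM_GCR_* encodings, and the helper functions do not
exist in the driver.

```c
#include <stdint.h>

/*
 * Hypothetical bit positions, for illustration only. The real
 * PACKET3_RELEASE_MEM_GCR_* encodings are defined in the driver headers.
 */
enum {
	GCR_GLM_WB  = 1u << 0,	/* write back the metadata cache */
	GCR_GLM_INV = 1u << 1,	/* invalidate the metadata cache */
	GCR_GLV_INV = 1u << 2,	/* invalidate the vector L0 cache */
	GCR_GL1_INV = 1u << 3,	/* invalidate GL1 */
	GCR_GL2_INV = 1u << 4,	/* invalidate GL2 */
	GCR_GL2_WB  = 1u << 5,	/* write back GL2 */
};

#define GCR_ALL_WB	(GCR_GLM_WB | GCR_GL2_WB)
#define GCR_ALL_INV	(GCR_GLM_INV | GCR_GLV_INV | GCR_GL1_INV | GCR_GL2_INV)

/*
 * Release (fence at the end of an IB): make prior writes visible.
 * GLM_INV is kept only because the patch notes it must be set
 * together with GLM_WB.
 */
static uint32_t gcr_release_mask(void)
{
	return GCR_GL2_WB | GCR_GLM_WB | GCR_GLM_INV;
}

/*
 * Acquire (start of an IB): drop stale lines before reading.
 * Assumed here to be "all invalidations, no writebacks".
 */
static uint32_t gcr_acquire_mask(void)
{
	return GCR_ALL_INV;
}
```

With these masks, a fence never invalidates GL2, GL1, or the vector
cache, so it cannot disturb shaders from the next IB that are already
running; the invalidations happen only where a UMD explicitly asks for
them at the start of its own IB.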

Signed-off-by: Marek Olšák <marek.olsak at amd.com>
 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
index 210d24511dc6..a30f5d4913b9 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
@@ -4296,25 +4296,21 @@ static void gfx_v10_0_ring_emit_fence(struct amdgpu_ring *ring, u64 addr,
        bool int_sel = flags & AMDGPU_FENCE_FLAG_INT;

        /* Interrupt not work fine on GFX10.1 model yet. Use fallback instead */
        if (adev->pdev->device == 0x50)
                int_sel = false;

        /* RELEASE_MEM - flush caches, send int */
        amdgpu_ring_write(ring, PACKET3(PACKET3_RELEASE_MEM, 6));
        amdgpu_ring_write(ring, (PACKET3_RELEASE_MEM_GCR_SEQ |
                                 PACKET3_RELEASE_MEM_GCR_GL2_WB |
-                                PACKET3_RELEASE_MEM_GCR_GL2_INV |
-                                PACKET3_RELEASE_MEM_GCR_GL2_US |
-                                PACKET3_RELEASE_MEM_GCR_GL1_INV |
-                                PACKET3_RELEASE_MEM_GCR_GLV_INV |
-                                PACKET3_RELEASE_MEM_GCR_GLM_INV |
+                                PACKET3_RELEASE_MEM_GCR_GLM_INV | /* must be set with GLM_WB */
                                 PACKET3_RELEASE_MEM_GCR_GLM_WB |
                                 PACKET3_RELEASE_MEM_CACHE_POLICY(3) |
        amdgpu_ring_write(ring, (PACKET3_RELEASE_MEM_DATA_SEL(write64bit ? 2 : 1) |
                                 PACKET3_RELEASE_MEM_INT_SEL(int_sel ? 2 : 0)));

         * the address should be Qword aligned if 64bit write, Dword
         * aligned if only send 32bit data low (discard data high)
