<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=Windows-1252">
<style type="text/css" style="display:none;"> P {margin-top:0;margin-bottom:0;} </style>
</head>
<body dir="ltr">
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
Acked-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;<br>
</div>
<div id="appendonsend"></div>
<hr style="display:inline-block;width:98%" tabindex="-1">
<div id="divRplyFwdMsg" dir="ltr"><font face="Calibri, sans-serif" style="font-size:11pt" color="#000000"><b>From:</b> amd-gfx &lt;amd-gfx-bounces@lists.freedesktop.org&gt; on behalf of Marek Olšák &lt;maraeo@gmail.com&gt;<br>
<b>Sent:</b> Monday, July 8, 2019 1:31 PM<br>
<b>To:</b> amd-gfx mailing list<br>
<b>Subject:</b> Re: [PATCH] drm/amdgpu: don't invalidate caches in RELEASE_MEM, only do the writeback</font>
<div> </div>
</div>
<div>
<div dir="ltr">ping<br>
</div>
<br>
<div class="x_gmail_quote">
<div dir="ltr" class="x_gmail_attr">On Tue, Jul 2, 2019 at 2:29 PM Marek Olšák &lt;<a href="mailto:maraeo@gmail.com">maraeo@gmail.com</a>&gt; wrote:<br>
</div>
<blockquote class="x_gmail_quote" style="margin:0px 0px 0px 0.8ex; border-left:1px solid rgb(204,204,204); padding-left:1ex">
From: Marek Olšák &lt;<a href="mailto:marek.olsak@amd.com" target="_blank">marek.olsak@amd.com</a>&gt;<br>
<br>
This RELEASE_MEM use has the Release semantic, which means we should write<br>
back but not invalidate. Invalidations only make sense with the Acquire<br>
semantic (ACQUIRE_MEM), or when RELEASE_MEM is used to do the combined<br>
Acquire-Release semantic, which is a barrier, not a fence.<br>
<br>
The undesirable side effect of doing invalidations for the Release semantic<br>
is that it invalidates caches while shaders are running, because the Release<br>
can execute in the middle of the next IB.<br>
<br>
UMDs should use ACQUIRE_MEM at the beginning of IBs. Doing cache<br>
invalidations for a fence (like in this case) doesn't do anything<br>
for correctness.<br>
<br>
Signed-off-by: Marek Olšák &lt;<a href="mailto:marek.olsak@amd.com" target="_blank">marek.olsak@amd.com</a>&gt;<br>
---<br>
 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 6 +-----<br>
 1 file changed, 1 insertion(+), 5 deletions(-)<br>
<br>
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c<br>
index 210d24511dc6..a30f5d4913b9 100644<br>
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c<br>
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c<br>
@@ -4296,25 +4296,21 @@ static void gfx_v10_0_ring_emit_fence(struct amdgpu_ring *ring, u64 addr,<br>
        bool int_sel = flags & AMDGPU_FENCE_FLAG_INT;<br>
<br>
        /* Interrupt not work fine on GFX10.1 model yet. Use fallback instead */<br>
        if (adev->pdev->device == 0x50)<br>
                int_sel = false;<br>
<br>
        /* RELEASE_MEM - flush caches, send int */<br>
        amdgpu_ring_write(ring, PACKET3(PACKET3_RELEASE_MEM, 6));<br>
        amdgpu_ring_write(ring, (PACKET3_RELEASE_MEM_GCR_SEQ |<br>
                                 PACKET3_RELEASE_MEM_GCR_GL2_WB |<br>
-                                PACKET3_RELEASE_MEM_GCR_GL2_INV |<br>
-                                PACKET3_RELEASE_MEM_GCR_GL2_US |<br>
-                                PACKET3_RELEASE_MEM_GCR_GL1_INV |<br>
-                                PACKET3_RELEASE_MEM_GCR_GLV_INV |<br>
-                                PACKET3_RELEASE_MEM_GCR_GLM_INV |<br>
+                                PACKET3_RELEASE_MEM_GCR_GLM_INV | /* must be set with GLM_WB */<br>
                                 PACKET3_RELEASE_MEM_GCR_GLM_WB |<br>
                                 PACKET3_RELEASE_MEM_CACHE_POLICY(3) |<br>
                                 PACKET3_RELEASE_MEM_EVENT_TYPE(CACHE_FLUSH_AND_INV_TS_EVENT) |<br>
                                 PACKET3_RELEASE_MEM_EVENT_INDEX(5)));<br>
        amdgpu_ring_write(ring, (PACKET3_RELEASE_MEM_DATA_SEL(write64bit ? 2 : 1) |<br>
                                 PACKET3_RELEASE_MEM_INT_SEL(int_sel ? 2 : 0)));<br>
<br>
        /*<br>
         * the address should be Qword aligned if 64bit write, Dword<br>
         * aligned if only send 32bit data low (discard data high)<br>
-- <br>
2.17.1<br>
<br>
</blockquote>
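<div dir="ltr">For readers less familiar with the Release/Acquire split the commit message relies on, here is a minimal CPU-side sketch using C11 atomics. It is only an analogy and not amdgpu code: <code>job_slot</code>, <code>job_signal</code> and <code>job_try_read</code> are hypothetical names invented for illustration, and no GPU packet is emitted. The producer publishes its results with a release store (the counterpart of a writeback at the fence), while the consumer pairs that with an acquire load before reading (the counterpart of ACQUIRE_MEM at the start of an IB).</div>

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical shared state: a payload plus a fence value, loosely
 * mirroring "results of an IB" and the fence sequence number. */
struct job_slot {
    uint32_t payload;           /* plain data written by the producer */
    _Atomic uint32_t fence_seq; /* signalled with release semantics */
};

/* Producer side: finish the writes, then publish with a release store.
 * Release only orders the producer's earlier writes before the signal;
 * it does not ask anyone to discard what they currently have cached. */
static void job_signal(struct job_slot *slot, uint32_t result, uint32_t seq)
{
    slot->payload = result;
    atomic_store_explicit(&slot->fence_seq, seq, memory_order_release);
}

/* Consumer side: the acquire load is what guarantees that the payload
 * read below observes everything written before the matching release. */
static bool job_try_read(struct job_slot *slot, uint32_t seq, uint32_t *out)
{
    if (atomic_load_explicit(&slot->fence_seq, memory_order_acquire) < seq)
        return false; /* fence not signalled yet */
    *out = slot->payload;
    return true;
}
```

<div dir="ltr">In this sketch the visibility guarantee comes from pairing the two sides: the signalling side only has to make its writes visible, and the reading side is the one responsible for acquiring, which is the same division of labour the patch argues for between RELEASE_MEM and ACQUIRE_MEM.</div>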
</div>
</div>
</body>
</html>