<html><head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body>
<p><br>
</p>
<div class="moz-cite-prefix">On 2024-10-18 14:28, Felix Kuehling
wrote:<br>
</div>
<blockquote type="cite" cite="mid:1a80e1fe-ef51-4ed4-a851-21d3c9da6c7f@amd.com">
<br>
On 2024-10-17 04:34, Victor Zhao wrote:
<br>
<blockquote type="cite">make sure KFD_FENCE_INIT write to
fence_addr before pm_send_query_status
<br>
called, to avoid qcm fence timeout caused by incorrect ordering.
<br>
<br>
Signed-off-by: Victor Zhao <a class="moz-txt-link-rfc2396E" href="mailto:Victor.Zhao@amd.com"><Victor.Zhao@amd.com></a>
<br>
---
<br>
drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c | 1 +
<br>
drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h | 2 +-
<br>
2 files changed, 2 insertions(+), 1 deletion(-)
<br>
<br>
diff --git
a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
<br>
index b2b16a812e73..d9264a353775 100644
<br>
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
<br>
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
<br>
@@ -2254,6 +2254,7 @@ static int unmap_queues_cpsch(struct
device_queue_manager *dqm,
<br>
goto out;
<br>
*dqm->fence_addr = KFD_FENCE_INIT;
<br>
+ mb();
<br>
pm_send_query_status(&dqm->packet_mgr,
dqm->fence_gpu_addr,
<br>
KFD_FENCE_COMPLETED);
<br>
/* should be timed out */
<br>
diff --git
a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h
b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h
<br>
index 09ab36f8e8c6..bddb169bb301 100644
<br>
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h
<br>
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h
<br>
@@ -260,7 +260,7 @@ struct device_queue_manager {
<br>
uint16_t vmid_pasid[VMID_NUM];
<br>
uint64_t pipelines_addr;
<br>
uint64_t fence_gpu_addr;
<br>
- uint64_t *fence_addr;
<br>
+ volatile uint64_t *fence_addr;
<br>
</blockquote>
<br>
[+Christian]
<br>
<br>
Is the volatile keyword really needed here? I just saw other
patches removing volatile in some places because it's not
sufficient, and not needed if you use memory barriers correctly.
<br>
</blockquote>
<p>After reading kernel memory barrier document and below link, I
think we need both volatile type and memory barrier, to guarantee
F/W get the updated fence value. This fixes an CP hang issue on
SRIOV.<br>
</p>
<p><a class="moz-txt-link-freetext" href="https://stackoverflow.com/questions/75750110/volatile-vs-memory-barriers#:~:text=volatile%20will%20make%20sure%20that,not%20reorder%20writes%20or%20reads">https://stackoverflow.com/questions/75750110/volatile-vs-memory-barriers#:~:text=volatile%20will%20make%20sure%20that,not%20reorder%20writes%20or%20reads</a>.</p>
<p>Regards,</p>
<p>Philip<br>
</p>
<blockquote type="cite" cite="mid:1a80e1fe-ef51-4ed4-a851-21d3c9da6c7f@amd.com">
<br>
Regards,
<br>
Felix
<br>
<br>
<br>
<blockquote type="cite"> struct kfd_mem_obj *fence_mem;
<br>
bool active_runlist;
<br>
int sched_policy;
<br>
</blockquote>
</blockquote>
</body>
</html>