[PATCH 03/23] drm/amdkfd: Workaround MEC mmhub flush issue

Alex Deucher alexander.deucher at amd.com
Thu Mar 30 19:42:14 UTC 2023


From: Philip Yang <Philip.Yang at amd.com>

MEC FW should flush the TLB and cache when unmapping user queues. This
is not working correctly in the master FW via HIQ. It affects SDMA queues,
which use mmhub on AID, and causes several KFDTest failures.

Work around this in KFD for now. This patch will be reverted later to
verify the FW fix.

Signed-off-by: Philip Yang <Philip.Yang at amd.com>
Tested-by: David Francis <David.Francis at amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling at amd.com>
Signed-off-by: Alex Deucher <alexander.deucher at amd.com>
---
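Note on why the hunk below resets pdd->tlb_seq before flushing: the
sketch here is a simplified userspace model, not driver code. It assumes
that kfd_flush_tlb() skips the flush when the cached sequence number
already matches the VM's current one, which this patch itself does not
show. The names model_pdd, model_flush_tlb and model_force_flush_tlb are
hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-process-device state; only the
 * sequence counter matters for this illustration. */
struct model_pdd {
	_Atomic uint64_t tlb_seq;  /* last VM sequence number flushed for */
	unsigned int flushes;      /* number of flushes actually issued */
};

/* Model of a sequence-gated flush: skip when the cached sequence number
 * already matches the VM's current one. Returns true if a flush ran. */
static bool model_flush_tlb(struct model_pdd *pdd, uint64_t vm_tlb_seq)
{
	if (atomic_exchange(&pdd->tlb_seq, vm_tlb_seq) == vm_tlb_seq)
		return false;	/* nothing changed since the last flush */

	pdd->flushes++;		/* a real driver would flush the TLB here */
	return true;
}

/* Model of the workaround: invalidate the cached sequence number first
 * so the gate above cannot skip the flush. Assumption of this sketch:
 * the VM sequence number passed in is never 0. */
static bool model_force_flush_tlb(struct model_pdd *pdd, uint64_t vm_tlb_seq)
{
	assert(vm_tlb_seq != 0);
	atomic_store(&pdd->tlb_seq, 0);
	return model_flush_tlb(pdd, vm_tlb_seq);
}
```

In this model, calling model_flush_tlb() twice with the same sequence
number flushes only once, while model_force_flush_tlb() always flushes.
That is the effect the atomic64_set(&pdd->tlb_seq, 0) call is meant to
have, under the assumption stated above.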
 drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
index ab91a0e211c8..1d53cbc55253 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
@@ -1038,6 +1038,15 @@ static int evict_process_queues_cpsch(struct device_queue_manager *dqm,
 					      KFD_UNMAP_QUEUES_FILTER_ALL_QUEUES :
 					      KFD_UNMAP_QUEUES_FILTER_DYNAMIC_QUEUES, 0);
 
+	/* Workaround for the MEC mmhub flush issue: issue an explicit
+	 * heavyweight TLB flush after all unmap_queues calls.
+	 *
+	 * This does not help if the firmware is unmapping queues itself when
+	 * the runlist is oversubscribed.
+	 */
+	atomic64_set(&pdd->tlb_seq, 0);
+	kfd_flush_tlb(pdd, TLB_FLUSH_HEAVYWEIGHT);
+
 out:
 	dqm_unlock(dqm);
 	return retval;
-- 
2.39.2


