[PATCH 2/2] drm/amdkfd: only allow heavy-weight TLB flush on some ASICs for SVM too
Lang Yu
Lang.Yu at amd.com
Fri Apr 15 08:04:12 UTC 2022
The idea is from commit a50fe7078035 ("drm/amdkfd: Only apply heavy-weight
TLB flush on Aldebaran") and commit f61c40c0757a ("drm/amdkfd: enable
heavy-weight TLB flush on Arcturus").
At the moment, heavy-weight TLB could cause problems on ASICs except
Aldebaran and Arcturus.
A simple hipMallocManaged/hipFree program could trigger this issue.
[ 97.787657] amdgpu 0000:01:00.0: amdgpu: wait for kiq fence error: 0.
[ 106.868758] amdgpu: qcm fence wait loop timeout expired
[ 106.868966] amdgpu: The cp might be in an unrecoverable state due to an unsuccessful queues preemption
[ 106.869203] amdgpu: Failed to evict process queues
[ 106.869261] amdgpu: Failed to quiesce KFD
Signed-off-by: Lang Yu <Lang.Yu at amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling at amd.com>
---
drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
index 459fa07a3bcc..5afe216cf099 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
@@ -1229,7 +1229,9 @@ svm_range_unmap_from_gpus(struct svm_range *prange, unsigned long start,
if (r)
break;
}
- kfd_flush_tlb(pdd, TLB_FLUSH_HEAVYWEIGHT);
+
+ if (kfd_flush_tlb_after_unmap(pdd->dev))
+ kfd_flush_tlb(pdd, TLB_FLUSH_HEAVYWEIGHT);
}
return r;
--
2.25.1
More information about the amd-gfx
mailing list