[PATCH 36/44] drm/amdkfd: Fix spurious restore failures
Felix Kuehling
Felix.Kuehling at amd.com
Mon Mar 22 10:58:52 UTC 2021
Restore can appear to fail if the svms->evicted_ranges counter changes
between the worker's initial read and the point where it holds the
necessary locks. Re-read the counter after acquiring the locks to
minimize the chances of having to reschedule the worker.
Change-Id: I236b912bddf106583be264abde2f6bd1a5d5a083
Signed-off-by: Felix Kuehling <Felix.Kuehling at amd.com>
---
drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
index 0fbc037b06e3..49aca4664411 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
@@ -1429,6 +1429,8 @@ static void svm_range_restore_work(struct work_struct *work)
 	svm_range_list_lock_and_flush_work(svms, mm);
 	mutex_lock(&svms->lock);
 
+	evicted_ranges = atomic_read(&svms->evicted_ranges);
+
 	list_for_each_entry(prange, &svms->list, list) {
 		invalid = atomic_read(&prange->invalid);
 		if (!invalid)
--
2.31.0
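
For illustration only, the re-read pattern described in the commit message
can be sketched in user space roughly as follows. This is not the kfd_svm.c
code: struct range_list, evict() and restore_work() are hypothetical names,
a pthread mutex stands in for svms->lock, and the sketch assumes the worker
decides whether to reschedule by comparing the counter against its snapshot
once the restore is done.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-process SVM range list state. */
struct range_list {
	pthread_mutex_t lock;
	atomic_int evicted;	/* incremented by every eviction */
};

static void range_list_init(struct range_list *rl, int evicted)
{
	pthread_mutex_init(&rl->lock, NULL);
	atomic_init(&rl->evicted, evicted);
}

/* Simulates an eviction that lands before the worker holds the lock. */
static void evict(struct range_list *rl)
{
	atomic_fetch_add(&rl->evicted, 1);
}

/*
 * Returns true if the restore completed, false if the worker would have
 * to be rescheduled. 'race' (may be NULL) is called in the window between
 * the first read of the counter and taking the lock.
 */
static bool restore_work(struct range_list *rl, bool reread_under_lock,
			 void (*race)(struct range_list *))
{
	int snapshot = atomic_load(&rl->evicted);
	bool done;

	if (!snapshot)
		return true;	/* nothing was evicted */

	if (race)
		race(rl);

	pthread_mutex_lock(&rl->lock);

	/* The fix: refresh the snapshot once the lock is held. */
	if (reread_under_lock)
		snapshot = atomic_load(&rl->evicted);

	/* ... revalidate the invalid ranges here ... */

	/* Clear the counter only if it still matches the snapshot. */
	done = atomic_compare_exchange_strong(&rl->evicted, &snapshot, 0);

	pthread_mutex_unlock(&rl->lock);
	return done;
}
```

With an eviction injected in the window, restore_work(&rl, false, evict)
reports a failure even though nothing changed while the lock was held,
whereas restore_work(&rl, true, evict) completes. An eviction that arrives
after the re-read would still force a reschedule, which is why the commit
message says "minimize" rather than "eliminate".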