[PATCH 36/44] drm/amdkfd: Fix spurious restore failures

Felix Kuehling Felix.Kuehling at amd.com
Mon Mar 22 10:58:52 UTC 2021


Restore can appear to fail if the svms->evicted_ranges counter changes
before the function can acquire the necessary locks. Re-read the counter
after acquiring the locks to minimize the chances of having to
reschedule the worker.

Change-Id: I236b912bddf106583be264abde2f6bd1a5d5a083
Signed-off-by: Felix Kuehling <Felix.Kuehling at amd.com>
---
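Note for reviewers: the race described above can be reproduced with a
minimal user-space sketch. It is not the kernel code. The names
range_list, evict_one, restore_once and demo are hypothetical, and the
sketch assumes (this hunk does not show it) that the worker finishes
with a compare-and-swap that clears the counter only if it still matches
the value read earlier.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for the SVM range list. */
struct range_list {
	pthread_mutex_t lock;
	atomic_int evicted;	/* plays the role of svms->evicted_ranges */
};

/* Hypothetical hook: an eviction that lands before the lock is taken. */
void evict_one(struct range_list *rl)
{
	atomic_fetch_add(&rl->evicted, 1);
}

/*
 * One pass of the restore worker. Returns true when the counter was
 * cleared, false when the pass would have to be rescheduled.
 */
bool restore_once(struct range_list *rl, bool reread_after_lock,
		  void (*before_lock)(struct range_list *))
{
	int snapshot = atomic_load(&rl->evicted);
	bool done;

	if (!snapshot)
		return true;	/* nothing to restore */

	if (before_lock)
		before_lock(rl);	/* window in which the counter may change */

	pthread_mutex_lock(&rl->lock);

	if (reread_after_lock)
		snapshot = atomic_load(&rl->evicted);	/* the two-line fix */

	/* ... revalidate every invalid range here ... */

	/* Assumed final step: clear the counter only if it still matches. */
	done = atomic_compare_exchange_strong(&rl->evicted, &snapshot, 0);

	pthread_mutex_unlock(&rl->lock);
	return done;
}

void demo(void)
{
	struct range_list stale = { PTHREAD_MUTEX_INITIALIZER, 1 };
	struct range_list fresh = { PTHREAD_MUTEX_INITIALIZER, 1 };

	/* Without the re-read, the stale snapshot forces a reschedule. */
	assert(!restore_once(&stale, false, evict_one));
	/* With the re-read, the same interleaving completes in one pass. */
	assert(restore_once(&fresh, true, evict_one));
}
```

In the sketch, evictions that arrive while the lock is held would still
fail the final compare-and-swap, which is why the re-read reduces
rescheduling rather than eliminating it.
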
 drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
index 0fbc037b06e3..49aca4664411 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
@@ -1429,6 +1429,8 @@ static void svm_range_restore_work(struct work_struct *work)
 	svm_range_list_lock_and_flush_work(svms, mm);
 	mutex_lock(&svms->lock);
 
+	evicted_ranges = atomic_read(&svms->evicted_ranges);
+
 	list_for_each_entry(prange, &svms->list, list) {
 		invalid = atomic_read(&prange->invalid);
 		if (!invalid)
-- 
2.31.0


