[PATCH] drm/amdgpu: set default noretry=1 to fix kfd SVM issues for raven

Changfeng Changfeng.Zhu at amd.com
Wed Jul 28 06:36:13 UTC 2021


From: changzhu <Changfeng.Zhu at amd.com>

From: Changfeng <Changfeng.Zhu at amd.com>

No issues were found with noretry=1 except for two SVM migrate failures.
By contrast, noretry=0 causes most SVM test cases to fail, and the two
SVM migrate failures also occur with noretry=0. So set the default to
noretry=1 for raven for now, which fixes most of the SVM failures.

Change-Id: Idb5cb3c1a04104013e4ab8aed2ad4751aaec4bbc
Signed-off-by: Changfeng <Changfeng.Zhu at amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 15 ++++++++-------
 1 file changed, 8 insertions(+), 7 deletions(-)
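
For readers without the full function at hand, the selection logic this
patch touches can be sketched roughly as below. This is only an
illustration, not the driver code: the enum and the function name are
hypothetical, and it assumes that an explicit module parameter value is
honored for every ASIC while -1 means "pick the per-ASIC default".

```c
/* Hypothetical stand-ins for the driver's ASIC types (not the kernel enum). */
enum sketch_asic {
	SKETCH_ASIC_RAVEN,
	SKETCH_ASIC_OTHER,
};

/*
 * param == -1 means "auto": choose a per-ASIC default.
 * Any other value is treated as an explicit user setting (assumption).
 */
static int sketch_resolve_noretry(enum sketch_asic asic, int param)
{
	if (param != -1)
		return param;

	switch (asic) {
	case SKETCH_ASIC_RAVEN:
		/* With this patch, raven defaults to noretry = 1. */
		return 1;
	default:
		/* Everything else in this sketch keeps the default of 0. */
		return 0;
	}
}
```

With this sketch, sketch_resolve_noretry(SKETCH_ASIC_RAVEN, -1) yields 1,
while an explicit setting of 0 is still respected on raven.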

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
index 09edfb64cce0..d7f69dbd48e6 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
@@ -606,19 +606,20 @@ void amdgpu_gmc_noretry_set(struct amdgpu_device *adev)
 		 * noretry = 0 will cause kfd page fault tests fail
 		 * for some ASICs, so set default to 1 for these ASICs.
 		 */
+	case CHIP_RAVEN:
+		/*
+		 * TODO: noretry = 1 fixes most SVM issues on Raven.
+		 * However, two kfd migrate tests still fail with
+		 * noretry = 1. These two migrate failures on Raven
+		 * still need to be root caused.
+		 */
 		if (amdgpu_noretry == -1)
 			gmc->noretry = 1;
 		else
 			gmc->noretry = amdgpu_noretry;
 		break;
-	case CHIP_RAVEN:
 	default:
-		/* Raven currently has issues with noretry
-		 * regardless of what we decide for other
-		 * asics, we should leave raven with
-		 * noretry = 0 until we root cause the
-		 * issues.
-		 *
+		/*
 		 * default this to 0 for now, but we may want
 		 * to change this in the future for certain
 		 * GPUs as it can increase performance in
-- 
2.17.1


