[PATCH] drm/amd/amdgpu: set the default value of noretry to 1 for some dGPUs

Gui, Jack Jack.Gui at amd.com
Tue Oct 13 07:51:14 UTC 2020


[AMD Public Use]

Hi Guchun,

That's fine.
I will update the patch according to your suggestion.

BR,
Jack

-----Original Message-----
From: Chen, Guchun <Guchun.Chen at amd.com> 
Sent: Tuesday, October 13, 2020 3:45 PM
To: Gui, Jack <Jack.Gui at amd.com>; amd-gfx at lists.freedesktop.org; Deucher, Alexander <Alexander.Deucher at amd.com>
Cc: Gui, Jack <Jack.Gui at amd.com>; Zhou1, Tao <Tao.Zhou1 at amd.com>; Huang, Ray <Ray.Huang at amd.com>; Zhang, Hawking <Hawking.Zhang at amd.com>; Kuehling, Felix <Felix.Kuehling at amd.com>
Subject: RE: [PATCH] drm/amd/amdgpu: set the default value of noretry to 1 for some dGPUs

[AMD Public Use]

Hi Jack,

How about improving the patch a bit, as below? Since the code for the Raven case and the default case is identical, we could merge the two and extend the comment to cover both Raven and the other default ASICs, for readability.

switch (adev->asic_type) {
	case CHIP_VEGA20:
	case CHIP_NAVI10:
	case CHIP_NAVI14:
	case CHIP_ARCTURUS:
		/*
		 * noretry = 0 will cause kfd page fault tests to fail
		 * for some ASICs, so set default to 1 for these ASICs.
		 */
		if (amdgpu_noretry == -1)
			gmc->noretry = 1;
		else
			gmc->noretry = amdgpu_noretry;
		break;
	case CHIP_RAVEN:
	default:
		/*
		 * Raven currently has issues with noretry. Regardless
		 * of what we decide for other ASICs, we should leave
		 * Raven with noretry = 0 until we root cause the
		 * issues. The same applies to the other default ASICs.
		 */
		if (amdgpu_noretry == -1)
			gmc->noretry = 0;
		else
			gmc->noretry = amdgpu_noretry;
		break;
	}

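For illustration only, the selection logic above can also be sketched as a small standalone function that first picks a per-ASIC default and then applies the module parameter once. The enum, the CHIP_OTHER value and the name resolve_noretry are hypothetical stand-ins, not the real amdgpu definitions; the only behavior taken from the snippet is that -1 means "use the per-ASIC default" and any other value is used as given.

```c
/* Hypothetical stand-in for the driver's ASIC enum; not the real amdgpu type. */
enum asic_type {
	CHIP_RAVEN,
	CHIP_VEGA20,
	CHIP_NAVI10,
	CHIP_NAVI14,
	CHIP_ARCTURUS,
	CHIP_OTHER	/* assumed placeholder for every other ASIC */
};

/*
 * Return the effective noretry value.
 * param == -1 selects the per-ASIC default; any other value is used as is.
 */
int resolve_noretry(enum asic_type asic, int param)
{
	int def;

	switch (asic) {
	case CHIP_VEGA20:
	case CHIP_NAVI10:
	case CHIP_NAVI14:
	case CHIP_ARCTURUS:
		/* noretry = 0 makes the kfd page fault tests fail here */
		def = 1;
		break;
	case CHIP_RAVEN:
	default:
		/* Raven and all remaining ASICs stay at 0 for now */
		def = 0;
		break;
	}

	return (param == -1) ? def : param;
}
```

With this shape the "-1 means auto" check appears only once, so adding another ASIC to either group only touches the case labels.
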
Regards,
Guchun

-----Original Message-----
From: Chengming Gui <Jack.Gui at amd.com> 
Sent: Tuesday, October 13, 2020 12:35 PM
To: amd-gfx at lists.freedesktop.org; Deucher, Alexander <Alexander.Deucher at amd.com>
Cc: Gui, Jack <Jack.Gui at amd.com>; Zhou1, Tao <Tao.Zhou1 at amd.com>; Rui.Huang at amd.com; Chen, Guchun <Guchun.Chen at amd.com>; Zhang, Hawking <Hawking.Zhang at amd.com>; Kuehling, Felix <Felix.Kuehling at amd.com>
Subject: [PATCH] drm/amd/amdgpu: set the default value of noretry to 1 for some dGPUs

noretry = 0 causes the kfd page fault tests to fail on some dGPUs, so set the default value of noretry to 1 for these ASICs:
Vega20/Navi10/Navi14/Arcturus

Signed-off-by: Chengming Gui <Jack.Gui at amd.com>
Change-Id: I3be70f463a49b0cd5c56456431d6c2cb98b13872
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
index 36604d751d62..f317bdeffcb1 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
@@ -437,6 +437,19 @@ void amdgpu_gmc_noretry_set(struct amdgpu_device *adev)
 		else
 			gmc->noretry = amdgpu_noretry;
 		break;
+	case CHIP_VEGA20:
+	case CHIP_NAVI10:
+	case CHIP_NAVI14:
+	case CHIP_ARCTURUS:
+		/*
+		 * noretry = 0 will cause kfd page fault tests to fail
+		 * for some ASICs, so set default to 1 for these ASICs.
+		 */
+		if (amdgpu_noretry == -1)
+			gmc->noretry = 1;
+		else
+			gmc->noretry = amdgpu_noretry;
+		break;
 	default:
 		/* default this to 0 for now, but we may want
 		 * to change this in the future for certain
--
2.17.1

