[PATCH] drm/amdgpu: check if driver should try recovery in ras recovery path

Zhang, Hawking Hawking.Zhang at amd.com
Thu Jan 16 04:47:51 UTC 2020


[AMD Public Use]

Good point. Please check my latest series.

Regards,
Hawking

-----Original Message-----
From: Chen, Guchun <Guchun.Chen at amd.com> 
Sent: Wednesday, January 15, 2020 15:51
To: Zhang, Hawking <Hawking.Zhang at amd.com>; amd-gfx at lists.freedesktop.org; Clements, John <John.Clements at amd.com>
Cc: Zhang, Hawking <Hawking.Zhang at amd.com>
Subject: RE: [PATCH] drm/amdgpu: check if driver should try recovery in ras recovery path

[AMD Public Use]

Seems it's better to sperate it into two patches, considering the patch purpose. One patch is to add user recovery capability by module parameter for Arcturus chip, and another is to check if driver should try recovery in RAS function, applying to all supported asics who enable RAS.

-----Original Message-----
From: Hawking Zhang <Hawking.Zhang at amd.com> 
Sent: Wednesday, January 15, 2020 12:21 PM
To: amd-gfx at lists.freedesktop.org; Chen, Guchun <Guchun.Chen at amd.com>; Clements, John <John.Clements at amd.com>
Cc: Zhang, Hawking <Hawking.Zhang at amd.com>
Subject: [PATCH] drm/amdgpu: check if driver should try recovery in ras recovery path

To allow the flexibilty for user to disable gpu recovery in RAS recovery path by module parameter amdgpu_gpu_recovery

Signed-off-by: Hawking Zhang <Hawking.Zhang at amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c    | 3 ++-
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 706a30e81fcc..8e2f0a380461 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -3799,6 +3799,7 @@ bool amdgpu_device_should_recover_gpu(struct amdgpu_device *adev)
 		case CHIP_VEGA10:
 		case CHIP_VEGA12:
 		case CHIP_RAVEN:
+		case CHIP_ARCTURUS:
 			break;
 		default:
 			goto disabled;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
index ac9926b3f9fe..492b3ba685cc 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
@@ -1356,7 +1356,8 @@ static void amdgpu_ras_do_recovery(struct work_struct *work)
 	struct amdgpu_ras *ras =
 		container_of(work, struct amdgpu_ras, recovery_work);
 
-	amdgpu_device_gpu_recover(ras->adev, 0);
+	if (amdgpu_device_should_recover_gpu(ras->adev))
+		amdgpu_device_gpu_recover(ras->adev, 0);
 	atomic_set(&ras->in_recovery, 0);
 }
 
--
2.17.1


More information about the amd-gfx mailing list