[PATCH] drm/amdgpu: avoid NULL pointer dereference

Chen, Guchun Guchun.Chen at amd.com
Tue Dec 28 05:15:46 UTC 2021


[Public]

Hello Hawking,

Any comments on this patch?

Regards,
Guchun

-----Original Message-----
From: Chen, Guchun <Guchun.Chen at amd.com> 
Sent: Wednesday, December 22, 2021 10:20 PM
To: amd-gfx at lists.freedesktop.org; Zhang, Hawking <Hawking.Zhang at amd.com>; Zhou1, Tao <Tao.Zhou1 at amd.com>; Clements, John <John.Clements at amd.com>; Lazar, Lijo <Lijo.Lazar at amd.com>
Cc: Chen, Guchun <Guchun.Chen at amd.com>
Subject: [PATCH] drm/amdgpu: avoid NULL pointer dereference

amdgpu_umc_poison_handler, which handles UMC RAS poison consumption, is called from the KFD queue reset path, but it must return early when the RAS context is NULL. The early return guarantees that lower-level code which accesses the RAS context, such as amdgpu_umc_do_page_retirement, never sees a NULL context. Also improve the coding style in amdgpu_umc_poison_handler.

Signed-off-by: Guchun Chen <guchun.chen at amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_umc.c | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_umc.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_umc.c
index 46264a4002f7..b455fc7d1546 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_umc.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_umc.c
@@ -112,16 +112,20 @@ int amdgpu_umc_poison_handler(struct amdgpu_device *adev,
 		void *ras_error_status,
 		bool reset)
 {
-	int ret;
 	struct ras_err_data *err_data = (struct ras_err_data *)ras_error_status;
 	struct ras_common_if head = {
 		.block = AMDGPU_RAS_BLOCK__UMC,
 	};
-	struct ras_manager *obj = amdgpu_ras_find_obj(adev, &head);
+	struct amdgpu_ras *con = amdgpu_ras_get_context(adev);
+	struct ras_manager *obj;
+	int ret;
+
+	if (!con)
+		return 0;
 
-	ret =
-		amdgpu_umc_do_page_retirement(adev, ras_error_status, NULL, reset);
+	ret = amdgpu_umc_do_page_retirement(adev, ras_error_status, NULL, reset);
 
+	obj = amdgpu_ras_find_obj(adev, &head);
 	if (ret == AMDGPU_RAS_SUCCESS && obj) {
 		obj->err_data.ue_count += err_data->ue_count;
 		obj->err_data.ce_count += err_data->ce_count;
--
2.17.1
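
For readers following the thread, the guard this patch adds can be sketched in user-space C. This is only an illustration of the early-return pattern, not the driver code: every type and function name below (struct device, struct ras_context, do_page_retirement, poison_handler, RAS_SUCCESS) is a hypothetical stand-in for the amdgpu structures, and the "retirement" step is reduced to a counter update.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver types; NOT the real amdgpu structs. */
struct err_counts {
	unsigned long ue_count;	/* uncorrectable errors */
	unsigned long ce_count;	/* correctable errors */
};

struct ras_context {
	struct err_counts umc_totals;
	unsigned long retired_pages;
};

struct device {
	struct ras_context *ras;	/* NULL when RAS was never initialised */
};

#define RAS_SUCCESS 0

/*
 * Lower-level helper: dereferences the context unconditionally,
 * so callers must guarantee that dev->ras is not NULL.
 */
static int do_page_retirement(struct device *dev, const struct err_counts *err)
{
	struct ras_context *con = dev->ras;

	con->retired_pages += err->ue_count;
	return RAS_SUCCESS;
}

/* Entry point: bail out before any lower-level access when there is no context. */
static int poison_handler(struct device *dev, const struct err_counts *err)
{
	struct ras_context *con = dev->ras;
	int ret;

	if (!con)
		return 0;

	ret = do_page_retirement(dev, err);
	if (ret == RAS_SUCCESS) {
		con->umc_totals.ue_count += err->ue_count;
		con->umc_totals.ce_count += err->ce_count;
	}

	return ret;
}

/* Small self-check: the NULL-context call must return 0 without crashing. */
static void run_examples(void)
{
	struct err_counts err = { .ue_count = 1, .ce_count = 0 };
	struct device no_ras = { .ras = NULL };

	assert(poison_handler(&no_ras, &err) == 0);
}
```

Checking the context once at the entry point keeps the helper free of repeated NULL checks, which matches the intent stated in the commit message above.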

