[PATCH] drm/amdgpu: return error when eeprom checksum failed

Zhang, Hawking Hawking.Zhang at amd.com
Mon Dec 2 06:14:12 UTC 2024


[AMD Official Use Only - AMD Internal Distribution Only]

Ah, hold on please. I assume that even when BADG is written to the header, there are still valid records available in the EEPROM, right?

Regards,
Hawking

-----Original Message-----
From: amd-gfx <amd-gfx-bounces at lists.freedesktop.org> On Behalf Of Zhang, Hawking
Sent: Monday, December 2, 2024 1:54 PM
To: Su, Joe <Jinzhou.Su at amd.com>; amd-gfx at lists.freedesktop.org
Cc: Yang, Stanley <Stanley.Yang at amd.com>
Subject: RE: [PATCH] drm/amdgpu: return error when eeprom checksum failed

Reviewed-by: Hawking Zhang <Hawking.Zhang at amd.com>

Regards,
Hawking
-----Original Message-----
From: Su, Joe <Jinzhou.Su at amd.com>
Sent: Monday, December 2, 2024 13:30
To: amd-gfx at lists.freedesktop.org
Cc: Zhang, Hawking <Hawking.Zhang at amd.com>; Yang, Stanley <Stanley.Yang at amd.com>; Su, Joe <Jinzhou.Su at amd.com>
Subject: [PATCH] drm/amdgpu: return error when eeprom checksum failed

Return an error when the EEPROM table checksum verification fails; otherwise the error result might be overwritten by the next call.

V2: replace DRM_ERROR with dev_err

Signed-off-by: Jinzhou Su <jinzhou.su at amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
index f4a9e15389ae..bd8acb55f76f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
@@ -1412,9 +1412,11 @@ int amdgpu_ras_eeprom_init(struct amdgpu_ras_eeprom_control *control)
                }

                res = __verify_ras_table_checksum(control);
-               if (res)
-                       DRM_ERROR("RAS Table incorrect checksum or error:%d\n",
+               if (res) {
+                       dev_err(adev->dev, "RAS Table incorrect checksum or error:%d\n",
                                  res);
+                       return -EINVAL;
+               }
                if (ras->bad_page_cnt_threshold > control->ras_num_recs) {
                        /* This means that, the threshold was increased since
                         * the last time the system was booted, and now,
--
2.43.0
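
The failure mode named in the commit message (an error result that is logged and then overwritten by a later call) can be illustrated with a minimal standalone sketch. This is not the amdgpu code: verify_checksum(), later_step(), init_log_only() and init_return_early() are hypothetical stand-ins, and fprintf() takes the place of dev_err() so the sketch builds in user space.

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical stand-in for the checksum helper; not the driver code. */
static int verify_checksum(int corrupt)
{
	return corrupt ? -EINVAL : 0;
}

/* Hypothetical later call whose result is stored in the same variable. */
static int later_step(void)
{
	return 0;
}

/* Before: the failure is logged, then lost when res is reassigned. */
static int init_log_only(int corrupt)
{
	int res = verify_checksum(corrupt);

	if (res)
		fprintf(stderr, "incorrect checksum or error:%d\n", res);

	res = later_step();
	return res;
}

/* After: the failure is returned before anything can overwrite it. */
static int init_return_early(int corrupt)
{
	int res = verify_checksum(corrupt);

	if (res) {
		fprintf(stderr, "incorrect checksum or error:%d\n", res);
		return -EINVAL;
	}

	res = later_step();
	return res;
}
```

With a corrupt table, init_log_only(1) reports success to its caller while init_return_early(1) reports -EINVAL, which is the behavioural difference the patch is after.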


