[PATCH] drm/amdgpu: refine usage of amdgpu_bad_page_threshold
Xie, Patrick
Gangliang.Xie at amd.com
Fri Jun 13 03:59:48 UTC 2025
[AMD Official Use Only - AMD Internal Distribution Only]
Sorry, that was a mistake; I will remove it.
From: Zhang, Hawking <Hawking.Zhang at amd.com>
Sent: Friday, June 13, 2025 11:52 AM
To: Xie, Patrick <Gangliang.Xie at amd.com>; amd-gfx at lists.freedesktop.org
Cc: Zhou1, Tao <Tao.Zhou1 at amd.com>; Xie, Patrick <Gangliang.Xie at amd.com>
Subject: Re: [PATCH] drm/amdgpu: refine usage of amdgpu_bad_page_threshold
if ((amdgpu_bad_page_threshold == -1) ||
- (amdgpu_bad_page_threshold == -2)) {
+ (amdgpu_bad_page_threshold == -2)) {
hmm.... Is it fixing code alignment?
Regards,
Hawking
From: Xie, Patrick <Gangliang.Xie at amd.com>
Date: Friday, June 13, 2025 at 11:07
To: amd-gfx at lists.freedesktop.org
Cc: Zhang, Hawking <Hawking.Zhang at amd.com>, Zhou1, Tao <Tao.Zhou1 at amd.com>, Xie, Patrick <Gangliang.Xie at amd.com>
Subject: [PATCH] drm/amdgpu: refine usage of amdgpu_bad_page_threshold
When amdgpu_bad_page_threshold is -1 or -2, the driver issues a warning
message when the threshold is reached and continues runtime services.
Signed-off-by: ganglxie <ganglxie at amd.com>
---
.../gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c | 21 +++++++++----------
1 file changed, 10 insertions(+), 11 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
index 2ddedf476542..a9246c53bde9 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
@@ -763,18 +763,17 @@ amdgpu_ras_eeprom_update_header(struct amdgpu_ras_eeprom_control *control)
dev_warn(adev->dev,
"Saved bad pages %d reaches threshold value %d\n",
control->ras_num_bad_pages, ras->bad_page_cnt_threshold);
- control->tbl_hdr.header = RAS_TABLE_HDR_BAD;
- if (control->tbl_hdr.version >= RAS_TABLE_VER_V2_1) {
- control->tbl_rai.rma_status = GPU_RETIRED__ECC_REACH_THRESHOLD;
- control->tbl_rai.health_percent = 0;
- }
-
if ((amdgpu_bad_page_threshold != -1) &&
- (amdgpu_bad_page_threshold != -2))
+ (amdgpu_bad_page_threshold != -2)) {
+ control->tbl_hdr.header = RAS_TABLE_HDR_BAD;
+ if (control->tbl_hdr.version >= RAS_TABLE_VER_V2_1) {
+ control->tbl_rai.rma_status = GPU_RETIRED__ECC_REACH_THRESHOLD;
+ control->tbl_rai.health_percent = 0;
+ }
ras->is_rma = true;
-
- /* ignore the -ENOTSUPP return value */
- amdgpu_dpm_send_rma_reason(adev);
+ /* ignore the -ENOTSUPP return value */
+ amdgpu_dpm_send_rma_reason(adev);
+ }
}
if (control->tbl_hdr.version >= RAS_TABLE_VER_V2_1)
@@ -1509,7 +1508,7 @@ int amdgpu_ras_eeprom_check(struct amdgpu_ras_eeprom_control *control)
"RAS records:%d exceed threshold:%d\n",
control->ras_num_bad_pages, ras->bad_page_cnt_threshold);
if ((amdgpu_bad_page_threshold == -1) ||
- (amdgpu_bad_page_threshold == -2)) {
+ (amdgpu_bad_page_threshold == -2)) {
res = 0;
dev_warn(adev->dev,
"Please consult AMD Service Action Guide (SAG) for appropriate service procedures\n");
--
2.34.1
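
The control flow after the first hunk can be sketched in standalone C. This is a simplified illustration, not the driver code: struct ras_state, enum hdr_state and on_threshold_reached() are hypothetical stand-ins, the rma_status and health_percent updates are omitted, and the caller is assumed to have already decided whether the saved bad page count reached the threshold.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the driver state; not the real amdgpu types. */
enum hdr_state { HDR_OK, HDR_BAD };

struct ras_state {
	enum hdr_state header;	/* models control->tbl_hdr.header */
	bool is_rma;		/* models ras->is_rma */
	int warnings;		/* counts warnings instead of calling dev_warn() */
};

/*
 * Models the branch touched by the first hunk.
 * bad_page_threshold plays the role of the amdgpu_bad_page_threshold
 * module parameter; threshold_reached is computed by the caller.
 */
static void on_threshold_reached(struct ras_state *s, int bad_page_threshold,
				 bool threshold_reached)
{
	if (!threshold_reached)
		return;

	/* The warning is issued for every parameter value. */
	s->warnings++;

	/*
	 * Only when the parameter is neither -1 nor -2 is the table header
	 * marked bad and the device flagged as RMA. With -1 or -2 the
	 * function stops at the warning and the state stays untouched.
	 */
	if (bad_page_threshold != -1 && bad_page_threshold != -2) {
		s->header = HDR_BAD;
		s->is_rma = true;
	}
}
```

Calling on_threshold_reached() with a parameter of -1 or -2 leaves header at HDR_OK and is_rma false while still counting one warning. Any other value sets both, which matches the intent stated in the commit message.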