[PATCH v2] drm/amdgpu: Fix error handling in amdgpu_ras_add_bad_pages
Srinivasan Shanmugam
srinivasan.shanmugam at amd.com
Tue Dec 17 09:38:58 UTC 2024
It ensures that appropriate error codes are returned when an error
condition is detected
Fixes the below;
drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c:2849 amdgpu_ras_add_bad_pages() warn: missing error code here? 'amdgpu_umc_pages_in_a_row()' failed.
drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c:2884 amdgpu_ras_add_bad_pages() warn: missing error code here? 'amdgpu_ras_mca2pa()' failed.
Fixes: 9fe61c21405a ("drm/amdgpu: parse legacy RAS bad page mixed with new data in various NPS modes")
Reported-by: Dan Carpenter <dan.carpenter at linaro.org>
Cc: YiPeng Chai <yipeng.chai at amd.com>
Cc: Tao Zhou <tao.zhou1 at amd.com>
Cc: Hawking Zhang <Hawking.Zhang at amd.com>
Cc: Christian König <christian.koenig at amd.com>
Cc: Alex Deucher <alexander.deucher at amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam at amd.com>
---
v2:
- s/-EIO/-EINVAL, retained the use of -EINVAL from
amdgpu_umc_pages_in_a_row & and amdgpu_ras_mca2pa_by_idx, when the
RAS context is not initialized or the convert_ras_err_addr function is
unavailable. (Thomas)
drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 21 ++++++++++++++++-----
1 file changed, 16 insertions(+), 5 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
index 01c947066a2e..f1371d1f8421 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
@@ -2832,8 +2832,10 @@ int amdgpu_ras_add_bad_pages(struct amdgpu_device *adev,
mutex_lock(&con->recovery_lock);
data = con->eh_data;
- if (!data)
+ if (!data) {
+ ret = -EINVAL;
goto free;
+ }
for (i = 0; i < pages; i++) {
if (from_rom &&
@@ -2845,26 +2847,34 @@ int amdgpu_ras_add_bad_pages(struct amdgpu_device *adev,
* one row
*/
if (amdgpu_umc_pages_in_a_row(adev, &err_data,
- bps[i].retired_page << AMDGPU_GPU_PAGE_SHIFT))
+ bps[i].retired_page <<
+ AMDGPU_GPU_PAGE_SHIFT)) {
+ ret = -EINVAL;
goto free;
- else
+ } else {
find_pages_per_pa = true;
+ }
} else {
/* unsupported cases */
+ ret = -EOPNOTSUPP;
goto free;
}
}
} else {
if (amdgpu_umc_pages_in_a_row(adev, &err_data,
- bps[i].retired_page << AMDGPU_GPU_PAGE_SHIFT))
+ bps[i].retired_page << AMDGPU_GPU_PAGE_SHIFT)) {
+ ret = -EINVAL;
goto free;
+ }
}
} else {
if (from_rom && !find_pages_per_pa) {
if (bps[i].retired_page & UMC_CHANNEL_IDX_V2) {
/* bad page in any NPS mode in eeprom */
- if (amdgpu_ras_mca2pa_by_idx(adev, &bps[i], &err_data))
+ if (amdgpu_ras_mca2pa_by_idx(adev, &bps[i], &err_data)) {
+ ret = -EINVAL;
goto free;
+ }
} else {
/* legacy bad page in eeprom, generated only in
* NPS1 mode
@@ -2881,6 +2891,7 @@ int amdgpu_ras_add_bad_pages(struct amdgpu_device *adev,
/* non-nps1 mode, old RAS TA
* can't support it
*/
+ ret = -EOPNOTSUPP;
goto free;
}
}
--
2.34.1
More information about the amd-gfx
mailing list