[PATCH 1/2] drm/amdgpu: fix find ras error node error

Yang Wang kevinyang.wang at amd.com
Fri Oct 20 07:00:25 UTC 2023


the origin function might return the wrong node.

Fixes: d479ef0d5fbd ("drm/amdgpu: add ras_err_info to identify RAS error source")

Signed-off-by: Yang Wang <kevinyang.wang at amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
index 70dd249f2ba7..d5bcfcf4ced2 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
@@ -3518,11 +3518,10 @@ static struct ras_err_node *amdgpu_ras_error_find_node_by_id(struct ras_err_data
 
 	for_each_ras_error(err_node, err_data) {
 		ref_id = &err_node->err_info.mcm_info;
-		if ((mcm_info->socket_id >= 0 && mcm_info->socket_id != ref_id->socket_id) ||
-		    (mcm_info->die_id >= 0 && mcm_info->die_id != ref_id->die_id))
-			continue;
 
-		return err_node;
+		if (mcm_info->socket_id == ref_id->socket_id &&
+		    mcm_info->die_id == ref_id->die_id)
+			return err_node;
 	}
 
 	return NULL;
-- 
2.34.1



More information about the amd-gfx mailing list