[PATCH] drm/amdgpu: update ras sysfs feature info

Zhang, Hawking Hawking.Zhang at amd.com
Thu Aug 8 13:50:45 UTC 2019


Understood and agree we should keep stable interfaces. 

But the information in feature node is not correct and makes people confusing. Basically, each IP blocks can support all the four error types, not just multi-uncorrectable. As a result, any upper level apps/libs that read from this file will just get confusing information as well. The feature mask is already good enough for this node.

Regards,
Hawking
-----Original Message-----
From: Russell, Kent <Kent.Russell at amd.com> 
Sent: 2019年8月8日 20:51
To: Zhang, Hawking <Hawking.Zhang at amd.com>; Zhou1, Tao <Tao.Zhou1 at amd.com>; amd-gfx at lists.freedesktop.org; Pan, Xinhui <Xinhui.Pan at amd.com>; Freehill, Chris <Chris.Freehill at amd.com>
Cc: Zhou1, Tao <Tao.Zhou1 at amd.com>
Subject: RE: [PATCH] drm/amdgpu: update ras sysfs feature info

+Chris Freehill

While I can understand this change, this broke our SMI interface, which was expecting a specific string format for the ras/features file. This has happened a few times now, where changes to the RAS sysfs files has broke the SMI CLI and/or SMI LIB. Can we please get a stable interface and sysfs format set up before publishing patches? This is creating a lot of extra work for developers with the SMI to constantly keep up with the changes being made to sysfs files. Thank you.

 Kent

-----Original Message-----
From: amd-gfx <amd-gfx-bounces at lists.freedesktop.org> On Behalf Of Zhang, Hawking
Sent: Monday, August 5, 2019 4:15 AM
To: Zhou1, Tao <Tao.Zhou1 at amd.com>; amd-gfx at lists.freedesktop.org; Pan, Xinhui <Xinhui.Pan at amd.com>
Cc: Zhou1, Tao <Tao.Zhou1 at amd.com>
Subject: RE: [PATCH] drm/amdgpu: update ras sysfs feature info

Reviewed-by: Hawking Zhang <Hawking.Zhang at amd.com>

Regards,
Hawking
-----Original Message-----
From: amd-gfx <amd-gfx-bounces at lists.freedesktop.org> On Behalf Of Tao Zhou
Sent: 2019年8月5日 16:04
To: amd-gfx at lists.freedesktop.org; Pan, Xinhui <Xinhui.Pan at amd.com>; Zhang, Hawking <Hawking.Zhang at amd.com>
Cc: Zhou1, Tao <Tao.Zhou1 at amd.com>
Subject: [PATCH] drm/amdgpu: update ras sysfs feature info

remove confused ras error type info

Signed-off-by: Tao Zhou <tao.zhou1 at amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 17 +++++------------
 1 file changed, 5 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
index d2e8a85f6e38..369651247b23 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
@@ -787,25 +787,18 @@ static ssize_t amdgpu_ras_sysfs_features_read(struct device *dev,
 	struct amdgpu_device *adev = ddev->dev_private;
 	struct ras_common_if head;
 	int ras_block_count = AMDGPU_RAS_BLOCK_COUNT;
-	int i;
+	int i, enabled;
 	ssize_t s;
-	struct ras_manager *obj;
 
 	s = scnprintf(buf, PAGE_SIZE, "feature mask: 0x%x\n", con->features);
 
 	for (i = 0; i < ras_block_count; i++) {
 		head.block = i;
+		enabled = amdgpu_ras_is_feature_enabled(adev, &head);
 
-		if (amdgpu_ras_is_feature_enabled(adev, &head)) {
-			obj = amdgpu_ras_find_obj(adev, &head);
-			s += scnprintf(&buf[s], PAGE_SIZE - s,
-					"%s: %s\n",
-					ras_block_str(i),
-					ras_err_str(obj->head.type));
-		} else
-			s += scnprintf(&buf[s], PAGE_SIZE - s,
-					"%s: disabled\n",
-					ras_block_str(i));
+		s += scnprintf(&buf[s], PAGE_SIZE - s,
+				"%s ras feature mask: %s\n",
+				ras_block_str(i), enabled?"on":"off");
 	}
 
 	return s;
-- 
2.17.1

_______________________________________________
amd-gfx mailing list
amd-gfx at lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
_______________________________________________
amd-gfx mailing list
amd-gfx at lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


More information about the amd-gfx mailing list