[PATCH 2/2] drm/amdgpu: add debugfs node to toggle ras error cnt harvest
Li, Dennis
Dennis.Li at amd.com
Mon Aug 10 05:26:10 UTC 2020
[AMD Official Use Only - Internal Distribution Only]
Reviewed-by: DennisLi <Dennis.Li at amd.com>
Best Regards
Dennis Li
-----Original Message-----
From: Chen, Guchun <Guchun.Chen at amd.com>
Sent: Monday, August 10, 2020 1:23 PM
To: amd-gfx at lists.freedesktop.org; Zhang, Hawking <Hawking.Zhang at amd.com>; Li, Dennis <Dennis.Li at amd.com>; Lazar, Lijo <Lijo.Lazar at amd.com>; Zhou1, Tao <Tao.Zhou1 at amd.com>; Clements, John <John.Clements at amd.com>
Cc: Chen, Guchun <Guchun.Chen at amd.com>
Subject: [PATCH 2/2] drm/amdgpu: add debugfs node to toggle ras error cnt harvest
Before ras recovery is issued, user could operate this debugfs node to enable/disable the harvest of all RAS IPs' ras error count registers, which will help keep hardware's registers'
status instead of cleaning up them.
Signed-off-by: Guchun Chen <guchun.chen at amd.com>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
index e6978b8e2143..31df6bf2dc1f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
@@ -1215,6 +1215,13 @@ static void amdgpu_ras_debugfs_create_ctrl_node(struct amdgpu_device *adev)
*/
debugfs_create_bool("auto_reboot", S_IWUGO | S_IRUGO, con->dir,
&con->reboot);
+
+ /*
+ * User could set this not to clean up hardware's error count register
+ * of RAS IPs during ras recovery.
+ */
+ debugfs_create_bool("disable_ras_err_cnt_harvest", 0644,
+ con->dir, &con->disable_ras_err_cnt_harvest);
}
void amdgpu_ras_debugfs_create(struct amdgpu_device *adev,
--
2.17.1
More information about the amd-gfx
mailing list