[PATCH 0/4] enable umc ras ce interrupt
Chen, Guchun
Guchun.Chen at amd.com
Thu Aug 1 08:21:51 UTC 2019
1) In Patch 1, it looks like our callback always returns the UE case, but I assume the CE case should also be covered. Maybe that's a separate topic.
if (ret == AMDGPU_RAS_UE) {
+ /* these counts could be left as 0 if
+ * some blocks do not count error number
+ */
obj->err_data.ue_count += err_data.ue_count;
+ obj->err_data.ce_count += err_data.ce_count;
2) In Patch 2, the variable "ras_error_status" is unused. Do we need to remove it?
static void umc_v6_1_ras_init(struct amdgpu_device *adev) {
+ void *ras_error_status = NULL;
+ amdgpu_umc_for_each_channel(umc_v6_1_ras_init_per_channel);
}
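To make the concern in point 1 concrete, here is a minimal sketch that mirrors the quoted hunk. The type and function names (ras_cb_result, ras_err_counts, accumulate_counts) are hypothetical stand-ins, not the real amdgpu definitions, and the sketch assumes this branch is the only place where counts are accumulated.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver's types; not the real amdgpu code. */
enum ras_cb_result { RAS_CB_SUCCESS = 0, RAS_CB_UE = 1 };

struct ras_err_counts {
	unsigned long ue_count; /* uncorrectable errors */
	unsigned long ce_count; /* correctable errors */
};

/*
 * Fold the counts from one callback invocation into the running totals.
 * As in the quoted hunk, both counters are added only inside the UE branch.
 * A block that does not count errors simply reports 0 and adds nothing.
 */
static void accumulate_counts(struct ras_err_counts *total,
			      enum ras_cb_result ret,
			      const struct ras_err_counts *delta)
{
	assert(total != NULL && delta != NULL);

	if (ret == RAS_CB_UE) {
		total->ue_count += delta->ue_count;
		total->ce_count += delta->ce_count;
	}
}
```

Under these assumptions, CE counts reach the totals only when the callback reports the UE case, which is why a CE-only return path would need separate handling.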
Regards,
Guchun
-----Original Message-----
From: Zhang, Hawking <Hawking.Zhang at amd.com>
Sent: Thursday, August 1, 2019 3:52 PM
To: Zhou1, Tao <Tao.Zhou1 at amd.com>; amd-gfx at lists.freedesktop.org; Li, Dennis <Dennis.Li at amd.com>; Chen, Guchun <Guchun.Chen at amd.com>; Pan, Xinhui <Xinhui.Pan at amd.com>
Cc: Zhou1, Tao <Tao.Zhou1 at amd.com>
Subject: RE: [PATCH 0/4] enable umc ras ce interrupt
1.) Please fix the typo in the patch #2 description: ec --> ce
2.) Patch #2
+ ecc_err_cnt_sel = REG_SET_FIELD(ecc_err_cnt_sel, UMCCH0_0_EccErrCntSel,
+ EccErrInt, 0x1);
For the EccErrInt field, it should be programmed to (MAX - INIT), correct? But the hardcoded value does not seem to match the value calculated by those macros.
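One possible reading of the (MAX - INIT) relation, as a sketch only: if the error counter is preloaded with INIT and the interrupt fires when the counter reaches MAX, then MAX - INIT is the number of errors tolerated before the interrupt. The macro names and values below are hypothetical and chosen for illustration; the thread does not show the actual macros from umc_v6_1.h.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical values for illustration; not the macros from umc_v6_1.h. */
#define CE_CNT_MAX        0xffffu
#define CE_INT_THRESHOLD  0x0010u
#define CE_CNT_INIT       (CE_CNT_MAX - CE_INT_THRESHOLD)

/* Errors seen since the counter was preloaded with CE_CNT_INIT. */
static uint32_t ce_errors_since_init(uint32_t raw_cnt)
{
	assert(raw_cnt >= CE_CNT_INIT && raw_cnt <= CE_CNT_MAX);
	return raw_cnt - CE_CNT_INIT;
}

/* Nonzero once the number of counted errors reaches the threshold. */
static int ce_threshold_reached(uint32_t raw_cnt)
{
	return ce_errors_since_init(raw_cnt) >= CE_INT_THRESHOLD;
}
```

With this layout, CE_CNT_MAX - CE_CNT_INIT equals CE_INT_THRESHOLD by construction, so a hardcoded field value can be checked against the macro-derived one.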
Regards,
Hawking
-----Original Message-----
From: amd-gfx <amd-gfx-bounces at lists.freedesktop.org> On Behalf Of Tao Zhou
Sent: August 1, 2019 14:54
To: amd-gfx at lists.freedesktop.org; Zhang, Hawking <Hawking.Zhang at amd.com>; Li, Dennis <Dennis.Li at amd.com>; Chen, Guchun <Guchun.Chen at amd.com>; Pan, Xinhui <Xinhui.Pan at amd.com>
Cc: Zhou1, Tao <Tao.Zhou1 at amd.com>
Subject: [PATCH 0/4] enable umc ras ce interrupt
These patches add support for the umc ce interrupt. The interrupt is controlled by an error count threshold.
Tao Zhou (4):
drm/amdgpu: support ce interrupt in ras module
drm/amdgpu: implement umc ras init function
drm/amdgpu: update the calc algorithm of umc ecc error count
drm/amdgpu: only uncorrectable error needs gpu reset
drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 12 ++++---
drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 6 +++-
drivers/gpu/drm/amd/amdgpu/umc_v6_1.c | 42 ++++++++++++++++++++++---
drivers/gpu/drm/amd/amdgpu/umc_v6_1.h | 7 +++++
4 files changed, 58 insertions(+), 9 deletions(-)
--
2.17.1
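The policy named in the fourth patch title ("only uncorrectable error needs gpu reset") could be sketched as follows. The struct and function names are hypothetical; the patch itself is not quoted in this thread, so this shows the idea and not the actual change in gmc_v9_0.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical container for the error counts gathered by an interrupt. */
struct ras_err_counts {
	unsigned long ue_count; /* uncorrectable errors */
	unsigned long ce_count; /* correctable errors */
};

/*
 * Request a GPU reset only when at least one uncorrectable error was seen.
 * Correctable errors are counted but do not trigger a reset on their own.
 */
static bool needs_gpu_reset(const struct ras_err_counts *counts)
{
	assert(counts != NULL);
	return counts->ue_count > 0;
}
```

A CE-only interrupt would then update the counters and return without resetting the GPU.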
_______________________________________________
amd-gfx mailing list
amd-gfx at lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx