[PATCH] drm/amdgpu: Switch to aca bank for xgmi pcs err cnt

Yang, Stanley Stanley.Yang at amd.com
Tue Dec 12 14:05:42 UTC 2023



Reviewed-by: Stanley.Yang <Stanley.Yang at amd.com>

Regards,
Stanley
> -----Original Message-----
> From: Zhang, Hawking <Hawking.Zhang at amd.com>
> Sent: Tuesday, December 12, 2023 10:03 PM
> To: amd-gfx at lists.freedesktop.org; Yang, Stanley <Stanley.Yang at amd.com>;
> Wang, Yang(Kevin) <KevinYang.Wang at amd.com>
> Cc: Zhang, Hawking <Hawking.Zhang at amd.com>
> Subject: [PATCH] drm/amdgpu: Switch to aca bank for xgmi pcs err cnt
>
> Read the xgmi pcs error count from the ErrCnt field of the aca bank's
> MISC0 register instead of using software managed counters.
>
> Signed-off-by: Hawking Zhang <Hawking.Zhang at amd.com>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_mca.h              | 2 ++
>  drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c | 6 ++++--
>  2 files changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mca.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_mca.h
> index e51e8918e667..b399f1b62887 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mca.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mca.h
> @@ -46,6 +46,8 @@
>  #define MCA_REG__STATUS__ERRORCODEEXT(x)     MCA_REG_FIELD(x, 21, 16)
>  #define MCA_REG__STATUS__ERRORCODE(x)                MCA_REG_FIELD(x, 15, 0)
>
> +#define MCA_REG__MISC0__ERRCNT(x)            MCA_REG_FIELD(x, 43, 32)
> +
>  #define MCA_REG__SYND__ERRORINFORMATION(x)   MCA_REG_FIELD(x, 17, 0)
>
>  enum amdgpu_mca_ip {
> diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c
> index ddd782fbee7a..3998c9b31d07 100644
> --- a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c
> +++ b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c
> @@ -2537,13 +2537,15 @@ static int mca_pcs_xgmi_mca_get_err_count(const struct mca_ras_info *mca_ras, st
>                                         uint32_t *count)
>  {
>       u32 ext_error_code;
> +     u32 err_cnt;
>
>       ext_error_code = MCA_REG__STATUS__ERRORCODEEXT(entry->regs[MCA_REG_IDX_STATUS]);
> +     err_cnt = MCA_REG__MISC0__ERRCNT(entry->regs[MCA_REG_IDX_MISC0]);
>
>       if (type == AMDGPU_MCA_ERROR_TYPE_UE && ext_error_code == 0)
> -             *count = 1;
> +             *count = err_cnt;
>       else if (type == AMDGPU_MCA_ERROR_TYPE_CE && ext_error_code == 6)
> -             *count = 1;
> +             *count = err_cnt;
>
>       return 0;
>  }
> --
> 2.17.1
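
For readers following the patch outside the kernel tree, here is a rough
userspace sketch of what the new MISC0 ErrCnt read does: mask bits 43:32 of
the 64-bit register value and use the result as the count when the error
type and extended error code match. Everything prefixed sketch_ or SKETCH_
is a hypothetical stand-in, not the driver's API. In particular, the mask
helper assumes MCA_REG_FIELD(x, h, l) selects bits h..l and shifts them down
to bit 0, and returning 0 on no match is a choice made for this sketch (the
quoted function leaves *count untouched in that case).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a "bits h..l set" 64-bit mask helper. */
#define SKETCH_GENMASK_ULL(h, l) \
	(((~0ULL) >> (63 - (h))) & ((~0ULL) << (l)))

/* Assumed behaviour of MCA_REG_FIELD: select bits h..l, shift down to bit 0. */
#define SKETCH_REG_FIELD(x, h, l) \
	((((uint64_t)(x)) & SKETCH_GENMASK_ULL(h, l)) >> (l))

/* Bit positions taken from the macros in the quoted patch. */
#define SKETCH_STATUS_ERRORCODEEXT(x)	SKETCH_REG_FIELD(x, 21, 16)
#define SKETCH_MISC0_ERRCNT(x)		SKETCH_REG_FIELD(x, 43, 32)

enum sketch_err_type { SKETCH_ERR_UE, SKETCH_ERR_CE };

/*
 * Mirrors the decision in the patched function: the count comes from
 * MISC0 ErrCnt when (UE, ext code 0) or (CE, ext code 6) matches.
 * Returns 0 otherwise (sketch-only choice, see the note above).
 */
static uint32_t sketch_xgmi_pcs_err_count(enum sketch_err_type type,
					  uint64_t status, uint64_t misc0)
{
	uint32_t ext_error_code = (uint32_t)SKETCH_STATUS_ERRORCODEEXT(status);
	uint32_t err_cnt = (uint32_t)SKETCH_MISC0_ERRCNT(misc0);

	if (type == SKETCH_ERR_UE && ext_error_code == 0)
		return err_cnt;
	if (type == SKETCH_ERR_CE && ext_error_code == 6)
		return err_cnt;

	return 0;
}

/* Usage example with made-up register values. */
static void sketch_self_check(void)
{
	/* ErrCnt = 0xABC in bits 43:32; the other bits are noise. */
	uint64_t misc0 = 0xF0000ABC12345678ULL;
	/* Extended error code 6 in bits 21:16. */
	uint64_t status_ce = 6ULL << 16;

	assert(SKETCH_MISC0_ERRCNT(misc0) == 0xABC);
	assert(sketch_xgmi_pcs_err_count(SKETCH_ERR_CE, status_ce, misc0) == 0xABC);
	assert(sketch_xgmi_pcs_err_count(SKETCH_ERR_UE, 0, misc0) == 0xABC);
	assert(sketch_xgmi_pcs_err_count(SKETCH_ERR_UE, status_ce, misc0) == 0);
}
```

The field is 12 bits wide, so a single bank can report up to 4095 errors in
this sketch, where the previous code always reported 1 per matching bank.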


