[PATCH] drm/amdgpu: optimize ACA log print
Wang, Yang(Kevin)
KevinYang.Wang at amd.com
Fri Oct 25 09:35:36 UTC 2024
Fix typo, DE -> UE.
Best Regards,
Kevin
-----Original Message-----
From: Wang, Yang(Kevin)
Sent: Friday, October 25, 2024 5:20 PM
To: Lazar, Lijo <Lijo.Lazar at amd.com>; amd-gfx at lists.freedesktop.org
Cc: Zhang, Hawking <Hawking.Zhang at amd.com>; Zhou1, Tao <Tao.Zhou1 at amd.com>
Subject: RE: [PATCH] drm/amdgpu: optimize ACA log print
-----Original Message-----
From: Lazar, Lijo <Lijo.Lazar at amd.com>
Sent: Friday, October 25, 2024 3:25 PM
To: Wang, Yang(Kevin) <KevinYang.Wang at amd.com>; amd-gfx at lists.freedesktop.org
Cc: Zhang, Hawking <Hawking.Zhang at amd.com>; Zhou1, Tao <Tao.Zhou1 at amd.com>
Subject: Re: [PATCH] drm/amdgpu: optimize ACA log print
On 10/25/2024 12:49 PM, Yang Wang wrote:
> - skip to print CE ACA log.
> - optimize ACA log print for MCA.
>
> Signed-off-by: Yang Wang <kevinyang.wang at amd.com>
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_mca.c | 21 ++++++++++++++++++++-
> 1 file changed, 20 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mca.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_mca.c
> index 18ee60378727..3ca03b5e0f91 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mca.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mca.c
> @@ -348,6 +348,24 @@ static bool amdgpu_mca_bank_should_update(struct amdgpu_device *adev, enum amdgp
> return ret;
> }
>
> +static bool amdgpu_mca_bank_should_dump(struct amdgpu_device *adev, enum amdgpu_mca_error_type type,
> + struct mca_bank_entry *entry)
> +{
> + bool ret;
> +
> + switch (type) {
> + case AMDGPU_MCA_ERROR_TYPE_CE:
> + ret = amdgpu_mca_is_deferred_error(adev, entry->regs[MCA_REG_IDX_STATUS]);
AFAIK, deferred errors are not correctable. Shouldn't it be checked against AMDGPU_MCA_ERROR_TYPE_DE?
Thanks,
Lijo
[kevin]:
In this case, the type indicates the SMU bank channel rather than the error severity; only the CE and UE bank channels are available on the SMU side.
Best Regards,
Kevin
> + break;
> + case AMDGPU_MCA_ERROR_TYPE_UE:
> + default:
> + ret = true;
> + break;
> + }
> +
> + return ret;
> +}
> +
> static int amdgpu_mca_smu_get_mca_set(struct amdgpu_device *adev, enum amdgpu_mca_error_type type, struct mca_bank_set *mca_set,
> struct ras_query_context *qctx)
> {
> @@ -373,7 +391,8 @@ static int amdgpu_mca_smu_get_mca_set(struct amdgpu_device *adev, enum amdgpu_mc
>
> amdgpu_mca_bank_set_add_entry(mca_set, &entry);
>
> - amdgpu_mca_smu_mca_bank_dump(adev, i, &entry, qctx);
> + if (amdgpu_mca_bank_should_dump(adev, type, &entry))
> + amdgpu_mca_smu_mca_bank_dump(adev, i, &entry, qctx);
> }
>
> return 0;
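
For readers following the thread, the filter discussed above can be modelled outside the driver roughly as follows. This is only a sketch under stated assumptions: the enum, the `sketch_` helpers, and the choice of bit 44 as the deferred flag in the status word are illustrative stand-ins, not the actual amdgpu definitions or the real `amdgpu_mca_is_deferred_error()` implementation.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for enum amdgpu_mca_error_type; here the value
 * names the SMU bank channel being read, as explained in the reply. */
enum sketch_mca_channel {
	SKETCH_MCA_CHANNEL_UE = 0,
	SKETCH_MCA_CHANNEL_CE,
};

/* Assumption: bit 44 of the MCA status word marks a deferred error. */
#define SKETCH_MCA_STATUS_DEFERRED (1ULL << 44)

/* Hypothetical replacement for amdgpu_mca_is_deferred_error(). */
static bool sketch_is_deferred_error(uint64_t status)
{
	return (status & SKETCH_MCA_STATUS_DEFERRED) != 0;
}

/* Same shape as the switch in the patch: banks read from the CE channel
 * are dumped only when they carry a deferred error, so plain correctable
 * errors stay out of the log; banks from the UE channel (or any other
 * value) are always dumped. */
static bool sketch_bank_should_dump(enum sketch_mca_channel channel,
				    uint64_t status)
{
	switch (channel) {
	case SKETCH_MCA_CHANNEL_CE:
		return sketch_is_deferred_error(status);
	case SKETCH_MCA_CHANNEL_UE:
	default:
		return true;
	}
}
```

With this model, a CE-channel bank whose status has the assumed deferred bit clear is skipped, a CE-channel bank with the bit set is dumped, and every UE-channel bank is dumped regardless of status.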