[PATCH] drm/amdgpu: replace DRM_ERROR with DRM_WARN in ras_reserve_bad_pages
Chen, Guchun
Guchun.Chen at amd.com
Tue Sep 17 07:04:32 UTC 2019
Yeah, that's fine.
Reviewed-by: Guchun Chen <guchun.chen at amd.com>
-----Original Message-----
From: Zhou1, Tao <Tao.Zhou1 at amd.com>
Sent: Tuesday, September 17, 2019 3:01 PM
To: Chen, Guchun <Guchun.Chen at amd.com>; amd-gfx at lists.freedesktop.org; Zhang, Hawking <Hawking.Zhang at amd.com>
Subject: RE: [PATCH] drm/amdgpu: replace DRM_ERROR with DRM_WARN in ras_reserve_bad_pages
> -----Original Message-----
> From: Chen, Guchun <Guchun.Chen at amd.com>
> Sent: Tuesday, September 17, 2019 2:52 PM
> To: Zhou1, Tao <Tao.Zhou1 at amd.com>; amd-gfx at lists.freedesktop.org;
> Zhang, Hawking <Hawking.Zhang at amd.com>
> Subject: RE: [PATCH] drm/amdgpu: replace DRM_ERROR with DRM_WARN in
> ras_reserve_bad_pages
>
>
>
> -----Original Message-----
> From: Zhou1, Tao <Tao.Zhou1 at amd.com>
> Sent: Tuesday, September 17, 2019 2:25 PM
> To: amd-gfx at lists.freedesktop.org; Chen, Guchun <Guchun.Chen at amd.com>;
> Zhang, Hawking <Hawking.Zhang at amd.com>
> Cc: Zhou1, Tao <Tao.Zhou1 at amd.com>
> Subject: [PATCH] drm/amdgpu: replace DRM_ERROR with DRM_WARN in
> ras_reserve_bad_pages
>
> There are two cases in which a reserve error should be ignored:
> 1) a ras bad page has been allocated (used by someone);
> 2) a ras bad page has been reserved (duplicate error injection for one
> page);
>
> DRM_ERROR is unnecessary when reserving a bad page fails.
>
> Signed-off-by: Tao Zhou <tao.zhou1 at amd.com>
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 7 ++++++-
> 1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> index 79e5e5be8b34..5f623daf5078 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> @@ -1409,10 +1409,15 @@ int amdgpu_ras_reserve_bad_pages(struct amdgpu_device *adev)
> for (i = data->last_reserved; i < data->count; i++) {
> bp = data->bps[i].retired_page;
>
> + /* There are two cases in which a reserve error should be ignored:
> + * 1) a ras bad page has been allocated (used by someone);
> + * 2) a ras bad page has been reserved (duplicate error injection
> + * for one page);
> + */
> if (amdgpu_bo_create_kernel_at(adev, bp << PAGE_SHIFT, PAGE_SIZE,
> AMDGPU_GEM_DOMAIN_VRAM,
> &bo, NULL))
> [Guchun] Do we need to change PAGE_SHIFT to AMDGPU_GPU_PAGE_SHIFT here?
[Tao] Alex has another patch that fixes it; you can find it on the mailing list.
>
> - DRM_ERROR("RAS ERROR: reserve vram %llx fail\n", bp);
> + DRM_WARN("RAS WARN: reserve vram for retired page %llx fail\n", bp);
>
> data->bps_bo[i] = bo;
> data->last_reserved = i + 1;
> --
> 2.17.1
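
For readers following the thread, here is a minimal userspace sketch of the behaviour the patch describes: a failed reservation is logged as a warning and the loop moves on to the next page. It is not the amdgpu code. The names toy_page_busy, toy_reserve_at and toy_reserve_bad_pages are hypothetical stand-ins, and a plain boolean array takes the place of the VRAM allocator, assuming only that a fixed-address reservation fails when the page is already owned.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define TOY_VRAM_PAGES 16 /* hypothetical VRAM size, in pages */

/* true once a page is owned, whether by a normal allocation or a reservation */
static bool toy_page_busy[TOY_VRAM_PAGES];

/* Hypothetical stand-in for a fixed-address allocation:
 * returns 0 on success, -1 if the page is out of range or already owned. */
static int toy_reserve_at(uint64_t page)
{
	if (page >= TOY_VRAM_PAGES || toy_page_busy[page])
		return -1;
	toy_page_busy[page] = true;
	return 0;
}

/* Walk the bad-page list starting at *last_reserved. A failed reservation
 * is only warned about, because the page is either in use already (case 1)
 * or was reserved by an earlier entry (case 2). Returns the warning count. */
static int toy_reserve_bad_pages(const uint64_t *bps, int count,
				 int *last_reserved)
{
	int warnings = 0;

	for (int i = *last_reserved; i < count; i++) {
		if (toy_reserve_at(bps[i])) {
			fprintf(stderr,
				"RAS WARN: reserve vram for retired page %llx fail\n",
				(unsigned long long)bps[i]);
			warnings++;
		}
		/* progress is recorded even when the reservation failed */
		*last_reserved = i + 1;
	}
	return warnings;
}
```

With page 3 marked busy beforehand and the bad-page list {3, 5, 5, 7}, the sketch warns twice (page 3 is in use, page 5 is listed twice) and still advances last_reserved to 4, so neither case stops the remaining pages from being reserved.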