[PATCH 1/2] drm/amdgpu: prevent memory wipe in suspend/shutdown stage

Chen, Guchun Guchun.Chen at amd.com
Tue Mar 15 07:49:18 UTC 2022


I used two tabs in VIM. Let me update this later.

Regards,
Guchun

-----Original Message-----
From: Koenig, Christian <Christian.Koenig at amd.com> 
Sent: Tuesday, March 15, 2022 3:35 PM
To: Chen, Guchun <Guchun.Chen at amd.com>; amd-gfx at lists.freedesktop.org; Zhang, Hawking <Hawking.Zhang at amd.com>; Pan, Xinhui <Xinhui.Pan at amd.com>; Deucher, Alexander <Alexander.Deucher at amd.com>
Subject: Re: [PATCH 1/2] drm/amdgpu: prevent memory wipe in suspend/shutdown stage



Am 15.03.22 um 08:09 schrieb Guchun Chen:
> On GPUs with RAS enabled, the call trace below is observed when
> suspending or shutting down the device. The cause is that the memory
> wipe flag is enabled by default for BOs on such GPUs, so these BOs are
> wiped by amdgpu_fill_buffer on release. Because the ring is already
> off at that point, the wipe fails and this error message is thrown.
> So add a suspend/shutdown check before wiping memory.
>
> [drm:amdgpu_fill_buffer [amdgpu]] *ERROR* Trying to clear memory with ring turned off.
>
> Fixes: e7e7c87a205d ("drm/amdgpu: Wipe all VRAM on free when RAS is enabled")
> Signed-off-by: Guchun Chen <guchun.chen at amd.com>

Just one nit below, but either way the patch is Reviewed-by: Christian König <christian.koenig at amd.com>.

> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 4 +++-
>   1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> index 23c9a60693ee..ed1a19be4a54 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> @@ -1284,6 +1284,7 @@ void amdgpu_bo_get_memory(struct amdgpu_bo *bo, uint64_t *vram_mem,
>    */
>   void amdgpu_bo_release_notify(struct ttm_buffer_object *bo)
>   {
> +	struct amdgpu_device *adev = amdgpu_ttm_adev(bo->bdev);
>   	struct dma_fence *fence = NULL;
>   	struct amdgpu_bo *abo;
>   	int r;
> @@ -1303,7 +1304,8 @@ void amdgpu_bo_release_notify(struct ttm_buffer_object *bo)
>   		amdgpu_amdkfd_remove_fence_on_pt_pd_bos(abo);
>   
>   	if (bo->resource->mem_type != TTM_PL_VRAM ||
> -	    !(abo->flags & AMDGPU_GEM_CREATE_VRAM_WIPE_ON_RELEASE))
> +		!(abo->flags & AMDGPU_GEM_CREATE_VRAM_WIPE_ON_RELEASE) ||
> +		adev->in_suspend || adev->shutdown)

What editor and settings are you using?

When an if condition spans multiple lines, each continuation line should start in the column just after the opening ( of the first line, but this patch indents them with two tabs instead.
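
A minimal sketch of that alignment, assuming tabs are 8 columns wide. The struct, macros and function name are hypothetical stand-ins for illustration, not the real amdgpu definitions; only the shape of the condition mirrors the patch.

```c
#include <stdbool.h>

/* Hypothetical stand-ins; the real driver uses its own types and flags. */
#define MEM_TYPE_VRAM		2
#define FLAG_WIPE_ON_RELEASE	(1u << 9)

struct fake_dev {
	bool in_suspend;
	bool shutdown;
};

/* Returns true when the memory wipe should be skipped. */
bool skip_wipe(const struct fake_dev *dev, int mem_type,
	       unsigned int flags)
{
	/*
	 * Continuation lines begin in the column right after the
	 * opening '(' of the if: one tab plus four spaces here,
	 * rather than two tabs.
	 */
	if (mem_type != MEM_TYPE_VRAM ||
	    !(flags & FLAG_WIPE_ON_RELEASE) ||
	    dev->in_suspend || dev->shutdown)
		return true;

	return false;
}
```

The same rule applies to the wrapped parameter list above: the second line starts under the first parameter, after the opening parenthesis.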

Regards,
Christian.

>   		return;
>   
>   	if (WARN_ON_ONCE(!dma_resv_trylock(bo->base.resv)))


