[PATCH 1/2] drm/amdgpu: prevent memory wipe in suspend/shutdown stage

Christian König christian.koenig at amd.com
Tue Mar 15 07:35:06 UTC 2022



On 15.03.22 at 08:09, Guchun Chen wrote:
> On GPUs with RAS enabled, the error below is observed when
> suspending or shutting down the device. The cause is that the
> memory wipe flag is enabled by default for BOs on such GPUs, so
> these BOs are wiped through amdgpu_fill_buffer on release. Because
> the ring is already off at that point, the wipe fails and this
> error message is thrown. So add a suspend/shutdown check before
> wiping memory.
>
> [drm:amdgpu_fill_buffer [amdgpu]] *ERROR* Trying to clear memory with ring turned off.
>
> Fixes: e7e7c87a205d ("drm/amdgpu: Wipe all VRAM on free when RAS is enabled")
> Signed-off-by: Guchun Chen <guchun.chen at amd.com>

Just one nit below, but either way the patch is
Reviewed-by: Christian König <christian.koenig at amd.com>.

> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 4 +++-
>   1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> index 23c9a60693ee..ed1a19be4a54 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> @@ -1284,6 +1284,7 @@ void amdgpu_bo_get_memory(struct amdgpu_bo *bo, uint64_t *vram_mem,
>    */
>   void amdgpu_bo_release_notify(struct ttm_buffer_object *bo)
>   {
> +	struct amdgpu_device *adev = amdgpu_ttm_adev(bo->bdev);
>   	struct dma_fence *fence = NULL;
>   	struct amdgpu_bo *abo;
>   	int r;
> @@ -1303,7 +1304,8 @@ void amdgpu_bo_release_notify(struct ttm_buffer_object *bo)
>   		amdgpu_amdkfd_remove_fence_on_pt_pd_bos(abo);
>   
>   	if (bo->resource->mem_type != TTM_PL_VRAM ||
> -	    !(abo->flags & AMDGPU_GEM_CREATE_VRAM_WIPE_ON_RELEASE))
> +		!(abo->flags & AMDGPU_GEM_CREATE_VRAM_WIPE_ON_RELEASE) ||
> +		adev->in_suspend || adev->shutdown)

What editor and settings are you using?

When an if condition spans multiple lines, each continuation line should 
start in the column after the ( of the first line, but this one uses two 
tabs instead.
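
For illustration, the alignment described above can be sketched in a small 
standalone snippet. The struct, macro, and function names here (demo_dev, 
demo_bo, DEMO_PL_VRAM, DEMO_WIPE_ON_RELEASE, demo_skip_wipe) are hypothetical 
stand-ins and not the driver's real definitions; only the shape of the 
condition follows the hunk above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the driver types and flags; not the real amdgpu ones. */
#define DEMO_PL_VRAM		2u
#define DEMO_WIPE_ON_RELEASE	(1u << 9)

struct demo_dev {
	bool in_suspend;
	bool shutdown;
};

struct demo_bo {
	uint32_t mem_type;
	uint64_t flags;
};

/* Returns true when the memory wipe should be skipped. */
static bool demo_skip_wipe(const struct demo_dev *adev,
			   const struct demo_bo *bo)
{
	/*
	 * Continuation lines are indented with one tab plus four spaces,
	 * so they start in the column right after the "(" of "if (".
	 */
	if (bo->mem_type != DEMO_PL_VRAM ||
	    !(bo->flags & DEMO_WIPE_ON_RELEASE) ||
	    adev->in_suspend || adev->shutdown)
		return true;

	return false;
}
```

With this layout every sub-condition begins in the same column, so the 
condition reads as one block and is visually distinct from the body, which 
is indented by two tabs.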

Regards,
Christian.

>   		return;
>   
>   	if (WARN_ON_ONCE(!dma_resv_trylock(bo->base.resv)))


