[PATCH 1/2] drm/amdgpu: fix build up and tear down of debug vram access bounce buffer

Chen, Guchun Guchun.Chen at amd.com
Tue Jan 18 08:09:19 UTC 2022


[Public]

Hi Christian,

Re: Well, that doesn't seem to make sense; the GART is initialized by the code around the allocation, so that should work fine.

Below is the call trace seen during driver probe. Binding the page (the SDMA BO) into the GART table checks gart.ready, which is only set to true later on, in gmc_v10_0_hw_init. The bind here is reached from gmc_v10_0_sw_init, before that happens, so the call trace is observed.

[    3.381510]  amdgpu_ttm_gart_bind+0x80/0xc0 [amdgpu]
[    3.381580]  amdgpu_ttm_alloc_gart+0x158/0x1c0 [amdgpu]
[    3.381647]  amdgpu_bo_create_reserved+0x136/0x1e0 [amdgpu]
[    3.381714]  ? amdgpu_ttm_debugfs_init+0x120/0x120 [amdgpu]
[    3.381782]  amdgpu_bo_create_kernel+0x17/0x80 [amdgpu]
[    3.381849]  amdgpu_ttm_init.cold+0x174/0x18e [amdgpu]
[    3.381951]  ? vprintk_default+0x1d/0x20
[    3.381955]  ? printk+0x58/0x6f
[    3.381957]  amdgpu_bo_init.cold+0x4b/0x53 [amdgpu]
[    3.382052]  gmc_v10_0_sw_init+0x304/0x490 [amdgpu]
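
To make the ordering concrete, here is a minimal standalone sketch. It is not driver code: the toy_* names and structs are invented for illustration, and the only thing it models is a bind that checks a ready flag which is set in the hw_init step, after sw_init.

```c
#include <stdbool.h>

/* Toy stand-ins for driver state; these are NOT the real amdgpu types. */
struct toy_gart {
	bool ready;		/* models gart.ready */
};

struct toy_dev {
	struct toy_gart gart;
	bool bounce_bound;	/* bounce buffer bound into the GART */
	int warnings;		/* number of failed ready checks */
};

/* Models the gart.ready check that the bind path runs into. */
static int toy_gart_bind(struct toy_dev *dev)
{
	if (!dev->gart.ready) {
		dev->warnings++;	/* the real driver prints a trace here */
		return -1;
	}
	dev->bounce_bound = true;
	return 0;
}

/* Models sw_init: optionally allocates and binds the bounce buffer. */
static void toy_sw_init(struct toy_dev *dev, bool bind_here)
{
	if (bind_here)
		toy_gart_bind(dev);
}

/* Models hw_init: only at this point is the GART marked ready. */
static void toy_hw_init(struct toy_dev *dev)
{
	dev->gart.ready = true;
}

/*
 * Runs the probe sequence sw_init -> hw_init and returns how many
 * times the ready check failed. If bind_in_sw_init is false, the
 * bind is deferred until after hw_init.
 */
static int toy_probe(bool bind_in_sw_init)
{
	struct toy_dev dev = { 0 };

	toy_sw_init(&dev, bind_in_sw_init);
	toy_hw_init(&dev);
	if (!bind_in_sw_init)
		toy_gart_bind(&dev);
	return dev.warnings;
}
```

In this model toy_probe(true) reports one failed check, matching the trace above, while toy_probe(false) reports none because the bind runs after the ready flag is set.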

Regards,
Guchun

-----Original Message-----
From: Koenig, Christian <Christian.Koenig at amd.com> 
Sent: Tuesday, January 18, 2022 3:30 PM
To: Kim, Jonathan <Jonathan.Kim at amd.com>; amd-gfx at lists.freedesktop.org
Cc: Kuehling, Felix <Felix.Kuehling at amd.com>; Chen, Guchun <Guchun.Chen at amd.com>
Subject: Re: [PATCH 1/2] drm/amdgpu: fix build up and tear down of debug vram access bounce buffer

Am 18.01.22 um 00:43 schrieb Jonathan Kim:
> Move the debug sdma vram bounce buffer GART map on device init to after
> the GART is ready, to avoid warnings and non-access to SDMA.

Well, that doesn't seem to make sense; the GART is initialized by the code around the allocation, so that should work fine.

Freeing the BO indeed needs to be moved up, but that can still be in the same function.

Also please don't move TTM related code outside of the TTM code in amdgpu.

Regards,
Christian.

>
> Also move bounce buffer tear down after the memory manager has flushed 
> queued work for safety.
>
> Signed-off-by: Jonathan Kim <jonathan.kim at amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 11 +++++++++++
>   drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c    |  8 --------
>   2 files changed, 11 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index da3348fa7b0e..099460d15258 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -2378,6 +2378,13 @@ static int amdgpu_device_ip_init(struct amdgpu_device *adev)
>   	if (r)
>   		goto init_failed;
>   
> +	/* GTT bounce buffer for debug vram access over sdma. */
> +	if (amdgpu_bo_create_kernel(adev, PAGE_SIZE, PAGE_SIZE,
> +				AMDGPU_GEM_DOMAIN_GTT,
> +				&adev->mman.sdma_access_bo, NULL,
> +				&adev->mman.sdma_access_ptr))
> +		DRM_WARN("Debug VRAM access will use slowpath MM access\n");
> +
>   	/*
>   	 * retired pages will be loaded from eeprom and reserved here,
>   	 * it should be called after amdgpu_device_ip_hw_init_phase2  since 
> @@ -3872,6 +3879,10 @@ void amdgpu_device_fini_hw(struct amdgpu_device *adev)
>   	}
>   	adev->shutdown = true;
>   
> +	/* remove debug vram sdma access bounce buffer. */
> +	amdgpu_bo_free_kernel(&adev->mman.sdma_access_bo, NULL,
> +					&adev->mman.sdma_access_ptr);
> +
>   	/* make sure IB test finished before entering exclusive mode
>   	 * to avoid preemption on IB test
>   	 * */
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> index b489cd8abe31..6178ae7ba624 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> @@ -1855,12 +1855,6 @@ int amdgpu_ttm_init(struct amdgpu_device *adev)
>   		return r;
>   	}
>   
> -	if (amdgpu_bo_create_kernel(adev, PAGE_SIZE, PAGE_SIZE,
> -				AMDGPU_GEM_DOMAIN_GTT,
> -				&adev->mman.sdma_access_bo, NULL,
> -				adev->mman.sdma_access_ptr))
> -		DRM_WARN("Debug VRAM access will use slowpath MM access\n");
> -
>   	return 0;
>   }
>   
> @@ -1901,8 +1895,6 @@ void amdgpu_ttm_fini(struct amdgpu_device *adev)
>   	ttm_range_man_fini(&adev->mman.bdev, AMDGPU_PL_OA);
>   	ttm_device_fini(&adev->mman.bdev);
>   	adev->mman.initialized = false;
> -	amdgpu_bo_free_kernel(&adev->mman.sdma_access_bo, NULL,
> -					&adev->mman.sdma_access_ptr);
>   	DRM_INFO("amdgpu: ttm finalized\n");
>   }
>   

