[PATCH 1/1] drm/amdgpu: Set vmbo destroy after pt bo is created

Christian König christian.koenig at amd.com
Tue Oct 4 11:13:10 UTC 2022


On 03.10.22 at 19:20, Philip Yang wrote:
> Under VRAM usage pressure, mapping to the GPU may fail to create the pt bo,
> leaving vmbo->shadow_list uninitialized. ttm_bo_release then calls
> amdgpu_bo_vm_destroy, which accesses vmbo->shadow_list and generates the
> dmesg output and NULL pointer access backtrace below:
>
> Set vmbo destroy callback to amdgpu_bo_vm_destroy only after creating pt
> bo successfully, otherwise use default callback amdgpu_bo_destroy.
>
> amdgpu: amdgpu_vm_bo_update failed
> amdgpu: update_gpuvm_pte() failed
> amdgpu: Failed to map bo to gpuvm
> amdgpu 0000:43:00.0: amdgpu: Failed to map peer:0000:43:00.0 mem_domain:2
> BUG: kernel NULL pointer dereference, address:
>   RIP: 0010:amdgpu_bo_vm_destroy+0x4d/0x80 [amdgpu]
>   Call Trace:
>    <TASK>
>    ttm_bo_release+0x207/0x320 [amdttm]
>    amdttm_bo_init_reserved+0x1d6/0x210 [amdttm]
>    amdgpu_bo_create+0x1ba/0x520 [amdgpu]
>    amdgpu_bo_create_vm+0x3a/0x80 [amdgpu]
>    amdgpu_vm_pt_create+0xde/0x270 [amdgpu]
>    amdgpu_vm_ptes_update+0x63b/0x710 [amdgpu]
>    amdgpu_vm_update_range+0x2e7/0x6e0 [amdgpu]
>    amdgpu_vm_bo_update+0x2bd/0x600 [amdgpu]
>    update_gpuvm_pte+0x160/0x420 [amdgpu]
>    amdgpu_amdkfd_gpuvm_map_memory_to_gpu+0x313/0x1130 [amdgpu]
>    kfd_ioctl_map_memory_to_gpu+0x115/0x390 [amdgpu]
>    kfd_ioctl+0x24a/0x5b0 [amdgpu]
>
> Signed-off-by: Philip Yang <Philip.Yang at amd.com>

Mhm, this is quite a hack, because our init and fini sequence is still not
ideal. Please add a code comment explaining why we do this.

With that done the patch is Reviewed-by: Christian König 
<christian.koenig at amd.com>.

Thanks,
Christian.

> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> index 4570ad449390..ae924db72b62 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> @@ -688,11 +688,11 @@ int amdgpu_bo_create_vm(struct amdgpu_device *adev,
>   	 * num of amdgpu_vm_pt entries.
>   	 */
>   	BUG_ON(bp->bo_ptr_size < sizeof(struct amdgpu_bo_vm));
> -	bp->destroy = &amdgpu_bo_vm_destroy;
>   	r = amdgpu_bo_create(adev, bp, &bo_ptr);
>   	if (r)
>   		return r;
>   
> +	bo_ptr->tbo.destroy = &amdgpu_bo_vm_destroy;
>   	*vmbo_ptr = to_amdgpu_bo_vm(bo_ptr);
>   	INIT_LIST_HEAD(&(*vmbo_ptr)->shadow_list);
>   	return r;
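
For illustration only, and not part of the patch: a minimal user-space sketch
of the ordering problem the diff addresses. Every name here (base_bo, vm_bo,
base_create, vm_create, and the call counters) is a hypothetical stand-in,
not the real amdgpu or TTM API. It assumes, as the backtrace suggests, that
the generic create path releases a half-built object through its destroy
callback when creation fails.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel types; not the real amdgpu/TTM API. */
struct list_head { struct list_head *next, *prev; };

struct base_bo {
	void (*destroy)(struct base_bo *bo);
};

struct vm_bo {
	struct base_bo base;          /* first member, so a base pointer can be cast */
	struct list_head shadow_list; /* only valid once vm_create() has succeeded */
};

static int base_destroy_calls, vm_destroy_calls;

/* Default destroy: knows nothing about the derived object. */
static void base_destroy(struct base_bo *bo)
{
	base_destroy_calls++;
	free(bo);
}

/* Derived destroy: requires an initialized shadow_list. */
static void vm_destroy(struct base_bo *bo)
{
	struct vm_bo *vmbo = (struct vm_bo *)bo;

	/* On a zeroed, uninitialized list this is where the NULL access would be. */
	assert(vmbo->shadow_list.next != NULL);
	vmbo->shadow_list.prev->next = vmbo->shadow_list.next;
	vmbo->shadow_list.next->prev = vmbo->shadow_list.prev;
	vm_destroy_calls++;
	free(vmbo);
}

/* Generic create: on failure the object is released via its destroy callback. */
static int base_create(size_t size, int fail, struct base_bo **out)
{
	struct base_bo *bo = calloc(1, size);

	if (!bo)
		return -1;
	bo->destroy = base_destroy;
	if (fail) {
		bo->destroy(bo);
		return -12; /* stand-in for -ENOMEM */
	}
	*out = bo;
	return 0;
}

static int vm_create(int fail, struct vm_bo **out)
{
	struct base_bo *bo;
	int r = base_create(sizeof(struct vm_bo), fail, &bo);

	if (r)
		return r;

	/*
	 * Switch to the derived destroy callback only after creation succeeded.
	 * If it were set earlier, a failed create would run vm_destroy() on an
	 * object whose shadow_list has not been initialized yet.
	 */
	bo->destroy = vm_destroy;
	*out = (struct vm_bo *)bo;
	(*out)->shadow_list.next = &(*out)->shadow_list;
	(*out)->shadow_list.prev = &(*out)->shadow_list;
	return 0;
}
```

Calling vm_create(1, &vmbo) in this sketch takes the failure path and ends in
base_destroy(), which never touches shadow_list. Calling vm_create(0, &vmbo)
returns an object whose destroy callback is vm_destroy() and whose list head
is already initialized.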


