[PATCH] drm/amdgpu: Fix a use-after-free

Christian König ckoenig.leichtzumerken at gmail.com
Tue May 18 13:13:09 UTC 2021


On 18.05.21 at 05:00, xinhui pan wrote:
> Looks like we forgot to set ttm->sg to NULL after freeing it.
> Hit the panic below.
>
> [ 1235.844104] general protection fault, probably for non-canonical address 0x6b6b6b6b6b6b7b4b: 0000 [#1] SMP DEBUG_PAGEALLOC NOPTI
> [ 1235.862186] CPU: 5 PID: 25180 Comm: kfdtest Tainted: G        W         5.11.0+ #114
> [ 1235.870633] Hardware name: System manufacturer System Product Name/PRIME Z390-A, BIOS 1401 11/26/2019
> [ 1235.880689] RIP: 0010:__sg_free_table+0x55/0x90
> [ 1235.885654] Code: 39 c6 77 1c 41 c7 46 0c 00 00 00 00 85 d2 74 46 49 c7 06 00 00 00 00 5b 41 5c 41 5d 41 5e 5d c3 8d 48 ff 49 89 c8 48 c1 e1 05 <48> 8b 1c 0f 44 29 c6 41 89 76 0c 48 83 e3 f8
> [ 1235.906084] RSP: 0000:ffffad1c430cfbd0 EFLAGS: 00010202
> [ 1235.911671] RAX: 0000000000000080 RBX: ffff93e266d2e6d8 RCX: 0000000000000fe0
> [ 1235.919393] RDX: 0000000000000000 RSI: 00000000a56b6b6b RDI: 6b6b6b6b6b6b6b6b
> [ 1235.927190] RBP: ffffad1c430cfbf0 R08: 000000000000007f R09: 0000000000000001
> [ 1235.934970] R10: 0000000000000001 R11: 0000000000000001 R12: 0000000000000080
> [ 1235.942766] R13: ffffffff9e7fe9f0 R14: ffff93e20c3488b0 R15: ffff93e270bc8b20
> [ 1235.950563] FS:  00007f5013ca63c0(0000) GS:ffff93f075200000(0000) knlGS:0000000000000000
> [ 1235.959404] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 1235.965683] CR2: 00007ff44b08faff CR3: 000000020f84e002 CR4: 00000000003706e0
> [ 1235.973472] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 1235.981269] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [ 1235.989074] Call Trace:
> [ 1235.991751]  sg_free_table+0x17/0x20
> [ 1235.995667]  amdgpu_ttm_backend_unbind.cold+0x4d/0xf7 [amdgpu]
> [ 1236.002288]  amdgpu_ttm_backend_destroy+0x29/0x130 [amdgpu]
> [ 1236.008464]  ttm_tt_destroy+0x1e/0x30 [ttm]
> [ 1236.013066]  ttm_bo_cleanup_memtype_use+0x51/0xa0 [ttm]
> [ 1236.018783]  ttm_bo_release+0x262/0xa50 [ttm]
> [ 1236.023547]  ttm_bo_put+0x82/0xd0 [ttm]
> [ 1236.027766]  amdgpu_bo_unref+0x26/0x50 [amdgpu]
> [ 1236.032809]  amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu+0x7aa/0xd90 [amdgpu]
> [ 1236.040400]  kfd_ioctl_alloc_memory_of_gpu+0xe2/0x330 [amdgpu]
> [ 1236.046912]  kfd_ioctl+0x463/0x690 [amdgpu]
> [ 1236.051632]  ? kfd_dev_is_large_bar+0xf0/0xf0 [amdgpu]
> [ 1236.057360]  __x64_sys_ioctl+0x91/0xc0
> [ 1236.061457]  do_syscall_64+0x38/0x90
> [ 1236.065383]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> [ 1236.070920] RIP: 0033:0x7f5013dbe50b
>
> Signed-off-by: xinhui pan <xinhui.pan at amd.com>

Maybe shorten the backtrace a bit. The register and absolute address 
information is completely useless unless you have the exact same kernel 
binary.

Apart from that, the patch is Reviewed-by: Christian König 
<christian.koenig at amd.com>

> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 1 +
>   1 file changed, 1 insertion(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> index 89cd93b24404..754f9847497d 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> @@ -1158,6 +1158,7 @@ static void amdgpu_ttm_tt_unpopulate(struct ttm_device *bdev,
>   	if (gtt && gtt->userptr) {
>   		amdgpu_ttm_tt_set_user_pages(ttm, NULL);
>   		kfree(ttm->sg);
> +		ttm->sg = NULL;
>   		ttm->page_flags &= ~TTM_PAGE_FLAG_SG;
>   		return;
>   	}


