Reporting a slab-use-after-free in amdgpu

Christian König christian.koenig at amd.com
Mon Apr 29 07:44:04 UTC 2024


Hi guys,

yeah that is a well known issue but actually completely harmless.

What happens is that a trace function accesses a stale pointer to print 
some additional value into the trace log.

That memory might have been reused and the information is now outdated, 
but the worst thing that can happen is that the value in the logs is 
nonsense.
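
To illustrate the pattern described above, here is a minimal user-space sketch in C. It is not the actual amdgpu code or the queued patch; struct resource, struct buffer, move_and_free_old(), trace_move() and bo_move_safe() are hypothetical stand-ins. It only shows one way to avoid reading a stale pointer in a trace call, assuming the value to be logged can be copied before the old object is freed.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for a memory resource; only the traced field. */
struct resource {
    unsigned int mem_type;
};

/* Hypothetical stand-in for a buffer object owning its current resource. */
struct buffer {
    struct resource *res;
};

/* Models the move path: frees the old resource and installs the new one. */
static void move_and_free_old(struct buffer *bo, struct resource *new_res)
{
    free(bo->res);
    bo->res = new_res;
}

/* Models the tracepoint: formats the old and new memory types. */
static int trace_move(char *out, size_t len,
                      unsigned int new_type, unsigned int old_type)
{
    return snprintf(out, len, "move: %u -> %u", old_type, new_type);
}

/*
 * Safe ordering: snapshot old mem_type while the old resource is still
 * valid. The problematic ordering would keep a pointer to the old
 * resource and dereference it in the trace call after the free.
 */
static int bo_move_safe(struct buffer *bo, struct resource *new_res,
                        char *out, size_t len)
{
    unsigned int old_type = bo->res->mem_type;

    move_and_free_old(bo, new_res);
    return trace_move(out, len, new_res->mem_type, old_type);
}
```

With an old resource of mem_type 2 and a new one of mem_type 1, bo_move_safe() writes "move: 2 -> 1" into the output buffer without touching freed memory.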

I have a patch in the queue to fix this; it should be upstream and 
backported in the next few weeks.

Regards,
Christian.

On 29.04.24 at 04:15, Joonkyo Jung wrote:
> Hi,
>
> Thank you for patching two of the bugs we have reported!
> I was just wondering if there's any news on the one other bug we have 
> reported:
> BUG: KASAN: slab-use-after-free in amdgpu_bo_move+0x1479/0x1550.
>
> I see that there is a GitLab issue
> (https://gitlab.freedesktop.org/drm/amd/-/issues/3171) created for
> this bug, and there is also a patch
> (https://lists.freedesktop.org/archives/amd-gfx/2024-March/105680.html)
> that Christian made for this.
> Though, it seems that the issue is not resolved yet, and the patch is 
> not yet pushed to mainstream branches.
> So I was wondering, do you have any plans for pushing this patch? If 
> so, would it be possible for us to get a Reported-by tag on the patch?
>
> Best,
> Joonkyo
>
> On Fri, Mar 8, 2024 at 4:32 PM Joonkyo Jung <joonkyoj at yonsei.ac.kr> wrote:
>
>     Hi Vitaly,
>
>     No worries, thank you for working on the patches!
>
>     I have also confirmed that with the inflight patch, issue No.1
>     (use-after-free) seems to be resolved.
>     However, I have reproduced issue No.3 (slab-use-after-free) even
>     with the patch for issue No.1 applied - if it's the first program
>     tested after reboot.
>     (i.e., if any other bugs are tested before the
>     slab-use-after-free, it does not reproduce).
>
>     Could you check if the bug reproduces in this condition for you too?
>     I will check and see why this is happening and update you if I
>     have something new.
>
>     Thank you!
>
>     Best,
>     Joonkyo
>
>
>
>     On Fri, Mar 8, 2024 at 12:45 PM vitaly prosyak <vprosyak at amd.com>
>     wrote:
>
>         Hi Joonkyo,
>         Sorry for the delay.
>         Yes, sure, I reproduced issue 2 (null-ptr-deref in amdgpu) and
>         I will provide the fix soon.
>         However, issue No. 3 is no longer reproducible if the recent
>         patch inflight is applied which fixes issue No 1.
>
>         Do you see the same behavior?
>
>         Thanks in advance, Vitaly
>
>         On 2024-03-07 20:18, Joonkyo Jung wrote:
>>         Hello,
>>         thank you for patching the first bug we have sent!
>>
>>         Just a quick touch base with you, to ask if there has been
>>         any update on our other two bugs.
>>         They were each sent with emails titled
>>         "Reporting a slab-use-after-free in amdgpu" (this one)
>>         "Reporting a null-ptr-deref in amdgpu".
>>
>>         Thank you!
>>
>>         Best,
>>         Joonkyo
>>
>>
>>         On Fri, Feb 16, 2024 at 6:22 PM Joonkyo Jung
>>         <joonkyoj at yonsei.ac.kr> wrote:
>>
>>             Hello,
>>
>>             We would like to report a slab-use-after-free bug in the
>>             AMDGPU DRM driver in the linux kernel v6.8-rc4 that we
>>             found with our customized Syzkaller.
>>             The bug can be triggered by sending two ioctls to the
>>             AMDGPU DRM driver in succession.
>>
>>             At the start of amdgpu_bo_move, the current resource is
>>             saved as struct ttm_resource *old_mem = bo->resource.
>>             As the alloc and free stacks below show, within the same
>>             call to amdgpu_bo_move, amdgpu_move_blit eventually
>>             frees bo->resource in ttm_bo_move_accel_cleanup via
>>             ttm_bo_wait_free_node(bo, man->use_tt).
>>             amdgpu_bo_move continues after that and, at the end,
>>             reaches trace_amdgpu_bo_move(abo, new_mem->mem_type,
>>             old_mem->mem_type), which reads the already freed
>>             old_mem and triggers the use-after-free.
>>
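>>             Steps to reproduce are as below.
>>             /* fd: assumed to be an open fd for the amdgpu DRM device */
>>             union drm_amdgpu_gem_create *arg1;
>>
>>             arg1 = malloc(sizeof(union drm_amdgpu_gem_create));
>>             /* 1st ioctl: create a small (8-byte) BO in VRAM */
>>             arg1->in.bo_size = 0x8;
>>             arg1->in.alignment = 0x0;
>>             arg1->in.domains = 0x4;      /* AMDGPU_GEM_DOMAIN_VRAM */
>>             arg1->in.domain_flags = 0x9; /* CPU_ACCESS_REQUIRED | VRAM_CLEARED */
>>             ioctl(fd, 0xc0206440, arg1); /* DRM_IOCTL_AMDGPU_GEM_CREATE */
>>
>>             /* 2nd ioctl: request a ~2 GiB BO in VRAM; per the trace
>>                below, this path evicts and moves an existing BO */
>>             arg1->in.bo_size = 0x7fffffff;
>>             arg1->in.alignment = 0x0;
>>             arg1->in.domains = 0x4;
>>             arg1->in.domain_flags = 0x9;
>>             ioctl(fd, 0xc0206440, arg1);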
>>
>>             The KASAN report is as follows:
>>             ==================================================================
>>             BUG: KASAN: slab-use-after-free in
>>             amdgpu_bo_move+0x1479/0x1550
>>             Read of size 4 at addr ffff88800f5bee80 by task
>>             syz-executor/219
>>             Call Trace:
>>              <TASK>
>>              amdgpu_bo_move+0x1479/0x1550
>>              ttm_bo_handle_move_mem+0x4d0/0x700
>>              ttm_mem_evict_first+0x945/0x1230
>>              ttm_bo_mem_space+0x6c7/0x940
>>              ttm_bo_validate+0x286/0x650
>>              ttm_bo_init_reserved+0x34c/0x490
>>              amdgpu_bo_create+0x94b/0x1610
>>              amdgpu_bo_create_user+0xa3/0x130
>>              amdgpu_gem_create_ioctl+0x4bc/0xc10
>>              drm_ioctl_kernel+0x300/0x410
>>              drm_ioctl+0x648/0xb30
>>              amdgpu_drm_ioctl+0xc8/0x160
>>              </TASK>
>>
>>             Allocated by task 219:
>>              kmalloc_trace+0x211/0x390
>>              amdgpu_vram_mgr_new+0x1d6/0xbe0
>>              ttm_resource_alloc+0xfd/0x1e0
>>              ttm_bo_mem_space+0x255/0x940
>>              ttm_bo_validate+0x286/0x650
>>              ttm_bo_init_reserved+0x34c/0x490
>>              amdgpu_bo_create+0x94b/0x1610
>>              amdgpu_bo_create_user+0xa3/0x130
>>              amdgpu_gem_create_ioctl+0x4bc/0xc10
>>              drm_ioctl_kernel+0x300/0x410
>>              drm_ioctl+0x648/0xb30
>>              amdgpu_drm_ioctl+0xc8/0x160
>>
>>             Freed by task 219:
>>              kfree+0x111/0x2d0
>>              ttm_resource_free+0x17e/0x1e0
>>              ttm_bo_move_accel_cleanup+0x77e/0x9b0
>>              amdgpu_move_blit+0x3db/0x670
>>              amdgpu_bo_move+0xfa2/0x1550
>>              ttm_bo_handle_move_mem+0x4d0/0x700
>>              ttm_mem_evict_first+0x945/0x1230
>>              ttm_bo_mem_space+0x6c7/0x940
>>              ttm_bo_validate+0x286/0x650
>>              ttm_bo_init_reserved+0x34c/0x490
>>              amdgpu_bo_create+0x94b/0x1610
>>              amdgpu_bo_create_user+0xa3/0x130
>>              amdgpu_gem_create_ioctl+0x4bc/0xc10
>>              drm_ioctl_kernel+0x300/0x410
>>              drm_ioctl+0x648/0xb30
>>              amdgpu_drm_ioctl+0xc8/0x160
>>
>>             The buggy address belongs to the object at ffff88800f5bee70
>>              which belongs to the cache kmalloc-96 of size 96
>>             The buggy address is located 16 bytes inside of
>>              freed 96-byte region [ffff88800f5bee70, ffff88800f5beed0)
>>
>>             Should you need any more information, please do not
>>             hesitate to contact us.
>>
>>             Best regards,
>>             Joonkyo Jung
>>