[PATCH] amd/amdkfd: Trigger segfault for early userptr unmmapping

Xiao, Shane shane.xiao at amd.com
Thu Apr 24 05:59:49 UTC 2025


[Public]

> -----Original Message-----
> From: Koenig, Christian <Christian.Koenig at amd.com>
> Sent: Wednesday, April 23, 2025 8:40 PM
> To: Xiao, Shane <shane.xiao at amd.com>; amd-gfx at lists.freedesktop.org;
> Kuehling, Felix <Felix.Kuehling at amd.com>; Yang, Philip
> <Philip.Yang at amd.com>
> Subject: Re: [PATCH] amd/amdkfd: Trigger segfault for early userptr
> unmmapping
>
>
>
> On 4/23/25 11:50, Shane Xiao wrote:
> > If applications unmap the memory before destroying the userptr, it
> > needs trigger a segfault to notify user space to correct the free
> > sequence in VM debug mode.
> >
> > Signed-off-by: Shane Xiao <shane.xiao at amd.com>
> > ---
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 10 ++++++++++
> >  1 file changed, 10 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> > index d2ec4130a316..259b38424b7f 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> > @@ -2559,6 +2559,16 @@ static int update_invalid_user_pages(struct amdkfd_process_info *process_info,
> >                     if (ret != -EFAULT)
> >                             return ret;
> >
> > +                   /* If applications unmaps memory before destroying the userptr
> > +                    * from the KFD, trigger a segmentation fault in VM debug mode.
> > +                    */
> > +                   if (amdgpu_ttm_adev(bo->tbo.bdev)->debug_vm) {
>
> Using debug_vm works for now, but maybe we should have a separate debug
> flag for this.

I have added a new debug_vm_userptr bit in the new patch series.

>
> > +                           amdgpu_ttm_tt_get_userptr(&bo->tbo, userptr);
> > +                           pr_err("User space unmap memory before destroying a userptr that refers to it\n");
> > +                           pr_err("The unmap userptr address is 0x%llx\n", userptr);
> > +                           send_sig(SIGSEGV, get_pid_task(process_info->pid, PIDTYPE_PID), 0);
>
> Drivers should *never* mess with send_sig() directly. We made the mistake to
> allow that with the KFD already.
>
> We should rather send this as GPU access fault or something like that.

Sure, this is already done in the new patch.

Best regards,
Shane

>
> Regards,
> Christian.
>
> > +                   }
> > +
> >                     ret = 0;
> >             }
> >


