[PATCH v2] drm/amdkfd: not restore userptr buffer if kfd process has been removed

Russell, Kent Kent.Russell at amd.com
Fri Oct 4 13:52:55 UTC 2024


[AMD Official Use Only - AMD Internal Distribution Only]

> -----Original Message-----
> From: amd-gfx <amd-gfx-bounces at lists.freedesktop.org> On Behalf Of Chen,
> Xiaogang
> Sent: Thursday, October 3, 2024 6:11 PM
> To: Kuehling, Felix <Felix.Kuehling at amd.com>; amd-gfx at lists.freedesktop.org
> Cc: Yang, Philip <Philip.Yang at amd.com>
> Subject: Re: [PATCH v2] drm/amdkfd: not restore userptr buffer if kfd process has
> been removed
>
>
> On 10/3/2024 4:11 PM, Felix Kuehling wrote:
> >
> > On 2024-10-03 16:55, Xiaogang.Chen wrote:
> >> From: Xiaogang Chen <xiaogang.chen at amd.com>
> >>
> >> When kfd process has been terminated not restore userptr buffer after
> >> mmu notifier invalidates a range.
> >>
> >> Signed-off-by: Xiaogang Chen<Xiaogang.Chen at amd.com>
> >> ---
> >>   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 12 ++++++++----
> >>   1 file changed, 8 insertions(+), 4 deletions(-)
> >>
> >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> >> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> >> index ce5ca304dba9..1df0926b63b3 100644
> >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> >> @@ -2524,11 +2524,15 @@ int amdgpu_amdkfd_evict_userptr(struct
> >> mmu_interval_notifier *mni,
> >>           /* First eviction, stop the queues */
> >>           r = kgd2kfd_quiesce_mm(mni->mm,
> >>                          KFD_QUEUE_EVICTION_TRIGGER_USERPTR);
> >> -        if (r)
> >> +
> >> +        if (r && r != -ESRCH)
> >>               pr_err("Failed to quiesce KFD\n");
> >> -        queue_delayed_work(system_freezable_wq,
> >> -            &process_info->restore_userptr_work,
> >> -            msecs_to_jiffies(AMDGPU_USERPTR_RESTORE_DELAY_MS));
> >> +
> >> +        if (!r || r != -ESRCH) {
> >
> > This condition is always true.
> >
> so sure why this condition is always true?  kgd2kfd_quiesce_mm can
> return -ESRCH when it cannot find kfd process correspondent to mni->mm,
> then above checking will be false, then will not queue restore work item
> into system_freezable_wq.

If you expand the 2 conditions, it becomes "if (r !=0 || r != -3)", which will always be true for any value of r.

 Kent

>
> Regards
>
> Xiaogang
>
> > Regards,
> >   Felix
> >
> >
> >> + queue_delayed_work(system_freezable_wq,
> >> +                &process_info->restore_userptr_work,
> >> + msecs_to_jiffies(AMDGPU_USERPTR_RESTORE_DELAY_MS));
> >> +        }
> >>       }
> >>       mutex_unlock(&process_info->notifier_lock);


More information about the amd-gfx mailing list