[PATCH] drm/amdkfd: Remove skipping userptr buffer mapping when mmu notifier marks it as invalid
Alex Deucher
alexdeucher at gmail.com
Wed May 10 15:02:29 UTC 2023
On Wed, May 10, 2023 at 11:00 AM Felix Kuehling <felix.kuehling at amd.com> wrote:
>
> Am 2023-05-09 um 18:17 schrieb Alex Deucher:
> > From: Xiaogang Chen <xiaogang.chen at amd.com>
> >
> > The MMU notifier does not always hold mm->sem during its callback. That causes
> > a race condition between KFD userptr buffer mapping and the MMU notifier,
> > which leads to a GPU shader or SDMA accessing a userptr buffer before it has
> > been mapped to the GPU VM. Always map the userptr buffer to avoid that, though
> > it may cause some userptr buffers to be mapped twice.
> >
> > Suggested-by: Felix Kuehling <Felix.Kuehling at amd.com>
> > Signed-off-by: Xiaogang Chen <xiaogang.chen at amd.com>
> > Reviewed-by: Felix Kuehling <Felix.Kuehling at amd.com>
> > Signed-off-by: Alex Deucher <alexander.deucher at amd.com>
>
> This patch is no longer needed and should not be applied. It was
> originally applied to amd-staging-drm-next as patch
> fcf00f8d29f2fc6bf00531a1447be28b99073cc3 in November 2022. This fixed a
> race condition due to incorrect assumptions about the mmap lock and MMU
> notifiers. This hunk was added back by my later patch f95f51a4c335
> ("drm/amdgpu: Add notifier lock for KFD userptrs") in December, using
> our own notifier lock that doesn't suffer from those races.
>
Thanks. Dropped.
Alex
> Regards,
> Felix
>
>
> > ---
> > drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 10 ----------
> > 1 file changed, 10 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> > index 58a774647573..40078c0a5585 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> > @@ -1942,16 +1942,6 @@ int amdgpu_amdkfd_gpuvm_map_memory_to_gpu(
> >  	 */
> >  	mutex_lock(&mem->process_info->lock);
> >
> > -	/* Lock notifier lock. If we find an invalid userptr BO, we can be
> > -	 * sure that the MMU notifier is no longer running
> > -	 * concurrently and the queues are actually stopped
> > -	 */
> > -	if (amdgpu_ttm_tt_get_usermm(bo->tbo.ttm)) {
> > -		mutex_lock(&mem->process_info->notifier_lock);
> > -		is_invalid_userptr = !!mem->invalid;
> > -		mutex_unlock(&mem->process_info->notifier_lock);
> > -	}
> > -
> >  	mutex_lock(&mem->lock);
> >
> >  	domain = mem->domain;