[RFC] drm/xe: Ensure write lock is held when calling xe_vma_rebind()

Matthew Brost matthew.brost at intel.com
Thu Jun 6 16:30:17 UTC 2024


On Thu, Jun 06, 2024 at 12:09:39PM +0200, Nirmoy Das wrote:
> Hi Matt,
> 
> On 6/5/2024 6:44 PM, Matthew Brost wrote:
> > On Wed, Jun 05, 2024 at 12:16:07PM +0200, Nirmoy Das wrote:
> > > xe_vma_rebind() expects the vm lock to be held in write mode, so start
> > > out with a read lock and upgrade if a vma rebind is required.
> > > 
> > > Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/1894
> > > Fixes: bf69918b7199 ("drm/xe: Use xe_vma_ops to implement page fault rebinds")
> > > Cc: Matthew Brost <matthew.brost at intel.com>
> > > Cc: Oak Zeng <oak.zeng at intel.com>
> > > Signed-off-by: Nirmoy Das <nirmoy.das at intel.com>
> > > ---
> > >   drivers/gpu/drm/xe/xe_gt_pagefault.c | 25 +++++++++----------------
> > >   1 file changed, 9 insertions(+), 16 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/xe/xe_gt_pagefault.c b/drivers/gpu/drm/xe/xe_gt_pagefault.c
> > > index 040dd142c49c..543cfa9892f6 100644
> > > --- a/drivers/gpu/drm/xe/xe_gt_pagefault.c
> > > +++ b/drivers/gpu/drm/xe/xe_gt_pagefault.c
> > > @@ -153,24 +153,15 @@ static int handle_pagefault(struct xe_gt *gt, struct pagefault *pf)
> > >   		return -EINVAL;
> > >   retry_userptr:
> > > -	/*
> > > -	 * TODO: Avoid exclusive lock if VM doesn't have userptrs, or
> > > -	 * start out read-locked?
> > > -	 */
> > > -	down_write(&vm->lock);
> > > -	write_locked = true;
> > > +	/* start out read-locked and upgrade to a write lock if a vma rebind is needed */
> > > +	down_read(&vm->lock);
> > > +	write_locked = false;
> > >   	vma = lookup_vma(vm, pf->page_addr);
> > >   	if (!vma) {
> > >   		ret = -EINVAL;
> > >   		goto unlock_vm;
> > >   	}
> > > -	if (!xe_vma_is_userptr(vma) ||
> > > -	    !xe_vma_userptr_check_repin(to_userptr_vma(vma))) {
> > > -		downgrade_write(&vm->lock);
> > > -		write_locked = false;
> > > -	}
> > > -
> > >   	trace_xe_vma_pagefault(vma);
> > >   	atomic = access_is_atomic(pf->access_type);
> > > @@ -179,9 +170,14 @@ static int handle_pagefault(struct xe_gt *gt, struct pagefault *pf)
> > >   	if (vma_is_valid(tile, vma) && !atomic)
> > >   		goto unlock_vm;
> > > +	/* xe_vma_rebind() expects a write lock so upgrade the vm lock */
> > > +	up_read(&vm->lock);
> > > +	down_write(&vm->lock);
> > You can't do this; even if it works, it is bad practice to drop the
> > lock, as someone else could come in and acquire it in write mode in
> > between. In that window, a user unbind of the VMA could happen and the
> > VMA could be destroyed.
> 
> I didn't verify, but assumed the vma is refcounted; anyway, I knew this
> wasn't correct
> 

VMAs are not refcounted, which indeed is very tricky and exposes us to a
bunch of possible bugs. The current scheme is: as long as the VMA is in
the GPUVM tree, it is safe to use while vm->lock is held. Once the VMA
is removed from the GPUVM tree, it can only be referenced by the pending
destroy code.
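
To make the hazard concrete, here is a minimal sketch of the broken
pattern from the diff above. This is illustrative pseudo-kernel code,
not the actual xe implementation: lookup_vma() and the vm->lock rwsem
are as in the driver, but the function name and the rest are made up.

#include <linux/rwsem.h>

/*
 * Sketch only: up_read() followed by down_write() is a
 * drop-and-reacquire, not an atomic lock upgrade.
 */
static int fault_lock_upgrade_racy(struct xe_vm *vm, u64 page_addr)
{
	struct xe_vma *vma;

	down_read(&vm->lock);
	vma = lookup_vma(vm, page_addr);	/* safe: vm->lock is held */
	if (!vma) {
		up_read(&vm->lock);
		return -EINVAL;
	}
	up_read(&vm->lock);

	/*
	 * RACE WINDOW: a concurrent user unbind can take vm->lock in
	 * write mode here, remove the VMA from the GPUVM tree, and
	 * destroy it. With no refcount, nothing keeps the memory alive.
	 */

	down_write(&vm->lock);
	/* vma may now point at freed memory -- use-after-free. */
	up_write(&vm->lock);
	return 0;
}

The refactor mentioned below just takes vm->lock in write mode for the
whole fault handler, which avoids the window entirely.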

I did post a patch to refcount VMAs a while back that got lost in the
stack [1].
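
Roughly, the idea there was along these lines (a sketch from memory,
not the code from [1]; the helper names and the kref placement are
illustrative only):

#include <linux/kref.h>
#include <linux/slab.h>

/*
 * A refcount embedded in the VMA would let a fault handler hold a
 * reference across a lock drop, so a racing unbind could no longer
 * free the memory out from under it.
 */
struct xe_vma {
	struct kref refcount;
	/* ... existing xe_vma fields ... */
};

static void xe_vma_destroy_final(struct kref *ref)
{
	struct xe_vma *vma = container_of(ref, struct xe_vma, refcount);

	kfree(vma);
}

static inline struct xe_vma *xe_vma_get(struct xe_vma *vma)
{
	kref_get(&vma->refcount);
	return vma;
}

static inline void xe_vma_put(struct xe_vma *vma)
{
	kref_put(&vma->refcount, xe_vma_destroy_final);
}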

Matt

[1] https://patchwork.freedesktop.org/series/132573/

> so I tagged it RFC :)
> 
> > Now the VMA references below would be accessing freed memory.
> > 
> > I have an ongoing refactor [1] which just takes the vm->lock in write
> > mode for page faults. I suggest we review / use that now.
> I will do that.
> > 
> > Long term I think we need to audit the read / write usage of vm->lock,
> > as I'm almost certain we have broken the intended usage of this lock.
> > Or honestly, maybe just change this lock to a mutex: I'm not sure the
> > read vs. write split buys us anything, and it complicates the driver.
> 
> Yes, that might be simpler.
> 
> 
> Thanks,
> 
> Nirmoy
> 
> > 
> > Matt
> > 
> > [1] https://patchwork.freedesktop.org/series/133628/
> > 
> > > +	write_locked = true;
> > >   	/* TODO: Validate fault */
> > > -	if (xe_vma_is_userptr(vma) && write_locked) {
> > > +	if (xe_vma_is_userptr(vma) &&
> > > +	    xe_vma_userptr_check_repin(to_userptr_vma(vma))) {
> > >   		struct xe_userptr_vma *uvma = to_userptr_vma(vma);
> > >   		spin_lock(&vm->userptr.invalidated_lock);
> > > @@ -191,9 +187,6 @@ static int handle_pagefault(struct xe_gt *gt, struct pagefault *pf)
> > >   		ret = xe_vma_userptr_pin_pages(uvma);
> > >   		if (ret)
> > >   			goto unlock_vm;
> > > -
> > > -		downgrade_write(&vm->lock);
> > > -		write_locked = false;
> > >   	}
> > >   	/* Lock VM and BOs dma-resv */
> > > -- 
> > > 2.42.0
> > > 

