[RFC] drm/xe: Ensure write lock is held when calling xe_vma_rebind()

Wed Jun 5 16:44:36 UTC 2024

On Wed, Jun 05, 2024 at 12:16:07PM +0200, Nirmoy Das wrote:
> xe_vma_rebind() expects write vm lock is held so start out with a
> read lock and upgrade if a vma rebind is required.
> 
> Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/1894
> Fixes: bf69918b7199 ("drm/xe: Use xe_vma_ops to implement page fault rebinds")
> Cc: Matthew Brost <matthew.brost at intel.com>
> Cc: Oak Zeng <oak.zeng at intel.com>
> Signed-off-by: Nirmoy Das <nirmoy.das at intel.com>
> ---
>  drivers/gpu/drm/xe/xe_gt_pagefault.c | 25 +++++++++----------------
>  1 file changed, 9 insertions(+), 16 deletions(-)
> 
> diff --git a/drivers/gpu/drm/xe/xe_gt_pagefault.c b/drivers/gpu/drm/xe/xe_gt_pagefault.c
> index 040dd142c49c..543cfa9892f6 100644
> --- a/drivers/gpu/drm/xe/xe_gt_pagefault.c
> +++ b/drivers/gpu/drm/xe/xe_gt_pagefault.c
> @@ -153,24 +153,15 @@ static int handle_pagefault(struct xe_gt *gt, struct pagefault *pf)
>  		return -EINVAL;
>  
>  retry_userptr:
> -	/*
> -	 * TODO: Avoid exclusive lock if VM doesn't have userptrs, or
> -	 * start out read-locked?
> -	 */
> -	down_write(&vm->lock);
> -	write_locked = true;
> +	/* start out with read-locked and upgrade of vma is needed a rebind */
> +	down_read(&vm->lock);
> +	write_locked = false;
>  	vma = lookup_vma(vm, pf->page_addr);
>  	if (!vma) {
>  		ret = -EINVAL;
>  		goto unlock_vm;
>  	}
>  
> -	if (!xe_vma_is_userptr(vma) ||
> -	    !xe_vma_userptr_check_repin(to_userptr_vma(vma))) {
> -		downgrade_write(&vm->lock);
> -		write_locked = false;
> -	}
> -
>  	trace_xe_vma_pagefault(vma);
>  
>  	atomic = access_is_atomic(pf->access_type);
> @@ -179,9 +170,14 @@ static int handle_pagefault(struct xe_gt *gt, struct pagefault *pf)
>  	if (vma_is_valid(tile, vma) && !atomic)
>  		goto unlock_vm;
>  
> +	/* xe_vma_rebind() expects a write lock so upgrade the vm lock */
> +	up_read(&vm->lock);
> +	down_write(&vm->lock);

You can't do this, or even if it works it is bad, bad practice to drop
the lock as someone else could come in and acquire the lock in write
mode. In this case, a user unbind of the VMA could happen and the VMA
could be destoryed. Now the VMA refs below are accessing corrupt memory.

I have on going refactor [1] here which just takes the vm->lock in write
mode for page faults. I suggest we just review / use that now.

Long term I think we need to audit the read / write usage of vm->lock as
I'm almost certain we have broken the intended usage of this lock. Or
honestly maybe just change this lock to a mutex as I'm unsure if the
read vs. write buys us anything and complicates the driver.

Matt

[1] https://patchwork.freedesktop.org/series/133628/

> +	write_locked = true;
>  	/* TODO: Validate fault */
>  
> -	if (xe_vma_is_userptr(vma) && write_locked) {
> +	if (xe_vma_is_userptr(vma) &&
> +	    xe_vma_userptr_check_repin(to_userptr_vma(vma))) {
>  		struct xe_userptr_vma *uvma = to_userptr_vma(vma);
>  
>  		spin_lock(&vm->userptr.invalidated_lock);
> @@ -191,9 +187,6 @@ static int handle_pagefault(struct xe_gt *gt, struct pagefault *pf)
>  		ret = xe_vma_userptr_pin_pages(uvma);
>  		if (ret)
>  			goto unlock_vm;
> -
> -		downgrade_write(&vm->lock);
> -		write_locked = false;
>  	}
>  
>  	/* Lock VM and BOs dma-resv */
> -- 
> 2.42.0
>