[Intel-xe] [PATCH] drm/xe: Enable scratch page when page fault is enabled

Zeng, Oak oak.zeng at intel.com
Tue Jul 11 02:11:59 UTC 2023


Hi Bruce,

Why do we want to enable the scratch page on the xe driver? Mapping the whole process address space to a scratch page, or mapping an invalid (aka out-of-bound) GPU virtual address to a scratch page, can hide application programming bugs. It can even hide KMD or GPU HW bugs. My philosophy is: when there is a bug, fail the application immediately and let the user know about the problem right away. Hiding issues fails us in the long run.
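
To make this concrete, consider a hypothetical app with an off-by-one
bug in the code that walks one of its GPU buffers (illustrative C only,
not taken from any real workload):

    /* buf points at a GPU allocation sized for n floats */
    for (int i = 0; i <= n; i++)    /* BUG: should be i < n */
        buf[i] = 0.0f;              /* buf[n] writes out of bound */

Without a scratch page, the access to buf[n] raises an unresolvable page
fault and the app fails right at the offending address, so the bug gets
found. With a scratch page, the write silently lands in the scratch
mapping, the app keeps running, and the corruption surfaces much later,
far away from the root cause, if it surfaces at all.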

On i915, this feature has caused us a lot of trouble. Once it is enabled, applications can depend on it, so it is very hard to remove. As you know, our plan is to disable the scratch page on i915, so I don't understand why we would want to enable this "feature" on xe in the first place.
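
For reference, the user-visible effect of the patch below is that both
create flags can now be set on a single VM, which the ioctl used to
reject. Roughly (a sketch based on my reading of the xe uAPI, untested;
it assumes fd is an open xe render node and includes drm/xe_drm.h and
sys/ioctl.h):

    struct drm_xe_vm_create create = {
        .flags = DRM_XE_VM_CREATE_SCRATCH_PAGE |
                 DRM_XE_VM_CREATE_FAULT_MODE,
    };

    /* used to fail with -EINVAL; with this patch it creates a faulting
     * VM whose invalid accesses get backed by a scratch vma on demand
     */
    int err = ioctl(fd, DRM_IOCTL_XE_VM_CREATE, &create);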

Thanks,
Oak

> -----Original Message-----
> From: Chang, Yu bruce <yu.bruce.chang at intel.com>
> Sent: July 10, 2023 6:07 PM
> To: intel-xe at lists.freedesktop.org
> Cc: Chang, Yu bruce <yu.bruce.chang at intel.com>; Zeng, Oak
> <oak.zeng at intel.com>; Welty, Brian <brian.welty at intel.com>; Vishwanathapura,
> Niranjana <niranjana.vishwanathapura at intel.com>; Summers, Stuart
> <stuart.summers at intel.com>; Brost, Matthew <matthew.brost at intel.com>
> Subject: [PATCH] drm/xe: Enable scratch page when page fault is enabled
> 
> This is just a proposal and is not fully tested yet. Once there is
> agreement on the approach, more effort is needed to make it fully
> working, including igt test changes.
> 
> i915 can use the scratch page even when page fault is enabled, and
> this patch tries to port that feature over.
> 
> The current i915 solution modifies the page tables directly, which may
> be hard to upstream, so following the existing i915 solution would
> require a more complex implementation to fit the current Xe framework.
> 
> This patch tries to have minimal impact on the existing driver while
> still enabling scratch page support.
> 
> So, the idea is to bind a scratch vma if the page fault comes from an
> invalid access. The patch takes advantage of the null pte for now; we
> may introduce a dedicated vma type for the scratch vma if needed. After
> the bind, the user app can continue to run without hitting a fatal
> failure or a reset-and-stop.
> 
> If the app later binds a valid address over this scratch vma, that
> bind will fail; this patch handles the failure by unbinding the
> scratch vma[s] so that the user bind can be retried against the now
> valid address.
> 
> This patch only kicks in on the failure paths of both page fault
> handling and bind, so it should have no impact on the existing code
> paths. On the other hand, it uses the actual page tables instead of
> special scratch page tables, which potentially allows skipping the TLB
> invalidation on unbind as long as all the upper level page tables are
> still in use.
> 
> Cc: Oak Zeng <oak.zeng at intel.com>
> Cc: Brian Welty <brian.welty at intel.com>
> Cc: Niranjana Vishwanathapura <niranjana.vishwanathapura at intel.com>
> Cc: Stuart Summers <stuart.summers at intel.com>
> Cc: Matthew Brost <matthew.brost at intel.com>
> Signed-off-by: Bruce Chang <yu.bruce.chang at intel.com>
> ---
>  drivers/gpu/drm/xe/xe_gt_pagefault.c | 14 ++++++++++++--
>  drivers/gpu/drm/xe/xe_vm.c           | 52 ++++++++++++++++++++++++----
>  drivers/gpu/drm/xe/xe_vm.h           |  2 ++
>  3 files changed, 59 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/gpu/drm/xe/xe_gt_pagefault.c
> b/drivers/gpu/drm/xe/xe_gt_pagefault.c
> index 6faebd02f3fb..01d316baf0e4 100644
> --- a/drivers/gpu/drm/xe/xe_gt_pagefault.c
> +++ b/drivers/gpu/drm/xe/xe_gt_pagefault.c
> @@ -138,8 +138,18 @@ static int handle_pagefault(struct xe_gt *gt, struct
> pagefault *pf)
>  	write_locked = true;
>  	vma = lookup_vma(vm, pf->page_addr);
>  	if (!vma) {
> -		ret = -EINVAL;
> -		goto unlock_vm;
> +		if (vm->flags & XE_VM_FLAG_SCRATCH_PAGE)
> +			ret = xe_bind_scratch_vma(vm, pf->page_addr, SZ_64K);
> +		else
> +			ret = -EINVAL;
> +		if (ret)
> +			goto unlock_vm;
> +		/* look up the new scratch vma so the code below binds it */
> +		vma = lookup_vma(vm, pf->page_addr);
> +		if (!vma) {
> +			ret = -EINVAL;
> +			goto unlock_vm;
> +		}
>  	}
> 
>  	if (!xe_vma_is_userptr(vma) || !xe_vma_userptr_check_repin(vma)) {
> diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
> index dbff75a9087b..dc2d0cdfb0dc 100644
> --- a/drivers/gpu/drm/xe/xe_vm.c
> +++ b/drivers/gpu/drm/xe/xe_vm.c
> @@ -1241,7 +1241,8 @@ struct xe_vm *xe_vm_create(struct xe_device *xe, u32
> flags)
>  		}
>  	}
> 
> -	if (flags & XE_VM_FLAG_SCRATCH_PAGE) {
> +	if (flags & XE_VM_FLAG_SCRATCH_PAGE &&
> +	    (!(flags & XE_VM_FLAG_FAULT_MODE))) {
>  		for_each_tile(tile, xe, id) {
>  			if (!vm->pt_root[id])
>  				continue;
> @@ -1931,10 +1932,6 @@ int xe_vm_create_ioctl(struct drm_device *dev, void
> *data,
>  	if (XE_IOCTL_ERR(xe, args->flags & ~ALL_DRM_XE_VM_CREATE_FLAGS))
>  		return -EINVAL;
> 
> -	if (XE_IOCTL_ERR(xe, args->flags &
> DRM_XE_VM_CREATE_SCRATCH_PAGE &&
> -			 args->flags & DRM_XE_VM_CREATE_FAULT_MODE))
> -		return -EINVAL;
> -
>  	if (XE_IOCTL_ERR(xe, args->flags &
> DRM_XE_VM_CREATE_COMPUTE_MODE &&
>  			 args->flags & DRM_XE_VM_CREATE_FAULT_MODE))
>  		return -EINVAL;
> @@ -2603,6 +2600,44 @@ static int prep_replacement_vma(struct xe_vm *vm,
> struct xe_vma *vma)
>  	return 0;
>  }
> 
> +int xe_bind_scratch_vma(struct xe_vm *vm, u64 addr, u64 size)
> +{
> +	struct xe_vma *vma;
> +
> +	if (!vm->size)
> +		return -ENOMEM;
> +
> +	vma = xe_vma_create(vm, NULL, 0, addr, addr + size - 1, false, true, 0);
> +	if (!vma)
> +		return -ENOMEM;
> +	xe_vm_insert_vma(vm, vma);
> +
> +	/*
> +	 * The actual bind is performed by the page fault handler.
> +	 */
> +
> +	return 0;
> +}
> +
> +int xe_unbind_scratch_vma(struct xe_vm *vm, u64 addr, u64 range)
> +{
> +	struct xe_vma *vma, lookup;
> +
> +	lookup.start = addr;
> +	lookup.end = addr + range - 1;
> +
> +	vma = xe_vm_find_overlapping_vma(vm, &lookup);
> +	if (!vma)
> +		return -ENXIO;
> +
> +	if (!xe_vma_is_null(vma))
> +		return -EINVAL;
> +	prep_vma_destroy(vm, vma);
> +	xe_vm_unbind(vm, vma, NULL, NULL, 0, NULL);
> +
> +	return 0;
> +}
> +
>  /*
>   * Find all overlapping VMAs in lookup range and add to a list in the returned
>   * VMA, all of VMAs found will be unbound. Also possibly add 2 new VMAs that
> @@ -3220,10 +3255,13 @@ int xe_vm_bind_ioctl(struct drm_device *dev, void
> *data, struct drm_file *file)
>  		u64 range = bind_ops[i].range;
>  		u64 addr = bind_ops[i].addr;
>  		u32 op = bind_ops[i].op;
> -
> +retry:
>  		err = __vm_bind_ioctl_lookup_vma(vm, bos[i], addr, range, op);
> -		if (err)
> +		if (err) {
> +			if (!xe_unbind_scratch_vma(vm, addr, range))
> +				goto retry;
>  			goto release_vm_lock;
> +		}
>  	}
> 
>  	for (i = 0; i < args->num_binds; ++i) {
> diff --git a/drivers/gpu/drm/xe/xe_vm.h b/drivers/gpu/drm/xe/xe_vm.h
> index 5edb7771629c..74f99b38f60d 100644
> --- a/drivers/gpu/drm/xe/xe_vm.h
> +++ b/drivers/gpu/drm/xe/xe_vm.h
> @@ -140,6 +140,8 @@ static inline void xe_vm_queue_rebind_worker(struct
> xe_vm *vm)
>  	queue_work(vm->xe->ordered_wq, &vm->preempt.rebind_work);
>  }
> 
> +int xe_bind_scratch_vma(struct xe_vm *vm, u64 addr, u64 size);
> +
>  /*
>   * XE_ONSTACK_TV is used to size the tv_onstack array that is input
>   * to xe_vm_lock_dma_resv() and xe_vm_unlock_dma_resv().
> --
> 2.25.1


