[PATCH v4 11/20] drm/xe: Allow CPU address mirror VMA unbind with gpu bindings for madvise

Mon Jun 23 11:45:20 UTC 2025

On Mon, Jun 23, 2025 at 11:48:18AM +0530, Ghimiray, Himal Prasad wrote:
> 
> 
> On 23-06-2025 11:22, Matthew Brost wrote:
> > On Fri, Jun 13, 2025 at 06:25:49PM +0530, Himal Prasad Ghimiray wrote:
> > > In the case of the MADVISE ioctl, if the start or end addresses fall
> > > within a VMA and existing SVM ranges are present, remove the existing
> > > SVM mappings. Then, continue with ops_parse to create new VMAs by REMAP
> > > unmapping of old one.
> > > 
> > > v2 (Matthew Brost)
> > > - Use vops flag to call unmapping of ranges in vm_bind_ioctl_ops_parse
> > > - Rename the function
> > > 
> > > Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray at intel.com>
> > > ---
> > >   drivers/gpu/drm/xe/xe_svm.c | 27 +++++++++++++++++++++++++++
> > >   drivers/gpu/drm/xe/xe_svm.h |  8 ++++++++
> > >   drivers/gpu/drm/xe/xe_vm.c  |  8 ++++++--
> > >   3 files changed, 41 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/xe/xe_svm.c b/drivers/gpu/drm/xe/xe_svm.c
> > > index 19420635f1fa..df6992ee2e2d 100644
> > > --- a/drivers/gpu/drm/xe/xe_svm.c
> > > +++ b/drivers/gpu/drm/xe/xe_svm.c
> > > @@ -935,6 +935,33 @@ bool xe_svm_has_mapping(struct xe_vm *vm, u64 start, u64 end)
> > >   	return drm_gpusvm_has_mapping(&vm->svm.gpusvm, start, end);
> > >   }
> > > +/**
> > > + * xe_svm_unmap_address_range - UNMAP SVM mappings and ranges
> > > + * @start: start addr
> > > + * @end: end addr
> > > + *
> > > + * This function UNMAPS svm ranges if start or end address are inside them.
> > > + */
> > > +void xe_svm_unmap_address_range(struct xe_vm *vm, u64 start, u64 end)
> > > +{
> > > +	struct drm_gpusvm_notifier *notifier, *next;
> > > +
> > > +	lockdep_assert_held_write(&vm->lock);
> > > +
> > > +	drm_gpusvm_for_each_notifier_safe(notifier, next, &vm->svm.gpusvm, start, end) {
> > > +		struct drm_gpusvm_range *range, *__next;
> > > +
> > > +		drm_gpusvm_for_each_range_safe(range, __next, notifier, start, end) {
> > > +			if (start > drm_gpusvm_range_start(range) ||
> > > +			    end < drm_gpusvm_range_end(range)) {
> > > +				if (IS_DGFX(vm->xe) && xe_svm_range_in_vram(to_xe_range(range)))
> > > +					drm_gpusvm_range_evict(&vm->svm.gpusvm, range);
> > 
> > I think you could use xe_svm_range_migrate_to_smem here but also I don't
> > eviction is strickly required here either. This akin to a partial unmap
> > and we don't evict there. Any reason that I'm missing here?
> 
> If previous ranges had devmem pages allocated, and eviction did not occur,
> subsequent VRAM allocations for smaller ranges were failing.
> 
> Scenario:
> 
> A 2 MiB range existed with VRAM allocation.
> A madvise call triggered a split, invoking xe_svm_unmap_address_range.
> Without eviction, the 64 KiB sub-ranges failed to allocate VRAM during
> subsequent page faults.
> As a result, bindings were being forced from system memory (SMEM) instead of
> VRAM.

Right. But would allocate xe_svm_alloc_vram but not fail (on
!migrate.cpages in drm_gpusvm_migrate_to_devmem) and the subsequent
get_pages not find the old VRAM pages still in place? I guess I'm
looking now and our error handling in the fault handler / prefetch
doesn't handle this scenario correctly but I think it could. Let me try
to tweak these paths today and get back to you - if we can fix that then
I believe this could avoided. For now, yea I think this is correct.

Matt

>  >
> > Matt
> > 
> > > +				__xe_svm_garbage_collector(vm, to_xe_range(range));
> > > +			}
> > > +		}
> > > +	}
> > > +}
> > > +
> > >   /**
> > >    * xe_svm_bo_evict() - SVM evict BO to system memory
> > >    * @bo: BO to evict
> > > diff --git a/drivers/gpu/drm/xe/xe_svm.h b/drivers/gpu/drm/xe/xe_svm.h
> > > index af8f285b6caa..4e5d42323679 100644
> > > --- a/drivers/gpu/drm/xe/xe_svm.h
> > > +++ b/drivers/gpu/drm/xe/xe_svm.h
> > > @@ -92,6 +92,9 @@ bool xe_svm_range_validate(struct xe_vm *vm,
> > >   u64 xe_svm_find_vma_start(struct xe_vm *vm, u64 addr, u64 end,  struct xe_vma *vma);
> > >   u8 xe_svm_ranges_zap_ptes_in_range(struct xe_vm *vm, u64 start, u64 end);
> > > +
> > > +void xe_svm_unmap_address_range(struct xe_vm *vm, u64 start, u64 end);
> > > +
> > >   /**
> > >    * xe_svm_range_has_dma_mapping() - SVM range has DMA mapping
> > >    * @range: SVM range
> > > @@ -312,6 +315,11 @@ u8 xe_svm_ranges_zap_ptes_in_range(struct xe_vm *vm, u64 start, u64 end)
> > >   	return 0;
> > >   }
> > > +static inline
> > > +void xe_svm_unmap_address_range(struct xe_vm *vm, u64 start, u64 end)
> > > +{
> > > +}
> > > +
> > >   #define xe_svm_assert_in_notifier(...) do {} while (0)
> > >   #define xe_svm_range_has_dma_mapping(...) false
> > > diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
> > > index e059d9810d26..0872df8d0b15 100644
> > > --- a/drivers/gpu/drm/xe/xe_vm.c
> > > +++ b/drivers/gpu/drm/xe/xe_vm.c
> > > @@ -2663,8 +2663,12 @@ static int vm_bind_ioctl_ops_parse(struct xe_vm *vm, struct drm_gpuva_ops *ops,
> > >   				end = op->base.remap.next->va.addr;
> > >   			if (xe_vma_is_cpu_addr_mirror(old) &&
> > > -			    xe_svm_has_mapping(vm, start, end))
> > > -				return -EBUSY;
> > > +			    xe_svm_has_mapping(vm, start, end)) {
> > > +				if (vops->flags & XE_VMA_OPS_FLAG_MADVISE)
> > > +					xe_svm_unmap_address_range(vm, start, end);
> > > +				else
> > > +					return -EBUSY;
> > > +			}
> > >   			op->remap.start = xe_vma_start(old);
> > >   			op->remap.range = xe_vma_size(old);
> > > -- 
> > > 2.34.1
> > > 
>