[PATCH v2 26/32] drm/xe/madvise: Update migration policy based on preferred location
Matthew Brost
matthew.brost at intel.com
Wed May 14 22:04:44 UTC 2025
On Mon, Apr 07, 2025 at 03:47:13PM +0530, Himal Prasad Ghimiray wrote:
> When the user sets the valid devmem_fd as a preferred location, GPU fault
> will trigger migration to tile of device associated with devmem_fd.
>
> If the user sets an invalid devmem_fd the preferred location is current
> placement only.
>
> Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray at intel.com>
> ---
> drivers/gpu/drm/xe/xe_svm.c | 15 ++++++++++++++-
> drivers/gpu/drm/xe/xe_vm.h | 3 +++
> drivers/gpu/drm/xe/xe_vm_madvise.c | 20 +++++++++++++++++++-
> 3 files changed, 36 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/xe/xe_svm.c b/drivers/gpu/drm/xe/xe_svm.c
> index d40111e29bfe..60dfb1bf12ca 100644
> --- a/drivers/gpu/drm/xe/xe_svm.c
> +++ b/drivers/gpu/drm/xe/xe_svm.c
> @@ -765,6 +765,12 @@ bool xe_svm_range_needs_migrate_to_vram(struct xe_svm_range *range, struct xe_vm
> return needs_migrate;
> }
>
> +static const u32 region_to_mem_type[] = {
> + XE_PL_TT,
> + XE_PL_VRAM0,
> + XE_PL_VRAM1,
> +};
> +
> /**
> * xe_svm_handle_pagefault() - SVM handle page fault
> * @vm: The VM.
> @@ -796,6 +802,7 @@ int xe_svm_handle_pagefault(struct xe_vm *vm, struct xe_vma *vma,
> struct xe_tile *tile = gt_to_tile(gt);
> int retry_count = 3;
> ktime_t end = 0;
> + u32 region;
> int err;
>
> lockdep_assert_held_write(&vm->lock);
> @@ -820,7 +827,13 @@ int xe_svm_handle_pagefault(struct xe_vm *vm, struct xe_vma *vma,
>
> range_debug(range, "PAGE FAULT");
>
> - if (xe_svm_range_needs_migrate_to_vram(range, vma, IS_DGFX(vm->xe))) {
> + region = vma->attr.preferred_loc.devmem_fd;
Mentioned this earlier in the series - you are assigning a devmem_fd to a
region, which is a bit confusing.
> +
> + if (xe_svm_range_needs_migrate_to_vram(range, vma, region)) {
> + region = region ? region : 1;
I think the default (region unset) should be the VRAM closest to the GT
of the fault.
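
A rough, untested sketch of that default follows. It uses stand-in types
rather than the real xe structures; every sketch_* name is hypothetical,
and it assumes region 0 means "unset" while region N (N >= 1) names the
VRAM of tile N - 1, which is how region_to_mem_type[] is indexed in this
patch:

```c
#include <assert.h>

/* Stand-ins for illustration only; these are not the real xe structures. */
#define SKETCH_MAX_TILES 2

struct sketch_tile {
	unsigned int id;
};

struct sketch_device {
	struct sketch_tile tiles[SKETCH_MAX_TILES];
};

/*
 * Assumption: region == 0 means no preferred region was set, and
 * region N (N >= 1) refers to the VRAM of tile N - 1.
 */
static struct sketch_tile *
sketch_pick_tile(struct sketch_device *dev, struct sketch_tile *fault_tile,
		 unsigned int region)
{
	if (!region)
		return fault_tile;	/* default: VRAM local to the faulting GT */

	assert(region <= SKETCH_MAX_TILES);
	return &dev->tiles[region - 1];
}
```

With this shape the tile derived from the faulting GT is only overridden
when a preferred region was explicitly set.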
> + /* Need rework for multigpu */
> + tile = &vm->xe->tiles[region_to_mem_type[region] - XE_PL_VRAM0];
> +
> err = xe_svm_alloc_vram(vm, tile, range, &ctx);
> if (err) {
> if (retry_count) {
> diff --git a/drivers/gpu/drm/xe/xe_vm.h b/drivers/gpu/drm/xe/xe_vm.h
> index 4e45230b7205..377f62f859b7 100644
> --- a/drivers/gpu/drm/xe/xe_vm.h
> +++ b/drivers/gpu/drm/xe/xe_vm.h
> @@ -220,6 +220,9 @@ int __xe_vm_userptr_needs_repin(struct xe_vm *vm);
>
> int xe_vm_userptr_check_repin(struct xe_vm *vm);
>
> +bool xe_vma_has_preferred_mem_loc(struct xe_vma *vma,
> + u32 *mem_region, u32 *devmem_fd);
> +
> int xe_vm_rebind(struct xe_vm *vm, bool rebind_worker);
> struct dma_fence *xe_vma_rebind(struct xe_vm *vm, struct xe_vma *vma,
> u8 tile_mask);
> diff --git a/drivers/gpu/drm/xe/xe_vm_madvise.c b/drivers/gpu/drm/xe/xe_vm_madvise.c
> index 7e1a95106cb9..f870e8642190 100644
> --- a/drivers/gpu/drm/xe/xe_vm_madvise.c
> +++ b/drivers/gpu/drm/xe/xe_vm_madvise.c
> @@ -61,7 +61,25 @@ static int madvise_preferred_mem_loc(struct xe_device *xe, struct xe_vm *vm,
> struct xe_vma **vmas, int num_vmas,
> struct drm_xe_madvise_ops ops)
> {
> - /* Implementation pending */
> + s32 devmem_fd;
> + u32 migration_policy;
> + int i;
> +
> + xe_assert(vm->xe, ops.type == DRM_XE_VMA_ATTR_PREFERRED_LOC);
> + vm_dbg(&xe->drm, "migration policy = %d, devmem_fd = %d\n",
> + ops.preferred_mem_loc.migration_policy,
> + ops.preferred_mem_loc.devmem_fd);
As mentioned in patch #27, I'm not sure this debug info is all that
useful.
> +
> + devmem_fd = (s32)ops.preferred_mem_loc.devmem_fd;
> + devmem_fd = (devmem_fd < 0) ? 0 : devmem_fd;
> +
Why clamp a negative devmem_fd to 0 here? I'm not following this.
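
If the intent is what the commit message describes (an invalid fd means
the preferred location is the current placement), a named sentinel might
make the clamp self-explanatory. A minimal sketch with hypothetical
sketch_* names, assuming 0 is the "no preferred location" value:

```c
#include <stdint.h>

/* Hypothetical name: 0 is assumed to mean "no preferred location". */
#define SKETCH_PREFERRED_LOC_NONE 0

/* Map the user-supplied fd to the value stored in the VMA attribute. */
static int32_t sketch_normalize_devmem_fd(uint32_t user_fd)
{
	int32_t fd = (int32_t)user_fd;

	/* Negative fds are treated as invalid and collapse to "none". */
	return fd < 0 ? SKETCH_PREFERRED_LOC_NONE : fd;
}
```

This only names the existing behaviour; it does not settle whether
storing an fd where a region is expected is the right interface.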
> + migration_policy = ops.preferred_mem_loc.migration_policy;
> +
As mentioned earlier in the series, I'm confused by migration_policy. It
also looks to be unused, unless I'm missing something?
Matt
> + for (i = 0; i < num_vmas; i++) {
> + vmas[i]->attr.preferred_loc.devmem_fd = devmem_fd;
> + vmas[i]->attr.preferred_loc.migration_policy = migration_policy;
> + }
> +
> return 0;
> }
>
> --
> 2.34.1
>