[PATCH 04/15] drm/xe: Pass down drm_exec context to validation

Matthew Brost matthew.brost at intel.com
Thu Aug 14 19:09:44 UTC 2025


On Thu, Aug 14, 2025 at 09:49:59AM +0200, Thomas Hellström wrote:
> On Wed, 2025-08-13 at 09:42 -0700, Matthew Brost wrote:
> 
> > On Wed, Aug 13, 2025 at 12:51:10PM +0200, Thomas Hellström wrote:
> > 
> > > We want all validation (potential backing store allocation) to be part
> > > of a drm_exec transaction. Therefore add a drm_exec pointer argument
> > > to xe_bo_validate() and ___xe_bo_create_locked(). Upcoming patches
> > > will deal with making all (or nearly all) calls to these functions
> > > part of a drm_exec transaction. In the meantime, define special values
> > > of the drm_exec pointer:
> > > 
> > 
> > 
> > Would the eventual idea be pass the exec further down to TTM?
> 
> 
> Yes. The original series did this, and required multiple changes both to drm_exec and to TTM. Christian had some other ideas, though although the final goal was the same. So it's a task for us and AMD to agree on something here. The TTM object refcount removal series from Christian is a step on the way there.
> 

Ok, I thought that was the idea and wanted to confirm.

Let me look at Christian's series now too.

> 
> >  
> > 
> > > XE_VALIDATION_UNIMPLEMENTED: Implementation of the drm_exec transaction
> > > has not been done yet.
> > > XE_VALIDATION_UNSUPPORTED: Some Middle-layers (dma-buf) doesn't allow
> > > the drm_exec context to be passed down to map_attachment where
> > > validation takes place.
> > 
> > 
> > What is the expected longterm implictation of paths that are
> > UNIMPLEMENTED and UNSUPPORTED?
> 
> 
> IMO Unimplemented should not be allowed moving forward other than for debugging. UNIMPLEMENTED requires a new dma-buf mapping interface with an exec argument. I don't think all peers will support that, though and those won't participate fully in the scheme.
> 

That was my thinking too—once the dma-buf mapping is fixed up, disallow this.

Matt

> 
> 
> > 
> > > XE_VALIDATION_OPT_OUT: May be used only for kunit tests where exhaustive
> > > eviction isn't crucial and the ROI of converting those is very
> > > small.
> > > 
> > > For XE_VALIDATION_UNIMPLEMENTED and XE_VALIDATION_OPT_OUT there is also
> > > a lockdep check that a drm_exec transaction can indeed start at the
> > > location where the macro is expanded. This is to encourage
> > > developers to take this into consideration early in the code
> > > development process.
> > > 
> > > Signed-off-by: Thomas Hellström <[thomas.hellstrom at linux.intel.com](mailto:thomas.hellstrom at linux.intel.com)>
> > > ---
> > >  drivers/gpu/drm/xe/Makefile                   |   1 +
> > >  .../compat-i915-headers/gem/i915_gem_stolen.h |   6 +-
> > >  drivers/gpu/drm/xe/display/xe_fb_pin.c        |   5 +-
> > >  drivers/gpu/drm/xe/tests/xe_bo.c              |  20 +--
> > >  drivers/gpu/drm/xe/tests/xe_dma_buf.c         |  12 +-
> > >  drivers/gpu/drm/xe/tests/xe_migrate.c         |  45 +++---
> > >  drivers/gpu/drm/xe/xe_bo.c                    | 129 +++++++++++++++---
> > >  drivers/gpu/drm/xe/xe_bo.h                    |  20 +--
> > >  drivers/gpu/drm/xe/xe_dma_buf.c               |  19 ++-
> > >  drivers/gpu/drm/xe/xe_exec.c                  |   6 +-
> > >  drivers/gpu/drm/xe/xe_ggtt.c                  |  15 +-
> > >  drivers/gpu/drm/xe/xe_ggtt.h                  |   5 +-
> > >  drivers/gpu/drm/xe/xe_gt_pagefault.c          |   4 +-
> > >  drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c    |   6 +-
> > >  drivers/gpu/drm/xe/xe_svm.c                   |   4 +-
> > >  drivers/gpu/drm/xe/xe_validation.c            |  49 +++++++
> > >  drivers/gpu/drm/xe/xe_validation.h            |  69 ++++++++++
> > >  drivers/gpu/drm/xe/xe_vm.c                    |  26 +++-
> > >  drivers/gpu/drm/xe/xe_vm.h                    |  33 ++++-
> > >  drivers/gpu/drm/xe/xe_vm_types.h              |  32 +++--
> > >  20 files changed, 401 insertions(+), 105 deletions(-)
> > >  create mode 100644 drivers/gpu/drm/xe/xe_validation.c
> > >  create mode 100644 drivers/gpu/drm/xe/xe_validation.h
> > > 
> > > diff --git a/drivers/gpu/drm/xe/Makefile b/drivers/gpu/drm/xe/Makefile
> > > index 8e0c3412a757..8ee7d275128d 100644
> > > --- a/drivers/gpu/drm/xe/Makefile
> > > +++ b/drivers/gpu/drm/xe/Makefile
> > > @@ -127,6 +127,7 @@ xe-y += xe_bb.o \
> > >  	xe_tuning.o \
> > >  	xe_uc.o \
> > >  	xe_uc_fw.o \
> > > +	xe_validation.o \
> > >  	xe_vm.o \
> > >  	xe_vram.o \
> > >  	xe_vram_freq.o \
> > > diff --git a/drivers/gpu/drm/xe/compat-i915-headers/gem/i915_gem_stolen.h b/drivers/gpu/drm/xe/compat-i915-headers/gem/i915_gem_stolen.h
> > > index 41d39d67817a..1ce1e9da975b 100644
> > > --- a/drivers/gpu/drm/xe/compat-i915-headers/gem/i915_gem_stolen.h
> > > +++ b/drivers/gpu/drm/xe/compat-i915-headers/gem/i915_gem_stolen.h
> > > @@ -8,6 +8,7 @@
> > >  
> > >  #include "xe_ttm_stolen_mgr.h"
> > >  #include "xe_res_cursor.h"
> > > +#include "xe_validation.h"
> > >  
> > >  struct xe_bo;
> > >  
> > > @@ -20,6 +21,7 @@ static inline int i915_gem_stolen_insert_node_in_range(struct xe_device *xe,
> > >  						       u32 size, u32 align,
> > >  						       u32 start, u32 end)
> > >  {
> > > +	struct drm_exec *exec = XE_VALIDATION_UNIMPLEMENTED;
> > >  	struct xe_bo *bo;
> > >  	int err;
> > >  	u32 flags = XE_BO_FLAG_PINNED | XE_BO_FLAG_STOLEN;
> > > @@ -34,13 +36,13 @@ static inline int i915_gem_stolen_insert_node_in_range(struct xe_device *xe,
> > >  
> > >  	bo = xe_bo_create_locked_range(xe, xe_device_get_root_tile(xe),
> > >  				       NULL, size, start, end,
> > > -				       ttm_bo_type_kernel, flags, 0);
> > > +				       ttm_bo_type_kernel, flags, 0, exec);
> > >  	if (IS_ERR(bo)) {
> > >  		err = PTR_ERR(bo);
> > >  		bo = NULL;
> > >  		return err;
> > >  	}
> > > -	err = xe_bo_pin(bo);
> > > +	err = xe_bo_pin(bo, exec);
> > >  	xe_bo_unlock_vm_held(bo);
> > >  
> > >  	if (err) {
> > > diff --git a/drivers/gpu/drm/xe/display/xe_fb_pin.c b/drivers/gpu/drm/xe/display/xe_fb_pin.c
> > > index f1f8b5ab53ef..4b0748e6fdd6 100644
> > > --- a/drivers/gpu/drm/xe/display/xe_fb_pin.c
> > > +++ b/drivers/gpu/drm/xe/display/xe_fb_pin.c
> > > @@ -281,6 +281,7 @@ static struct i915_vma *__xe_pin_fb_vma(const struct intel_framebuffer *fb,
> > >  	struct i915_vma *vma = kzalloc(sizeof(*vma), GFP_KERNEL);
> > >  	struct drm_gem_object *obj = intel_fb_bo(&fb->base);
> > >  	struct xe_bo *bo = gem_to_xe_bo(obj);
> > > +	struct drm_exec *exec = XE_VALIDATION_UNIMPLEMENTED;
> > >  	int ret;
> > >  
> > >  	if (!vma)
> > > @@ -313,9 +314,9 @@ static struct i915_vma *__xe_pin_fb_vma(const struct intel_framebuffer *fb,
> > >  		goto err;
> > >  
> > >  	if (IS_DGFX(xe))
> > > -		ret = xe_bo_migrate(bo, XE_PL_VRAM0);
> > > +		ret = xe_bo_migrate(bo, XE_PL_VRAM0, exec);
> > >  	else
> > > -		ret = xe_bo_validate(bo, NULL, true);
> > > +		ret = xe_bo_validate(bo, NULL, true, exec);
> > >  	if (!ret)
> > >  		ttm_bo_pin(&bo->ttm);
> > >  	ttm_bo_unreserve(&bo->ttm);
> > > diff --git a/drivers/gpu/drm/xe/tests/xe_bo.c b/drivers/gpu/drm/xe/tests/xe_bo.c
> > > index bb469096d072..06ceba6c3c25 100644
> > > --- a/drivers/gpu/drm/xe/tests/xe_bo.c
> > > +++ b/drivers/gpu/drm/xe/tests/xe_bo.c
> > > @@ -23,7 +23,7 @@
> > >  
> > >  static int ccs_test_migrate(struct xe_tile *tile, struct xe_bo *bo,
> > >  			    bool clear, u64 get_val, u64 assign_val,
> > > -			    struct kunit *test)
> > > +			    struct kunit *test, struct drm_exec *exec)
> > >  {
> > >  	struct dma_fence *fence;
> > >  	struct ttm_tt *ttm;
> > > @@ -35,7 +35,7 @@ static int ccs_test_migrate(struct xe_tile *tile, struct xe_bo *bo,
> > >  	u32 offset;
> > >  
> > >  	/* Move bo to VRAM if not already there. */
> > > -	ret = xe_bo_validate(bo, NULL, false);
> > > +	ret = xe_bo_validate(bo, NULL, false, exec);
> > >  	if (ret) {
> > >  		KUNIT_FAIL(test, "Failed to validate bo.\n");
> > >  		return ret;
> > > @@ -60,7 +60,7 @@ static int ccs_test_migrate(struct xe_tile *tile, struct xe_bo *bo,
> > >  	}
> > >  
> > >  	/* Evict to system. CCS data should be copied. */
> > > -	ret = xe_bo_evict(bo);
> > > +	ret = xe_bo_evict(bo, exec);
> > >  	if (ret) {
> > >  		KUNIT_FAIL(test, "Failed to evict bo.\n");
> > >  		return ret;
> > > @@ -132,6 +132,7 @@ static void ccs_test_run_tile(struct xe_device *xe, struct xe_tile *tile,
> > >  
> > >  	/* TODO: Sanity check */
> > >  	unsigned int bo_flags = XE_BO_FLAG_VRAM_IF_DGFX(tile);
> > > +	struct drm_exec *exec = XE_VALIDATION_OPT_OUT;
> > >  
> > >  	if (IS_DGFX(xe))
> > >  		kunit_info(test, "Testing vram id %u\n", tile->id);
> > > @@ -149,18 +150,18 @@ static void ccs_test_run_tile(struct xe_device *xe, struct xe_tile *tile,
> > >  
> > >  	kunit_info(test, "Verifying that CCS data is cleared on creation.\n");
> > >  	ret = ccs_test_migrate(tile, bo, false, 0ULL, 0xdeadbeefdeadbeefULL,
> > > -			       test);
> > > +			       test, exec);
> > >  	if (ret)
> > >  		goto out_unlock;
> > >  
> > >  	kunit_info(test, "Verifying that CCS data survives migration.\n");
> > >  	ret = ccs_test_migrate(tile, bo, false, 0xdeadbeefdeadbeefULL,
> > > -			       0xdeadbeefdeadbeefULL, test);
> > > +			       0xdeadbeefdeadbeefULL, test, exec);
> > >  	if (ret)
> > >  		goto out_unlock;
> > >  
> > >  	kunit_info(test, "Verifying that CCS data can be properly cleared.\n");
> > > -	ret = ccs_test_migrate(tile, bo, true, 0ULL, 0ULL, test);
> > > +	ret = ccs_test_migrate(tile, bo, true, 0ULL, 0ULL, test, exec);
> > >  
> > >  out_unlock:
> > >  	xe_bo_unlock(bo);
> > > @@ -210,6 +211,7 @@ static int evict_test_run_tile(struct xe_device *xe, struct xe_tile *tile, struc
> > >  	struct xe_bo *bo, *external;
> > >  	unsigned int bo_flags = XE_BO_FLAG_VRAM_IF_DGFX(tile);
> > >  	struct xe_vm *vm = xe_migrate_get_vm(xe_device_get_root_tile(xe)->migrate);
> > > +	struct drm_exec *exec = XE_VALIDATION_OPT_OUT;
> > >  	struct xe_gt *__gt;
> > >  	int err, i, id;
> > >  
> > > @@ -236,7 +238,7 @@ static int evict_test_run_tile(struct xe_device *xe, struct xe_tile *tile, struc
> > >  		}
> > >  
> > >  		xe_bo_lock(external, false);
> > > -		err = xe_bo_pin_external(external);
> > > +		err = xe_bo_pin_external(external, exec);
> > >  		xe_bo_unlock(external);
> > >  		if (err) {
> > >  			KUNIT_FAIL(test, "external bo pin err=%pe\n",
> > > @@ -294,7 +296,7 @@ static int evict_test_run_tile(struct xe_device *xe, struct xe_tile *tile, struc
> > >  		if (i) {
> > >  			down_read(&vm->lock);
> > >  			xe_vm_lock(vm, false);
> > > -			err = xe_bo_validate(bo, bo->vm, false);
> > > +			err = xe_bo_validate(bo, bo->vm, false, exec);
> > >  			xe_vm_unlock(vm);
> > >  			up_read(&vm->lock);
> > >  			if (err) {
> > > @@ -303,7 +305,7 @@ static int evict_test_run_tile(struct xe_device *xe, struct xe_tile *tile, struc
> > >  				goto cleanup_all;
> > >  			}
> > >  			xe_bo_lock(external, false);
> > > -			err = xe_bo_validate(external, NULL, false);
> > > +			err = xe_bo_validate(external, NULL, false, exec);
> > >  			xe_bo_unlock(external);
> > >  			if (err) {
> > >  				KUNIT_FAIL(test, "external bo valid err=%pe\n",
> > > diff --git a/drivers/gpu/drm/xe/tests/xe_dma_buf.c b/drivers/gpu/drm/xe/tests/xe_dma_buf.c
> > > index cde9530bef8c..965dd3280468 100644
> > > --- a/drivers/gpu/drm/xe/tests/xe_dma_buf.c
> > > +++ b/drivers/gpu/drm/xe/tests/xe_dma_buf.c
> > > @@ -27,7 +27,8 @@ static bool is_dynamic(struct dma_buf_test_params *params)
> > >  }
> > >  
> > >  static void check_residency(struct kunit *test, struct xe_bo *exported,
> > > -			    struct xe_bo *imported, struct dma_buf *dmabuf)
> > > +			    struct xe_bo *imported, struct dma_buf *dmabuf,
> > > +			    struct drm_exec *exec)
> > >  {
> > >  	struct dma_buf_test_params *params = to_dma_buf_test_params(test->priv);
> > >  	u32 mem_type;
> > > @@ -62,7 +63,7 @@ static void check_residency(struct kunit *test, struct xe_bo *exported,
> > >  	 * importer is on a different device. If they're on the same device,
> > >  	 * the exporter and the importer should be the same bo.
> > >  	 */
> > > -	ret = xe_bo_evict(exported);
> > > +	ret = xe_bo_evict(exported, exec);
> > >  	if (ret) {
> > >  		if (ret != -EINTR && ret != -ERESTARTSYS)
> > >  			KUNIT_FAIL(test, "Evicting exporter failed with err=%d.\n",
> > > @@ -77,7 +78,7 @@ static void check_residency(struct kunit *test, struct xe_bo *exported,
> > >  	}
> > >  
> > >  	/* Re-validate the importer. This should move also exporter in. */
> > > -	ret = xe_bo_validate(imported, NULL, false);
> > > +	ret = xe_bo_validate(imported, NULL, false, exec);
> > >  	if (ret) {
> > >  		if (ret != -EINTR && ret != -ERESTARTSYS)
> > >  			KUNIT_FAIL(test, "Validating importer failed with err=%d.\n",
> > > @@ -150,11 +151,12 @@ static void xe_test_dmabuf_import_same_driver(struct xe_device *xe)
> > >  			KUNIT_FAIL(test,
> > >  				   "xe_gem_prime_import() succeeded when it shouldn't have\n");
> > >  		} else {
> > > +			struct drm_exec *exec = XE_VALIDATION_OPT_OUT;
> > >  			int err;
> > >  
> > >  			/* Is everything where we expect it to be? */
> > >  			xe_bo_lock(import_bo, false);
> > > -			err = xe_bo_validate(import_bo, NULL, false);
> > > +			err = xe_bo_validate(import_bo, NULL, false, exec);
> > >  
> > >  			/* Pinning in VRAM is not allowed. */
> > >  			if (!is_dynamic(params) &&
> > > @@ -167,7 +169,7 @@ static void xe_test_dmabuf_import_same_driver(struct xe_device *xe)
> > >  						  err == -ERESTARTSYS);
> > >  
> > >  			if (!err)
> > > -				check_residency(test, bo, import_bo, dmabuf);
> > > +				check_residency(test, bo, import_bo, dmabuf, exec);
> > >  			xe_bo_unlock(import_bo);
> > >  		}
> > >  		drm_gem_object_put(import);
> > > diff --git a/drivers/gpu/drm/xe/tests/xe_migrate.c b/drivers/gpu/drm/xe/tests/xe_migrate.c
> > > index edd1e701aa1c..dfb445d09759 100644
> > > --- a/drivers/gpu/drm/xe/tests/xe_migrate.c
> > > +++ b/drivers/gpu/drm/xe/tests/xe_migrate.c
> > > @@ -70,7 +70,7 @@ static int run_sanity_job(struct xe_migrate *m, struct xe_device *xe,
> > >  		} } while (0)
> > >  
> > >  static void test_copy(struct xe_migrate *m, struct xe_bo *bo,
> > > -		      struct kunit *test, u32 region)
> > > +		      struct kunit *test, u32 region, struct drm_exec *exec)
> > >  {
> > >  	struct xe_device *xe = tile_to_xe(m->tile);
> > >  	u64 retval, expected = 0;
> > > @@ -84,14 +84,15 @@ static void test_copy(struct xe_migrate *m, struct xe_bo *bo,
> > >  						   ttm_bo_type_kernel,
> > >  						   region |
> > >  						   XE_BO_FLAG_NEEDS_CPU_ACCESS |
> > > -						   XE_BO_FLAG_PINNED);
> > > +						   XE_BO_FLAG_PINNED,
> > > +						   exec);
> > >  	if (IS_ERR(remote)) {
> > >  		KUNIT_FAIL(test, "Failed to allocate remote bo for %s: %pe\n",
> > >  			   str, remote);
> > >  		return;
> > >  	}
> > >  
> > > -	err = xe_bo_validate(remote, NULL, false);
> > > +	err = xe_bo_validate(remote, NULL, false, exec);
> > >  	if (err) {
> > >  		KUNIT_FAIL(test, "Failed to validate system bo for %s: %i\n",
> > >  			   str, err);
> > > @@ -161,13 +162,13 @@ static void test_copy(struct xe_migrate *m, struct xe_bo *bo,
> > >  }
> > >  
> > >  static void test_copy_sysmem(struct xe_migrate *m, struct xe_bo *bo,
> > > -			     struct kunit *test)
> > > +			     struct drm_exec *exec, struct kunit *test)
> > >  {
> > > -	test_copy(m, bo, test, XE_BO_FLAG_SYSTEM);
> > > +	test_copy(m, bo, test, XE_BO_FLAG_SYSTEM, exec);
> > >  }
> > >  
> > >  static void test_copy_vram(struct xe_migrate *m, struct xe_bo *bo,
> > > -			   struct kunit *test)
> > > +			   struct drm_exec *exec, struct kunit *test)
> > >  {
> > >  	u32 region;
> > >  
> > > @@ -178,10 +179,11 @@ static void test_copy_vram(struct xe_migrate *m, struct xe_bo *bo,
> > >  		region = XE_BO_FLAG_VRAM1;
> > >  	else
> > >  		region = XE_BO_FLAG_VRAM0;
> > > -	test_copy(m, bo, test, region);
> > > +	test_copy(m, bo, test, region, exec);
> > >  }
> > >  
> > > -static void xe_migrate_sanity_test(struct xe_migrate *m, struct kunit *test)
> > > +static void xe_migrate_sanity_test(struct xe_migrate *m, struct kunit *test,
> > > +				   struct drm_exec *exec)
> > >  {
> > >  	struct xe_tile *tile = m->tile;
> > >  	struct xe_device *xe = tile_to_xe(tile);
> > > @@ -290,10 +292,10 @@ static void xe_migrate_sanity_test(struct xe_migrate *m, struct kunit *test)
> > >  	check(retval, expected, "Command clear small last value", test);
> > >  
> > >  	kunit_info(test, "Copying small buffer object to system\n");
> > > -	test_copy_sysmem(m, tiny, test);
> > > +	test_copy_sysmem(m, tiny, exec, test);
> > >  	if (xe->info.tile_count > 1) {
> > >  		kunit_info(test, "Copying small buffer object to other vram\n");
> > > -		test_copy_vram(m, tiny, test);
> > > +		test_copy_vram(m, tiny, exec, test);
> > >  	}
> > >  
> > >  	/* Clear a big bo */
> > > @@ -312,10 +314,10 @@ static void xe_migrate_sanity_test(struct xe_migrate *m, struct kunit *test)
> > >  	check(retval, expected, "Command clear big last value", test);
> > >  
> > >  	kunit_info(test, "Copying big buffer object to system\n");
> > > -	test_copy_sysmem(m, big, test);
> > > +	test_copy_sysmem(m, big, exec, test);
> > >  	if (xe->info.tile_count > 1) {
> > >  		kunit_info(test, "Copying big buffer object to other vram\n");
> > > -		test_copy_vram(m, big, test);
> > > +		test_copy_vram(m, big, exec, test);
> > >  	}
> > >  
> > >  out:
> > > @@ -343,10 +345,11 @@ static int migrate_test_run_device(struct xe_device *xe)
> > >  
> > >  	for_each_tile(tile, xe, id) {
> > >  		struct xe_migrate *m = tile->migrate;
> > > +		struct drm_exec *exec = XE_VALIDATION_OPT_OUT;
> > >  
> > >  		kunit_info(test, "Testing tile id %d.\n", id);
> > >  		xe_vm_lock(m->q->vm, false);
> > > -		xe_migrate_sanity_test(m, test);
> > > +		xe_migrate_sanity_test(m, test, exec);
> > >  		xe_vm_unlock(m->q->vm);
> > >  	}
> > >  
> > > @@ -490,7 +493,7 @@ static struct dma_fence *blt_copy(struct xe_tile *tile,
> > >  
> > >  static void test_migrate(struct xe_device *xe, struct xe_tile *tile,
> > >  			 struct xe_bo *sys_bo, struct xe_bo *vram_bo, struct xe_bo *ccs_bo,
> > > -			 struct kunit *test)
> > > +			 struct drm_exec *exec, struct kunit *test)
> > >  {
> > >  	struct dma_fence *fence;
> > >  	u64 expected, retval;
> > > @@ -509,7 +512,7 @@ static void test_migrate(struct xe_device *xe, struct xe_tile *tile,
> > >  	dma_fence_put(fence);
> > >  
> > >  	kunit_info(test, "Evict vram buffer object\n");
> > > -	ret = xe_bo_evict(vram_bo);
> > > +	ret = xe_bo_evict(vram_bo, exec);
> > >  	if (ret) {
> > >  		KUNIT_FAIL(test, "Failed to evict bo.\n");
> > >  		return;
> > > @@ -538,7 +541,7 @@ static void test_migrate(struct xe_device *xe, struct xe_tile *tile,
> > >  	dma_fence_put(fence);
> > >  
> > >  	kunit_info(test, "Restore vram buffer object\n");
> > > -	ret = xe_bo_validate(vram_bo, NULL, false);
> > > +	ret = xe_bo_validate(vram_bo, NULL, false, exec);
> > >  	if (ret) {
> > >  		KUNIT_FAIL(test, "Failed to validate vram bo for: %li\n", ret);
> > >  		return;
> > > @@ -636,6 +639,7 @@ static void validate_ccs_test_run_tile(struct xe_device *xe, struct xe_tile *til
> > >  {
> > >  	struct xe_bo *sys_bo, *vram_bo = NULL, *ccs_bo = NULL;
> > >  	unsigned int bo_flags = XE_BO_FLAG_VRAM_IF_DGFX(tile);
> > > +	struct drm_exec *exec;
> > >  	long ret;
> > >  
> > >  	sys_bo = xe_bo_create_user(xe, NULL, NULL, SZ_4M,
> > > @@ -650,8 +654,9 @@ static void validate_ccs_test_run_tile(struct xe_device *xe, struct xe_tile *til
> > >  		return;
> > >  	}
> > >  
> > > +	exec = XE_VALIDATION_OPT_OUT;
> > >  	xe_bo_lock(sys_bo, false);
> > > -	ret = xe_bo_validate(sys_bo, NULL, false);
> > > +	ret = xe_bo_validate(sys_bo, NULL, false, exec);
> > >  	if (ret) {
> > >  		KUNIT_FAIL(test, "Failed to validate system bo for: %li\n", ret);
> > >  		goto free_sysbo;
> > > @@ -676,7 +681,7 @@ static void validate_ccs_test_run_tile(struct xe_device *xe, struct xe_tile *til
> > >  	}
> > >  
> > >  	xe_bo_lock(ccs_bo, false);
> > > -	ret = xe_bo_validate(ccs_bo, NULL, false);
> > > +	ret = xe_bo_validate(ccs_bo, NULL, false, exec);
> > >  	if (ret) {
> > >  		KUNIT_FAIL(test, "Failed to validate system bo for: %li\n", ret);
> > >  		goto free_ccsbo;
> > > @@ -700,7 +705,7 @@ static void validate_ccs_test_run_tile(struct xe_device *xe, struct xe_tile *til
> > >  	}
> > >  
> > >  	xe_bo_lock(vram_bo, false);
> > > -	ret = xe_bo_validate(vram_bo, NULL, false);
> > > +	ret = xe_bo_validate(vram_bo, NULL, false, exec);
> > >  	if (ret) {
> > >  		KUNIT_FAIL(test, "Failed to validate vram bo for: %li\n", ret);
> > >  		goto free_vrambo;
> > > @@ -713,7 +718,7 @@ static void validate_ccs_test_run_tile(struct xe_device *xe, struct xe_tile *til
> > >  	}
> > >  
> > >  	test_clear(xe, tile, sys_bo, vram_bo, test);
> > > -	test_migrate(xe, tile, sys_bo, vram_bo, ccs_bo, test);
> > > +	test_migrate(xe, tile, sys_bo, vram_bo, ccs_bo, exec, test);
> > >  	xe_bo_unlock(vram_bo);
> > >  
> > >  	xe_bo_lock(vram_bo, false);
> > > diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c
> > > index 11eaf3b06766..e71addf51ed0 100644
> > > --- a/drivers/gpu/drm/xe/xe_bo.c
> > > +++ b/drivers/gpu/drm/xe/xe_bo.c
> > > @@ -1139,6 +1139,7 @@ long xe_bo_shrink(struct ttm_operation_ctx *ctx, struct ttm_buffer_object *bo,
> > >  int xe_bo_notifier_prepare_pinned(struct xe_bo *bo)
> > >  {
> > >  	struct xe_device *xe = ttm_to_xe_device(bo->ttm.bdev);
> > > +	struct drm_exec *exec = XE_VALIDATION_UNIMPLEMENTED;
> > >  	struct xe_bo *backup;
> > >  	int ret = 0;
> > >  
> > > @@ -1163,7 +1164,7 @@ int xe_bo_notifier_prepare_pinned(struct xe_bo *bo)
> > >  	backup = ___xe_bo_create_locked(xe, NULL, NULL, bo->ttm.base.resv, NULL, xe_bo_size(bo),
> > >  					DRM_XE_GEM_CPU_CACHING_WB, ttm_bo_type_kernel,
> > >  					XE_BO_FLAG_SYSTEM | XE_BO_FLAG_NEEDS_CPU_ACCESS |
> > > -					XE_BO_FLAG_PINNED);
> > > +					XE_BO_FLAG_PINNED, exec);
> > >  	if (IS_ERR(backup)) {
> > >  		ret = PTR_ERR(backup);
> > >  		goto out_unlock_bo;
> > > @@ -1214,6 +1215,7 @@ int xe_bo_notifier_unprepare_pinned(struct xe_bo *bo)
> > >  int xe_bo_evict_pinned(struct xe_bo *bo)
> > >  {
> > >  	struct xe_device *xe = ttm_to_xe_device(bo->ttm.bdev);
> > > +	struct drm_exec *exec = XE_VALIDATION_UNIMPLEMENTED;
> > >  	struct xe_bo *backup = bo->backup_obj;
> > >  	bool backup_created = false;
> > >  	bool unmap = false;
> > > @@ -1242,7 +1244,7 @@ int xe_bo_evict_pinned(struct xe_bo *bo)
> > >  						NULL, xe_bo_size(bo),
> > >  						DRM_XE_GEM_CPU_CACHING_WB, ttm_bo_type_kernel,
> > >  						XE_BO_FLAG_SYSTEM | XE_BO_FLAG_NEEDS_CPU_ACCESS |
> > > -						XE_BO_FLAG_PINNED);
> > > +						XE_BO_FLAG_PINNED, exec);
> > >  		if (IS_ERR(backup)) {
> > >  			ret = PTR_ERR(backup);
> > >  			goto out_unlock_bo;
> > > @@ -1718,12 +1720,14 @@ static vm_fault_t xe_gem_fault(struct vm_fault *vmf)
> > >  	struct xe_device *xe = to_xe_device(ddev);
> > >  	struct xe_bo *bo = ttm_to_xe_bo(tbo);
> > >  	bool needs_rpm = bo->flags & XE_BO_FLAG_VRAM_MASK;
> > > +	struct drm_exec *exec;
> > >  	vm_fault_t ret;
> > >  	int idx;
> > >  
> > >  	if (needs_rpm)
> > >  		xe_pm_runtime_get(xe);
> > >  
> > > +	exec = XE_VALIDATION_UNIMPLEMENTED;
> > >  	ret = ttm_bo_vm_reserve(tbo, vmf);
> > >  	if (ret)
> > >  		goto out;
> > > @@ -1731,6 +1735,7 @@ static vm_fault_t xe_gem_fault(struct vm_fault *vmf)
> > >  	if (drm_dev_enter(ddev, &idx)) {
> > >  		trace_xe_bo_cpu_fault(bo);
> > >  
> > > +		xe_validation_assert_exec(xe, exec, &tbo->base);
> > >  		ret = ttm_bo_vm_fault_reserved(vmf, vmf->vma->vm_page_prot,
> > >  					       TTM_BO_VM_NUM_PREFAULT);
> > >  		drm_dev_exit(idx);
> > > @@ -1850,11 +1855,32 @@ void xe_bo_free(struct xe_bo *bo)
> > >  	kfree(bo);
> > >  }
> > >  
> > > +/**
> > > + * ___xe_bo_create_locked() - Initialize or create an xe_bo.
> > > + * @xe: The xe device.
> > > + * @bo: An already allocated buffer object or NULL
> > > + * if the function should allocate a new one.
> > > + * @tile: The tile to select for migration of this bo, and the tile used for
> > > + * GGTT binding if any. Only to be non-NULL for ttm_bo_type_kernel bos.
> > > + * @resv: Pointer to a locked shared reservation object to use fo this bo,
> > > + * or NULL for the xe_bo to use its own.
> > > + * @bulk: The bulk move to use for LRU bumping, or NULL for external bos.
> > > + * @size: The storage size to use for the bo.
> > > + * @cpu_caching: The cpu caching used for system memory backing store.
> > > + * @type: The TTM buffer object type.
> > > + * @flags: XE_BO_FLAG_ flags.
> > > + * @exec: The drm_exec transaction to use for exhaustive eviction.
> > > + *
> > > + * Initialize or create an xe buffer object. On failure, any allocated buffer
> > > + * object passed in @bo will have been unreferenced.
> > > + *
> > > + * Return: The buffer object on success. Negative error pointer on failure.
> > > + */
> > >  struct xe_bo *___xe_bo_create_locked(struct xe_device *xe, struct xe_bo *bo,
> > >  				     struct xe_tile *tile, struct dma_resv *resv,
> > >  				     struct ttm_lru_bulk_move *bulk, size_t size,
> > >  				     u16 cpu_caching, enum ttm_bo_type type,
> > > -				     u32 flags)
> > > +				     u32 flags, struct drm_exec *exec)
> > >  {
> > >  	struct ttm_operation_ctx ctx = {
> > >  		.interruptible = true,
> > > @@ -1923,6 +1949,7 @@ struct xe_bo *___xe_bo_create_locked(struct xe_device *xe, struct xe_bo *bo,
> > >  		ctx.resv = resv;
> > >  	}
> > >  
> > > +	xe_validation_assert_exec(xe, exec, &bo->ttm.base);
> > >  	if (!(flags & XE_BO_FLAG_FIXED_PLACEMENT)) {
> > >  		err = __xe_bo_placement_for_flags(xe, bo, bo->flags);
> > >  		if (WARN_ON(err)) {
> > > @@ -2024,7 +2051,7 @@ __xe_bo_create_locked(struct xe_device *xe,
> > >  		      struct xe_tile *tile, struct xe_vm *vm,
> > >  		      size_t size, u64 start, u64 end,
> > >  		      u16 cpu_caching, enum ttm_bo_type type, u32 flags,
> > > -		      u64 alignment)
> > > +		      u64 alignment, struct drm_exec *exec)
> > >  {
> > >  	struct xe_bo *bo = NULL;
> > >  	int err;
> > > @@ -2049,7 +2076,7 @@ __xe_bo_create_locked(struct xe_device *xe,
> > >  				    vm && !xe_vm_in_fault_mode(vm) &&
> > >  				    flags & XE_BO_FLAG_USER ?
> > >  				    &vm->lru_bulk_move : NULL, size,
> > > -				    cpu_caching, type, flags);
> > > +				    cpu_caching, type, flags, exec);
> > >  	if (IS_ERR(bo))
> > >  		return bo;
> > >  
> > > @@ -2083,9 +2110,10 @@ __xe_bo_create_locked(struct xe_device *xe,
> > >  
> > >  			if (flags & XE_BO_FLAG_FIXED_PLACEMENT) {
> > >  				err = xe_ggtt_insert_bo_at(t->mem.ggtt, bo,
> > > -							   start + xe_bo_size(bo), U64_MAX);
> > > +							   start + xe_bo_size(bo), U64_MAX,
> > > +							   exec);
> > >  			} else {
> > > -				err = xe_ggtt_insert_bo(t->mem.ggtt, bo);
> > > +				err = xe_ggtt_insert_bo(t->mem.ggtt, bo, exec);
> > >  			}
> > >  			if (err)
> > >  				goto err_unlock_put_bo;
> > > @@ -2102,22 +2130,59 @@ __xe_bo_create_locked(struct xe_device *xe,
> > >  	return ERR_PTR(err);
> > >  }
> > >  
> > > +/**
> > > + * xe_bo_create_locked_range() - Create a BO with range- and alignment options
> > > + * @xe: The xe device.
> > > + * @tile: The tile to select for migration of this bo, and the tile used for
> > > + * GGTT binding if any. Only to be non-NULL for ttm_bo_type_kernel bos.
> > > + * @vm: The local vm or NULL for external objects.
> > > + * @size: The storage size to use for the bo.
> > > + * @start: Start of fixed VRAM range or 0.
> > > + * @end: End of fixed VRAM range or ~0ULL.
> > > + * @type: The TTM buffer object type.
> > > + * @flags: XE_BO_FLAG_ flags.
> > > + * @alignment: For GGTT buffer objects, the minimum GGTT alignment.
> > > + * @exec: The drm_exec transaction to use for exhaustive eviction.
> > > + *
> > > + * Create an Xe BO with range- and alignment options. If @start and @end indicate
> > > + * a fixed VRAM range, this must be a ttm_bo_type_kernel bo with VRAM placement
> > > + * only. The @alignment parameter can be used for GGTT alignment.
> > > + *
> > > + * Return: The buffer object on success. Negative error pointer on failure.
> > > + */
> > >  struct xe_bo *
> > >  xe_bo_create_locked_range(struct xe_device *xe,
> > >  			  struct xe_tile *tile, struct xe_vm *vm,
> > >  			  size_t size, u64 start, u64 end,
> > > -			  enum ttm_bo_type type, u32 flags, u64 alignment)
> > > +			  enum ttm_bo_type type, u32 flags, u64 alignment,
> > > +			  struct drm_exec *exec)
> > >  {
> > >  	return __xe_bo_create_locked(xe, tile, vm, size, start, end, 0, type,
> > > -				     flags, alignment);
> > > +				     flags, alignment, exec);
> > >  }
> > >  
> > > +/**
> > > + * xe_bo_create_locked() - Create a BO
> > > + * @xe: The xe device.
> > > + * @tile: The tile to select for migration of this bo, and the tile used for
> > > + * GGTT binding if any. Only to be non-NULL for ttm_bo_type_kernel bos.
> > > + * @vm: The local vm or NULL for external objects.
> > > + * @size: The storage size to use for the bo.
> > > + * @type: The TTM buffer object type.
> > > + * @flags: XE_BO_FLAG_ flags.
> > > + * @exec: The drm_exec transaction to use for exhaustive eviction.
> > > + *
> > > + * Create a locked xe BO with no range- nor alignment restrictions.
> > > + *
> > > + * Return: The buffer object on success. Negative error pointer on failure.
> > > + */
> > >  struct xe_bo *xe_bo_create_locked(struct xe_device *xe, struct xe_tile *tile,
> > >  				  struct xe_vm *vm, size_t size,
> > > -				  enum ttm_bo_type type, u32 flags)
> > > +				  enum ttm_bo_type type, u32 flags,
> > > +				  struct drm_exec *exec)
> > >  {
> > >  	return __xe_bo_create_locked(xe, tile, vm, size, 0, ~0ULL, 0, type,
> > > -				     flags, 0);
> > > +				     flags, 0, exec);
> > >  }
> > >  
> > >  struct xe_bo *xe_bo_create_user(struct xe_device *xe, struct xe_tile *tile,
> > > @@ -2125,9 +2190,10 @@ struct xe_bo *xe_bo_create_user(struct xe_device *xe, struct xe_tile *tile,
> > >  				u16 cpu_caching,
> > >  				u32 flags)
> > >  {
> > > +	struct drm_exec *exec = vm ? xe_vm_validation_exec(vm) : XE_VALIDATION_UNIMPLEMENTED;
> > >  	struct xe_bo *bo = __xe_bo_create_locked(xe, tile, vm, size, 0, ~0ULL,
> > >  						 cpu_caching, ttm_bo_type_device,
> > > -						 flags | XE_BO_FLAG_USER, 0);
> > > +						 flags | XE_BO_FLAG_USER, 0, exec);
> > >  	if (!IS_ERR(bo))
> > >  		xe_bo_unlock_vm_held(bo);
> > >  
> > > @@ -2138,7 +2204,8 @@ struct xe_bo *xe_bo_create(struct xe_device *xe, struct xe_tile *tile,
> > >  			   struct xe_vm *vm, size_t size,
> > >  			   enum ttm_bo_type type, u32 flags)
> > >  {
> > > -	struct xe_bo *bo = xe_bo_create_locked(xe, tile, vm, size, type, flags);
> > > +	struct drm_exec *exec = vm ? xe_vm_validation_exec(vm) : XE_VALIDATION_UNIMPLEMENTED;
> > > +	struct xe_bo *bo = xe_bo_create_locked(xe, tile, vm, size, type, flags, exec);
> > >  
> > >  	if (!IS_ERR(bo))
> > >  		xe_bo_unlock_vm_held(bo);
> > > @@ -2166,6 +2233,7 @@ struct xe_bo *xe_bo_create_pin_map_at_aligned(struct xe_device *xe,
> > >  	int err;
> > >  	u64 start = offset == ~0ull ? 0 : offset;
> > >  	u64 end = offset == ~0ull ? offset : start + size;
> > > +	struct drm_exec *exec = vm ? xe_vm_validation_exec(vm) : XE_VALIDATION_UNIMPLEMENTED;
> > >  
> > >  	if (flags & XE_BO_FLAG_STOLEN &&
> > >  	    xe_ttm_stolen_cpu_access_needs_ggtt(xe))
> > > @@ -2173,11 +2241,11 @@ struct xe_bo *xe_bo_create_pin_map_at_aligned(struct xe_device *xe,
> > >  
> > >  	bo = xe_bo_create_locked_range(xe, tile, vm, size, start, end, type,
> > >  				       flags | XE_BO_FLAG_NEEDS_CPU_ACCESS | XE_BO_FLAG_PINNED,
> > > -				       alignment);
> > > +				       alignment, exec);
> > >  	if (IS_ERR(bo))
> > >  		return bo;
> > >  
> > > -	err = xe_bo_pin(bo);
> > > +	err = xe_bo_pin(bo, exec);
> > >  	if (err)
> > >  		goto err_put;
> > >  
> > > @@ -2299,6 +2367,7 @@ uint64_t vram_region_gpu_offset(struct ttm_resource *res)
> > >  /**
> > >   * xe_bo_pin_external - pin an external BO
> > >   * @bo: buffer object to be pinned
> > > + * @exec: The drm_exec transaction to use for exhaustive eviction.
> > >   *
> > >   * Pin an external (not tied to a VM, can be exported via dma-buf / prime FD)
> > >   * BO. Unique call compared to xe_bo_pin as this function has it own set of
> > > @@ -2306,7 +2375,7 @@ uint64_t vram_region_gpu_offset(struct ttm_resource *res)
> > >   *
> > >   * Returns 0 for success, negative error code otherwise.
> > >   */
> > > -int xe_bo_pin_external(struct xe_bo *bo)
> > > +int xe_bo_pin_external(struct xe_bo *bo, struct drm_exec *exec)
> > >  {
> > >  	struct xe_device *xe = xe_bo_device(bo);
> > >  	int err;
> > > @@ -2315,7 +2384,7 @@ int xe_bo_pin_external(struct xe_bo *bo)
> > >  	xe_assert(xe, xe_bo_is_user(bo));
> > >  
> > >  	if (!xe_bo_is_pinned(bo)) {
> > > -		err = xe_bo_validate(bo, NULL, false);
> > > +		err = xe_bo_validate(bo, NULL, false, exec);
> > >  		if (err)
> > >  			return err;
> > >  
> > > @@ -2337,7 +2406,17 @@ int xe_bo_pin_external(struct xe_bo *bo)
> > >  	return 0;
> > >  }
> > >  
> > > -int xe_bo_pin(struct xe_bo *bo)
> > > +/**
> > > + * xe_bo_pin() - Pin a kernel bo after potentially migrating it
> > > + * @bo: The kernel bo to pin.
> > > + * @exec: The drm_exec transaction to use for exhaustive eviction.
> > > + *
> > > + * Attempts to migrate a bo to @bo->placement. If that succeeds,
> > > + * pins the bo.
> > > + *
> > > + * Return: %0 on success, negative error code on migration failure.
> > > + */
> > > +int xe_bo_pin(struct xe_bo *bo, struct drm_exec *exec)
> > >  {
> > >  	struct ttm_place *place = &bo->placements[0];
> > >  	struct xe_device *xe = xe_bo_device(bo);
> > > @@ -2359,7 +2438,7 @@ int xe_bo_pin(struct xe_bo *bo)
> > >  	/* We only expect at most 1 pin */
> > >  	xe_assert(xe, !xe_bo_is_pinned(bo));
> > >  
> > > -	err = xe_bo_validate(bo, NULL, false);
> > > +	err = xe_bo_validate(bo, NULL, false, exec);
> > >  	if (err)
> > >  		return err;
> > >  
> > > @@ -2452,6 +2531,7 @@ void xe_bo_unpin(struct xe_bo *bo)
> > >   *      NULL. Used together with @allow_res_evict.
> > >   * @allow_res_evict: Whether it's allowed to evict bos sharing @vm's
> > >   *                   reservation object.
> > > + * @exec: The drm_exec transaction to use for exhaustive eviction.
> > >   *
> > >   * Make sure the bo is in allowed placement, migrating it if necessary. If
> > >   * needed, other bos will be evicted. If bos selected for eviction shares
> > > @@ -2461,7 +2541,8 @@ void xe_bo_unpin(struct xe_bo *bo)
> > >   * Return: 0 on success, negative error code on failure. May return
> > >   * -EINTR or -ERESTARTSYS if internal waits are interrupted by a signal.
> > >   */
> > > -int xe_bo_validate(struct xe_bo *bo, struct xe_vm *vm, bool allow_res_evict)
> > > +int xe_bo_validate(struct xe_bo *bo, struct xe_vm *vm, bool allow_res_evict,
> > > +		   struct drm_exec *exec)
> > >  {
> > >  	struct ttm_operation_ctx ctx = {
> > >  		.interruptible = true,
> > > @@ -2480,6 +2561,7 @@ int xe_bo_validate(struct xe_bo *bo, struct xe_vm *vm, bool allow_res_evict)
> > >  
> > >  	xe_vm_set_validating(vm, allow_res_evict);
> > >  	trace_xe_bo_validate(bo);
> > > +	xe_validation_assert_exec(xe_bo_device(bo), exec, &bo->ttm.base);
> > >  	ret = ttm_bo_validate(&bo->ttm, &bo->placement, &ctx);
> > >  	xe_vm_clear_validating(vm, allow_res_evict);
> > >  
> > > @@ -2917,6 +2999,7 @@ static void xe_place_from_ttm_type(u32 mem_type, struct ttm_place *place)
> > >   * xe_bo_migrate - Migrate an object to the desired region id
> > >   * @bo: The buffer object to migrate.
> > >   * @mem_type: The TTM region type to migrate to.
> > > + * @exec: The drm_exec transaction to use for exhaustive eviction.
> > >   *
> > >   * Attempt to migrate the buffer object to the desired memory region. The
> > >   * buffer object may not be pinned, and must be locked.
> > > @@ -2928,7 +3011,7 @@ static void xe_place_from_ttm_type(u32 mem_type, struct ttm_place *place)
> > >   * Return: 0 on success. Negative error code on failure. In particular may
> > >   * return -EINTR or -ERESTARTSYS if signal pending.
> > >   */
> > > -int xe_bo_migrate(struct xe_bo *bo, u32 mem_type)
> > > +int xe_bo_migrate(struct xe_bo *bo, u32 mem_type, struct drm_exec *exec)
> > >  {
> > >  	struct xe_device *xe = ttm_to_xe_device(bo->ttm.bdev);
> > >  	struct ttm_operation_ctx ctx = {
> > > @@ -2966,19 +3049,21 @@ int xe_bo_migrate(struct xe_bo *bo, u32 mem_type)
> > >  		add_vram(xe, bo, &requested, bo->flags, mem_type, &c);
> > >  	}
> > >  
> > > +	xe_validation_assert_exec(xe_bo_device(bo), exec, &bo->ttm.base);
> > >  	return ttm_bo_validate(&bo->ttm, &placement, &ctx);
> > >  }
> > >  
> > >  /**
> > >   * xe_bo_evict - Evict an object to evict placement
> > >   * @bo: The buffer object to migrate.
> > > + * @exec: The drm_exec transaction to use for exhaustive eviction.
> > >   *
> > >   * On successful completion, the object memory will be moved to evict
> > >   * placement. This function blocks until the object has been fully moved.
> > >   *
> > >   * Return: 0 on success. Negative error code on failure.
> > >   */
> > > -int xe_bo_evict(struct xe_bo *bo)
> > > +int xe_bo_evict(struct xe_bo *bo, struct drm_exec *exec)
> > >  {
> > >  	struct ttm_operation_ctx ctx = {
> > >  		.interruptible = false,
> > > diff --git a/drivers/gpu/drm/xe/xe_bo.h b/drivers/gpu/drm/xe/xe_bo.h
> > > index 8cce413b5235..b1b6cb622d71 100644
> > > --- a/drivers/gpu/drm/xe/xe_bo.h
> > > +++ b/drivers/gpu/drm/xe/xe_bo.h
> > > @@ -10,6 +10,7 @@
> > >  
> > >  #include "xe_bo_types.h"
> > >  #include "xe_macros.h"
> > > +#include "xe_validation.h"
> > >  #include "xe_vm_types.h"
> > >  #include "xe_vm.h"
> > >  #include "xe_vram_types.h"
> > > @@ -92,15 +93,17 @@ struct xe_bo *___xe_bo_create_locked(struct xe_device *xe, struct xe_bo *bo,
> > >  				     struct xe_tile *tile, struct dma_resv *resv,
> > >  				     struct ttm_lru_bulk_move *bulk, size_t size,
> > >  				     u16 cpu_caching, enum ttm_bo_type type,
> > > -				     u32 flags);
> > > +				     u32 flags, struct drm_exec *exec);
> > >  struct xe_bo *
> > >  xe_bo_create_locked_range(struct xe_device *xe,
> > >  			  struct xe_tile *tile, struct xe_vm *vm,
> > >  			  size_t size, u64 start, u64 end,
> > > -			  enum ttm_bo_type type, u32 flags, u64 alignment);
> > > +			  enum ttm_bo_type type, u32 flags, u64 alignment,
> > > +			  struct drm_exec *exec);
> > >  struct xe_bo *xe_bo_create_locked(struct xe_device *xe, struct xe_tile *tile,
> > >  				  struct xe_vm *vm, size_t size,
> > > -				  enum ttm_bo_type type, u32 flags);
> > > +				  enum ttm_bo_type type, u32 flags,
> > > +				  struct drm_exec *exec);
> > >  struct xe_bo *xe_bo_create(struct xe_device *xe, struct xe_tile *tile,
> > >  			   struct xe_vm *vm, size_t size,
> > >  			   enum ttm_bo_type type, u32 flags);
> > > @@ -200,11 +203,12 @@ static inline void xe_bo_unlock_vm_held(struct xe_bo *bo)
> > >  	}
> > >  }
> > >  
> > > -int xe_bo_pin_external(struct xe_bo *bo);
> > > -int xe_bo_pin(struct xe_bo *bo);
> > > +int xe_bo_pin_external(struct xe_bo *bo, struct drm_exec *exec);
> > > +int xe_bo_pin(struct xe_bo *bo, struct drm_exec *exec);
> > >  void xe_bo_unpin_external(struct xe_bo *bo);
> > >  void xe_bo_unpin(struct xe_bo *bo);
> > > -int xe_bo_validate(struct xe_bo *bo, struct xe_vm *vm, bool allow_res_evict);
> > > +int xe_bo_validate(struct xe_bo *bo, struct xe_vm *vm, bool allow_res_evict,
> > > +		   struct drm_exec *exec);
> > >  
> > >  static inline bool xe_bo_is_pinned(struct xe_bo *bo)
> > >  {
> > > @@ -285,8 +289,8 @@ uint64_t vram_region_gpu_offset(struct ttm_resource *res);
> > >  
> > >  bool xe_bo_can_migrate(struct xe_bo *bo, u32 mem_type);
> > >  
> > > -int xe_bo_migrate(struct xe_bo *bo, u32 mem_type);
> > > -int xe_bo_evict(struct xe_bo *bo);
> > > +int xe_bo_migrate(struct xe_bo *bo, u32 mem_type, struct drm_exec *exec);
> > > +int xe_bo_evict(struct xe_bo *bo, struct drm_exec *exec);
> > >  
> > >  int xe_bo_evict_pinned(struct xe_bo *bo);
> > >  int xe_bo_notifier_prepare_pinned(struct xe_bo *bo);
> > > diff --git a/drivers/gpu/drm/xe/xe_dma_buf.c b/drivers/gpu/drm/xe/xe_dma_buf.c
> > > index 346f857f3837..78a827d4e726 100644
> > > --- a/drivers/gpu/drm/xe/xe_dma_buf.c
> > > +++ b/drivers/gpu/drm/xe/xe_dma_buf.c
> > > @@ -51,6 +51,7 @@ static int xe_dma_buf_pin(struct dma_buf_attachment *attach)
> > >  	struct drm_gem_object *obj = attach->dmabuf->priv;
> > >  	struct xe_bo *bo = gem_to_xe_bo(obj);
> > >  	struct xe_device *xe = xe_bo_device(bo);
> > > +	struct drm_exec *exec = XE_VALIDATION_UNSUPPORTED;
> > >  	int ret;
> > >  
> > >  	/*
> > > @@ -63,7 +64,7 @@ static int xe_dma_buf_pin(struct dma_buf_attachment *attach)
> > >  		return -EINVAL;
> > >  	}
> > >  
> > > -	ret = xe_bo_migrate(bo, XE_PL_TT);
> > > +	ret = xe_bo_migrate(bo, XE_PL_TT, exec);
> > >  	if (ret) {
> > >  		if (ret != -EINTR && ret != -ERESTARTSYS)
> > >  			drm_dbg(&xe->drm,
> > > @@ -72,7 +73,7 @@ static int xe_dma_buf_pin(struct dma_buf_attachment *attach)
> > >  		return ret;
> > >  	}
> > >  
> > > -	ret = xe_bo_pin_external(bo);
> > > +	ret = xe_bo_pin_external(bo, exec);
> > >  	xe_assert(xe, !ret);
> > >  
> > >  	return 0;
> > > @@ -92,6 +93,7 @@ static struct sg_table *xe_dma_buf_map(struct dma_buf_attachment *attach,
> > >  	struct dma_buf *dma_buf = attach->dmabuf;
> > >  	struct drm_gem_object *obj = dma_buf->priv;
> > >  	struct xe_bo *bo = gem_to_xe_bo(obj);
> > > +	struct drm_exec *exec = XE_VALIDATION_UNSUPPORTED;
> > >  	struct sg_table *sgt;
> > >  	int r = 0;
> > >  
> > > @@ -100,9 +102,9 @@ static struct sg_table *xe_dma_buf_map(struct dma_buf_attachment *attach,
> > >  
> > >  	if (!xe_bo_is_pinned(bo)) {
> > >  		if (!attach->peer2peer)
> > > -			r = xe_bo_migrate(bo, XE_PL_TT);
> > > +			r = xe_bo_migrate(bo, XE_PL_TT, exec);
> > >  		else
> > > -			r = xe_bo_validate(bo, NULL, false);
> > > +			r = xe_bo_validate(bo, NULL, false, exec);
> > >  		if (r)
> > >  			return ERR_PTR(r);
> > >  	}
> > > @@ -161,13 +163,14 @@ static int xe_dma_buf_begin_cpu_access(struct dma_buf *dma_buf,
> > >  	struct xe_bo *bo = gem_to_xe_bo(obj);
> > >  	bool reads =  (direction == DMA_BIDIRECTIONAL ||
> > >  		       direction == DMA_FROM_DEVICE);
> > > +	struct drm_exec *exec = XE_VALIDATION_UNIMPLEMENTED;
> > >  
> > >  	if (!reads)
> > >  		return 0;
> > >  
> > >  	/* Can we do interruptible lock here? */
> > >  	xe_bo_lock(bo, false);
> > > -	(void)xe_bo_migrate(bo, XE_PL_TT);
> > > +	(void)xe_bo_migrate(bo, XE_PL_TT, exec);
> > >  	xe_bo_unlock(bo);
> > >  
> > >  	return 0;
> > > @@ -208,13 +211,14 @@ xe_dma_buf_init_obj(struct drm_device *dev, struct xe_bo *storage,
> > >  {
> > >  	struct dma_resv *resv = dma_buf->resv;
> > >  	struct xe_device *xe = to_xe_device(dev);
> > > +	struct drm_exec *exec = XE_VALIDATION_UNIMPLEMENTED;
> > >  	struct xe_bo *bo;
> > >  	int ret;
> > >  
> > >  	dma_resv_lock(resv, NULL);
> > >  	bo = ___xe_bo_create_locked(xe, storage, NULL, resv, NULL, dma_buf->size,
> > >  				    0, /* Will require 1way or 2way for vm_bind */
> > > -				    ttm_bo_type_sg, XE_BO_FLAG_SYSTEM);
> > > +				    ttm_bo_type_sg, XE_BO_FLAG_SYSTEM, exec);
> > >  	if (IS_ERR(bo)) {
> > >  		ret = PTR_ERR(bo);
> > >  		goto error;
> > > @@ -232,8 +236,9 @@ static void xe_dma_buf_move_notify(struct dma_buf_attachment *attach)
> > >  {
> > >  	struct drm_gem_object *obj = attach->importer_priv;
> > >  	struct xe_bo *bo = gem_to_xe_bo(obj);
> > > +	struct drm_exec *exec = XE_VALIDATION_UNSUPPORTED;
> > >  
> > > -	XE_WARN_ON(xe_bo_evict(bo));
> > > +	XE_WARN_ON(xe_bo_evict(bo, exec));
> > >  }
> > >  
> > >  static const struct dma_buf_attach_ops xe_dma_buf_attach_ops = {
> > > diff --git a/drivers/gpu/drm/xe/xe_exec.c b/drivers/gpu/drm/xe/xe_exec.c
> > > index 44364c042ad7..0bcb4fb9a10e 100644
> > > --- a/drivers/gpu/drm/xe/xe_exec.c
> > > +++ b/drivers/gpu/drm/xe/xe_exec.c
> > > @@ -97,9 +97,13 @@
> > >  static int xe_exec_fn(struct drm_gpuvm_exec *vm_exec)
> > >  {
> > >  	struct xe_vm *vm = container_of(vm_exec->vm, struct xe_vm, gpuvm);
> > > +	int ret;
> > >  
> > >  	/* The fence slot added here is intended for the exec sched job. */
> > > -	return xe_vm_validate_rebind(vm, &vm_exec->exec, 1);
> > > +	xe_vm_set_validation_exec(vm, &vm_exec->exec);
> > > +	ret = xe_vm_validate_rebind(vm, &vm_exec->exec, 1);
> > > +	xe_vm_set_validation_exec(vm, NULL);
> > > +	return ret;
> > >  }
> > >  
> > >  int xe_exec_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
> > > diff --git a/drivers/gpu/drm/xe/xe_ggtt.c b/drivers/gpu/drm/xe/xe_ggtt.c
> > > index e03222f5ac5a..a47c0131956b 100644
> > > --- a/drivers/gpu/drm/xe/xe_ggtt.c
> > > +++ b/drivers/gpu/drm/xe/xe_ggtt.c
> > > @@ -731,7 +731,7 @@ void xe_ggtt_map_bo_unlocked(struct xe_ggtt *ggtt, struct xe_bo *bo)
> > >  }
> > >  
> > >  static int __xe_ggtt_insert_bo_at(struct xe_ggtt *ggtt, struct xe_bo *bo,
> > > -				  u64 start, u64 end)
> > > +				  u64 start, u64 end, struct drm_exec *exec)
> > >  {
> > >  	u64 alignment = bo->min_align > 0 ? bo->min_align : XE_PAGE_SIZE;
> > >  	u8 tile_id = ggtt->tile->id;
> > > @@ -746,7 +746,7 @@ static int __xe_ggtt_insert_bo_at(struct xe_ggtt *ggtt, struct xe_bo *bo,
> > >  		return 0;
> > >  	}
> > >  
> > > -	err = xe_bo_validate(bo, NULL, false);
> > > +	err = xe_bo_validate(bo, NULL, false, exec);
> > >  	if (err)
> > >  		return err;
> > >  
> > > @@ -788,25 +788,28 @@ static int __xe_ggtt_insert_bo_at(struct xe_ggtt *ggtt, struct xe_bo *bo,
> > >   * @bo: the &xe_bo to be inserted
> > >   * @start: address where it will be inserted
> > >   * @end: end of the range where it will be inserted
> > > + * @exec: The drm_exec transaction to use for exhaustive eviction.
> > >   *
> > >   * Return: 0 on success or a negative error code on failure.
> > >   */
> > >  int xe_ggtt_insert_bo_at(struct xe_ggtt *ggtt, struct xe_bo *bo,
> > > -			 u64 start, u64 end)
> > > +			 u64 start, u64 end, struct drm_exec *exec)
> > >  {
> > > -	return __xe_ggtt_insert_bo_at(ggtt, bo, start, end);
> > > +	return __xe_ggtt_insert_bo_at(ggtt, bo, start, end, exec);
> > >  }
> > >  
> > >  /**
> > >   * xe_ggtt_insert_bo - Insert BO into GGTT
> > >   * @ggtt: the &xe_ggtt where bo will be inserted
> > >   * @bo: the &xe_bo to be inserted
> > > + * @exec: The drm_exec transaction to use for exhaustive eviction.
> > >   *
> > >   * Return: 0 on success or a negative error code on failure.
> > >   */
> > > -int xe_ggtt_insert_bo(struct xe_ggtt *ggtt, struct xe_bo *bo)
> > > +int xe_ggtt_insert_bo(struct xe_ggtt *ggtt, struct xe_bo *bo,
> > > +		      struct drm_exec *exec)
> > >  {
> > > -	return __xe_ggtt_insert_bo_at(ggtt, bo, 0, U64_MAX);
> > > +	return __xe_ggtt_insert_bo_at(ggtt, bo, 0, U64_MAX, exec);
> > >  }
> > >  
> > >  /**
> > > diff --git a/drivers/gpu/drm/xe/xe_ggtt.h b/drivers/gpu/drm/xe/xe_ggtt.h
> > > index fbe1e397d05d..75fc7a1efea7 100644
> > > --- a/drivers/gpu/drm/xe/xe_ggtt.h
> > > +++ b/drivers/gpu/drm/xe/xe_ggtt.h
> > > @@ -10,6 +10,7 @@
> > >  
> > >  struct drm_printer;
> > >  struct xe_tile;
> > > +struct drm_exec;
> > >  
> > >  struct xe_ggtt *xe_ggtt_alloc(struct xe_tile *tile);
> > >  int xe_ggtt_init_early(struct xe_ggtt *ggtt);
> > > @@ -31,9 +32,9 @@ bool xe_ggtt_node_allocated(const struct xe_ggtt_node *node);
> > >  void xe_ggtt_map_bo(struct xe_ggtt *ggtt, struct xe_ggtt_node *node,
> > >  		    struct xe_bo *bo, u16 pat_index);
> > >  void xe_ggtt_map_bo_unlocked(struct xe_ggtt *ggtt, struct xe_bo *bo);
> > > -int xe_ggtt_insert_bo(struct xe_ggtt *ggtt, struct xe_bo *bo);
> > > +int xe_ggtt_insert_bo(struct xe_ggtt *ggtt, struct xe_bo *bo, struct drm_exec *exec);
> > >  int xe_ggtt_insert_bo_at(struct xe_ggtt *ggtt, struct xe_bo *bo,
> > > -			 u64 start, u64 end);
> > > +			 u64 start, u64 end, struct drm_exec *exec);
> > >  void xe_ggtt_remove_bo(struct xe_ggtt *ggtt, struct xe_bo *bo);
> > >  u64 xe_ggtt_largest_hole(struct xe_ggtt *ggtt, u64 alignment, u64 *spare);
> > >  
> > > diff --git a/drivers/gpu/drm/xe/xe_gt_pagefault.c b/drivers/gpu/drm/xe/xe_gt_pagefault.c
> > > index ab43dec52776..2c7f10cc423f 100644
> > > --- a/drivers/gpu/drm/xe/xe_gt_pagefault.c
> > > +++ b/drivers/gpu/drm/xe/xe_gt_pagefault.c
> > > @@ -94,12 +94,12 @@ static int xe_pf_begin(struct drm_exec *exec, struct xe_vma *vma,
> > >  		}
> > >  
> > >  		/* Migrate to VRAM, move should invalidate the VMA first */
> > > -		err = xe_bo_migrate(bo, vram->placement);
> > > +		err = xe_bo_migrate(bo, vram->placement, exec);
> > >  		if (err)
> > >  			return err;
> > >  	} else if (bo) {
> > >  		/* Create backing store if needed */
> > > -		err = xe_bo_validate(bo, vm, true);
> > > +		err = xe_bo_validate(bo, vm, true, exec);
> > >  		if (err)
> > >  			return err;
> > >  	}
> > > diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c
> > > index c8f0320d032f..906011671b60 100644
> > > --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c
> > > +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c
> > > @@ -1452,6 +1452,7 @@ static bool pf_release_vf_config_lmem(struct xe_gt *gt, struct xe_gt_sriov_confi
> > >  static int pf_provision_vf_lmem(struct xe_gt *gt, unsigned int vfid, u64 size)
> > >  {
> > >  	struct xe_gt_sriov_config *config = pf_pick_vf_config(gt, vfid);
> > > +	struct drm_exec *exec = XE_VALIDATION_UNIMPLEMENTED;
> > >  	struct xe_device *xe = gt_to_xe(gt);
> > >  	struct xe_tile *tile = gt_to_tile(gt);
> > >  	struct xe_bo *bo;
> > > @@ -1484,11 +1485,12 @@ static int pf_provision_vf_lmem(struct xe_gt *gt, unsigned int vfid, u64 size)
> > >  				 XE_BO_FLAG_VRAM_IF_DGFX(tile) |
> > >  				 XE_BO_FLAG_NEEDS_2M |
> > >  				 XE_BO_FLAG_PINNED |
> > > -				 XE_BO_FLAG_PINNED_LATE_RESTORE);
> > > +				 XE_BO_FLAG_PINNED_LATE_RESTORE,
> > > +				 exec);
> > >  	if (IS_ERR(bo))
> > >  		return PTR_ERR(bo);
> > >  
> > > -	err = xe_bo_pin(bo);
> > > +	err = xe_bo_pin(bo, exec);
> > >  	xe_bo_unlock(bo);
> > >  	if (unlikely(err)) {
> > >  		xe_bo_put(bo);
> > > diff --git a/drivers/gpu/drm/xe/xe_svm.c b/drivers/gpu/drm/xe/xe_svm.c
> > > index e35c6d4def20..39e3aa6df25a 100644
> > > --- a/drivers/gpu/drm/xe/xe_svm.c
> > > +++ b/drivers/gpu/drm/xe/xe_svm.c
> > > @@ -700,6 +700,7 @@ static int xe_drm_pagemap_populate_mm(struct drm_pagemap *dpagemap,
> > >  	struct device *dev = xe->drm.dev;
> > >  	struct drm_buddy_block *block;
> > >  	struct list_head *blocks;
> > > +	struct drm_exec *exec;
> > >  	struct xe_bo *bo;
> > >  	ktime_t time_end = 0;
> > >  	int err, idx;
> > > @@ -708,12 +709,13 @@ static int xe_drm_pagemap_populate_mm(struct drm_pagemap *dpagemap,
> > >  		return -ENODEV;
> > >  
> > >  	xe_pm_runtime_get(xe);
> > > +	exec = XE_VALIDATION_UNIMPLEMENTED;
> > >  
> > >   retry:
> > >  	bo = xe_bo_create_locked(vr->xe, NULL, NULL, end - start,
> > >  				 ttm_bo_type_device,
> > >  				 (IS_DGFX(xe) ? XE_BO_FLAG_VRAM(vr) : XE_BO_FLAG_SYSTEM) |
> > > -				 XE_BO_FLAG_CPU_ADDR_MIRROR);
> > > +				 XE_BO_FLAG_CPU_ADDR_MIRROR, exec);
> > >  	if (IS_ERR(bo)) {
> > >  		err = PTR_ERR(bo);
> > >  		if (xe_vm_validate_should_retry(NULL, err, &time_end))
> > > diff --git a/drivers/gpu/drm/xe/xe_validation.c b/drivers/gpu/drm/xe/xe_validation.c
> > > new file mode 100644
> > > index 000000000000..cc0684d24e02
> > > --- /dev/null
> > > +++ b/drivers/gpu/drm/xe/xe_validation.c
> > > @@ -0,0 +1,49 @@
> > > +// SPDX-License-Identifier: MIT
> > > +/*
> > > + * Copyright © 2024 Intel Corporation
> > > + */
> > > +#include "xe_bo.h"
> > > +#include <drm/drm_exec.h>
> > > +#include <drm/drm_gem.h>
> > > +
> > > +#include "xe_assert.h"
> > > +#include "xe_validation.h"
> > > +
> > > +#ifdef CONFIG_DRM_XE_DEBUG
> > > +/**
> > > + * xe_validation_assert_exec() - Assert that the drm_exec pointer is suitable
> > > + * for validation.
> > > + * @xe: Pointer to the xe device.
> > > + * @exec: The drm_exec pointer to check.
> > > + * @obj: Pointer to the object subject to validation.
> > > + *
> > > + * NULL exec pointers are not allowed.
> > > + * For XE_VALIDATION_UNIMPLEMENTED, no checking.
> > > + * For XE_VLIDATION_OPT_OUT, check that the caller is a kunit test
> > > + * For XE_VALIDATION_UNSUPPORTED, check that the object subject to
> > > + * validation is a dma-buf, for which support for ww locking is
> > > + * not in place in the dma-buf layer.
> > > + */
> > > +void xe_validation_assert_exec(const struct xe_device *xe,
> > > +			       const struct drm_exec *exec,
> > > +			       const struct drm_gem_object *obj)
> > > +{
> > > +	xe_assert(xe, exec);
> > > +	if (IS_ERR(exec)) {
> > > +		switch (PTR_ERR(exec)) {
> > > +		case __XE_VAL_UNIMPLEMENTED:
> > > +			break;
> > > +		case __XE_VAL_UNSUPPORTED:
> > > +			xe_assert(xe, !!obj->dma_buf);
> > > +			break;
> > > +#if IS_ENABLED(CONFIG_KUNIT)
> > > +		case __XE_VAL_OPT_OUT:
> > > +			xe_assert(xe, current->kunit_test);
> > > +			break;
> > > +#endif
> > > +		default:
> > > +			xe_assert(xe, false);
> > > +		}
> > > +	}
> > > +}
> > > +#endif
> > > diff --git a/drivers/gpu/drm/xe/xe_validation.h b/drivers/gpu/drm/xe/xe_validation.h
> > > new file mode 100644
> > > index 000000000000..db50feacad7a
> > > --- /dev/null
> > > +++ b/drivers/gpu/drm/xe/xe_validation.h
> > > @@ -0,0 +1,69 @@
> > > +/* SPDX-License-Identifier: MIT */
> > > +/*
> > > + * Copyright © 2024 Intel Corporation
> > > + */
> > > +#ifndef _XE_VALIDATION_H_
> > > +#define _XE_VALIDATION_H_
> > > +
> > > +#include <linux/dma-resv.h>
> > > +#include <linux/types.h>
> > > +
> > > +struct drm_exec;
> > > +struct drm_gem_object;
> > > +struct xe_device;
> > > +
> > > +#ifdef CONFIG_PROVE_LOCKING
> > > +/**
> > > + * xe_validation_lockdep() - Assert that a drm_exec locking transaction can
> > > + * be initialized at this point.
> > > + */
> > > +static inline void xe_validation_lockdep(void)
> > > +{
> > > +	struct ww_acquire_ctx ticket;
> > > +
> > > +	ww_acquire_init(&ticket, &reservation_ww_class);
> > > +	ww_acquire_fini(&ticket);
> > > +}
> > > +#else
> > > +static inline void xe_validation_lockdep(void)
> > > +{
> > > +}
> > > +#endif
> > > +
> > > +/*
> > > + * Various values of the drm_exec pointer where we've not (yet)
> > > + * implemented full ww locking.
> > > + *
> > > + * XE_VALIDATION_UNIMPLEMENTED means implementation is pending.
> > > + * A lockdep check is made to assure that a drm_exec locking
> > > + * transaction can actually take place where the macro is
> > > + * used. If this asserts, the exec pointer needs to be assigned
> > > + * higher up in the callchain and passed down.
> > > + *
> > > + * XE_VALIDATION_UNSUPPORTED is for dma-buf code only where
> > > + * the dma-buf layer doesn't support WW locking.
> > > + *
> > > + * XE_VALIDATION_OPT_OUT is for simplification of kunit tests where
> > > + * exhaustive eviction isn't necessary.
> > > + */
> > > +#define __XE_VAL_UNIMPLEMENTED -EINVAL
> > > +#define XE_VALIDATION_UNIMPLEMENTED (xe_validation_lockdep(),		\
> > > +				     (struct drm_exec *)ERR_PTR(__XE_VAL_UNIMPLEMENTED))
> > > +
> > > +#define __XE_VAL_UNSUPPORTED -EOPNOTSUPP
> > > +#define XE_VALIDATION_UNSUPPORTED ((struct drm_exec *)ERR_PTR(__XE_VAL_UNSUPPORTED))
> > > +
> > > +#define __XE_VAL_OPT_OUT -ENOMEM
> > > +#define XE_VALIDATION_OPT_OUT (xe_validation_lockdep(), \
> > > +			       (struct drm_exec *)ERR_PTR(__XE_VAL_OPT_OUT))
> > > +#ifdef CONFIG_DRM_XE_DEBUG
> > > +void xe_validation_assert_exec(const struct xe_device *xe, const struct drm_exec *exec,
> > > +			       const struct drm_gem_object *obj);
> > > +#else
> > > +#define xe_validation_assert_exec(_xe, _exec, _obj)	\
> > > +	do {						\
> > > +		(void)_xe; (void)_exec; (void)_obj;	\
> > > +	} while (0)
> > > +#endif
> > > +
> > > +#endif
> > > diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
> > > index 12e661960244..600aaadb4bee 100644
> > > --- a/drivers/gpu/drm/xe/xe_vm.c
> > > +++ b/drivers/gpu/drm/xe/xe_vm.c
> > > @@ -393,7 +393,7 @@ static int xe_gpuvm_validate(struct drm_gpuvm_bo *vm_bo, struct drm_exec *exec)
> > >  		list_move_tail(&gpuva_to_vma(gpuva)->combined_links.rebind,
> > >  			       &vm->rebind_list);
> > >  
> > > -	ret = xe_bo_validate(gem_to_xe_bo(vm_bo->obj), vm, false);
> > > +	ret = xe_bo_validate(gem_to_xe_bo(vm_bo->obj), vm, false, exec);
> > >  	if (ret)
> > >  		return ret;
> > >  
> > > @@ -451,6 +451,7 @@ static int xe_preempt_work_begin(struct drm_exec *exec, struct xe_vm *vm,
> > >  	if (err)
> > >  		return err;
> > >  
> > > +	xe_vm_set_validation_exec(vm, exec);
> > >  	if (xe_vm_is_idle(vm)) {
> > >  		vm->preempt.rebind_deactivated = true;
> > >  		*done = true;
> > > @@ -516,6 +517,7 @@ static void preempt_rebind_work_func(struct work_struct *w)
> > >  		err = xe_preempt_work_begin(&exec, vm, &done);
> > >  		drm_exec_retry_on_contention(&exec);
> > >  		if (err || done) {
> > > +			xe_vm_set_validation_exec(vm, NULL);
> > >  			drm_exec_fini(&exec);
> > >  			if (err && xe_vm_validate_should_retry(&exec, err, &end))
> > >  				err = -EAGAIN;
> > > @@ -565,6 +567,7 @@ static void preempt_rebind_work_func(struct work_struct *w)
> > >  	up_read(&vm->userptr.notifier_lock);
> > >  
> > >  out_unlock:
> > > +	xe_vm_set_validation_exec(vm, NULL);
> > >  	drm_exec_fini(&exec);
> > >  out_unlock_outer:
> > >  	if (err == -EAGAIN) {
> > > @@ -1375,6 +1378,8 @@ int xe_vm_lock_vma(struct drm_exec *exec, struct xe_vma *vma)
> > >  	err = drm_exec_lock_obj(exec, xe_vm_obj(vm));
> > >  	if (!err && bo && !bo->vm)
> > >  		err = drm_exec_lock_obj(exec, &bo->ttm.base);
> > > +	if (!err)
> > > +		xe_vm_set_validation_exec(vm, exec);
> > 
> > 
> > Do you have imbalance here? I see this function called in xe_pf_begin
> > and xe_vma_destroy_unlocked but I don't see
> > xe_vm_set_validation_exec(vm, NULL) called.
> > 
> > 
> > >  
> > >  	return err;
> > >  }
> > > @@ -2889,7 +2894,7 @@ static int vma_lock_and_validate(struct drm_exec *exec, struct xe_vma *vma,
> > >  			err = drm_exec_lock_obj(exec, &bo->ttm.base);
> > >  		if (!err && validate)
> > >  			err = xe_bo_validate(bo, vm,
> > > -					     !xe_vm_in_preempt_fence_mode(vm));
> > > +					     !xe_vm_in_preempt_fence_mode(vm), exec);
> > >  	}
> > >  
> > >  	return err;
> > > @@ -3012,7 +3017,8 @@ static int op_lock_and_prep(struct drm_exec *exec, struct xe_vm *vm,
> > >  					    false);
> > >  		if (!err && !xe_vma_has_no_bo(vma))
> > >  			err = xe_bo_migrate(xe_vma_bo(vma),
> > > -					    region_to_mem_type[region]);
> > > +					    region_to_mem_type[region],
> > > +					    exec);
> > >  		break;
> > >  	}
> > >  	default:
> > > @@ -3052,6 +3058,7 @@ static int vm_bind_ioctl_ops_lock_and_prep(struct drm_exec *exec,
> > >  	if (err)
> > >  		return err;
> > >  
> > > +	xe_vm_set_validation_exec(vm, exec);
> > >  	list_for_each_entry(op, &vops->list, link) {
> > >  		err = op_lock_and_prep(exec, vm, op);
> > >  		if (err)
> > > @@ -3850,10 +3857,18 @@ struct dma_fence *xe_vm_bind_kernel_bo(struct xe_vm *vm, struct xe_bo *bo,
> > >   */
> > >  int xe_vm_lock(struct xe_vm *vm, bool intr)
> > >  {
> > > +	struct drm_exec *exec = XE_VALIDATION_UNIMPLEMENTED;
> > > +	int ret;
> > > +
> > >  	if (intr)
> > > -		return dma_resv_lock_interruptible(xe_vm_resv(vm), NULL);
> > > +		ret = dma_resv_lock_interruptible(xe_vm_resv(vm), NULL);
> > > +	else
> > > +		ret = dma_resv_lock(xe_vm_resv(vm), NULL);
> > > +
> > > +	if (!ret)
> > > +		xe_vm_set_validation_exec(vm, exec);
> > >  
> > > -	return dma_resv_lock(xe_vm_resv(vm), NULL);
> > > +	return ret;
> > >  }
> > >  
> > >  /**
> > > @@ -3864,6 +3879,7 @@ int xe_vm_lock(struct xe_vm *vm, bool intr)
> > >   */
> > >  void xe_vm_unlock(struct xe_vm *vm)
> > >  {
> > > +	xe_vm_set_validation_exec(vm, NULL);
> > >  	dma_resv_unlock(xe_vm_resv(vm));
> > >  }
> > >  
> > > diff --git a/drivers/gpu/drm/xe/xe_vm.h b/drivers/gpu/drm/xe/xe_vm.h
> > > index 2ecb417c19a2..4ba26eed7e96 100644
> > > --- a/drivers/gpu/drm/xe/xe_vm.h
> > > +++ b/drivers/gpu/drm/xe/xe_vm.h
> > > @@ -321,7 +321,7 @@ static inline void xe_vm_set_validating(struct xe_vm *vm, bool allow_res_evict)
> > >  	if (vm && !allow_res_evict) {
> > >  		xe_vm_assert_held(vm);
> > >  		/* Pairs with READ_ONCE in xe_vm_is_validating() */
> > > -		WRITE_ONCE(vm->validating, current);
> > > +		WRITE_ONCE(vm->validation.validating, current);
> > >  	}
> > >  }
> > >  
> > > @@ -339,7 +339,7 @@ static inline void xe_vm_clear_validating(struct xe_vm *vm, bool allow_res_evict
> > >  {
> > >  	if (vm && !allow_res_evict) {
> > >  		/* Pairs with READ_ONCE in xe_vm_is_validating() */
> > > -		WRITE_ONCE(vm->validating, NULL);
> > > +		WRITE_ONCE(vm->validation.validating, NULL);
> > >  	}
> > >  }
> > >  
> > > @@ -357,13 +357,40 @@ static inline void xe_vm_clear_validating(struct xe_vm *vm, bool allow_res_evict
> > >  static inline bool xe_vm_is_validating(struct xe_vm *vm)
> > >  {
> > >  	/* Pairs with WRITE_ONCE in xe_vm_is_validating() */
> > > -	if (READ_ONCE(vm->validating) == current) {
> > > +	if (READ_ONCE(vm->validation.validating) == current) {
> > >  		xe_vm_assert_held(vm);
> > >  		return true;
> > >  	}
> > >  	return false;
> > >  }
> > >  
> > > +/**
> > > + * xe_vm_set_validation_exec() - Accessor to set the drm_exec object
> > > + * @vm: The vm we want to register a drm_exec object with.
> > > + * @exec: The exec object we want to register.
> > > + *
> > > + * Set the drm_exec object used to lock the vm's resv.
> > > + */
> > > +static inline void xe_vm_set_validation_exec(struct xe_vm *vm, struct drm_exec *exec)
> > > +{
> > > +	xe_vm_assert_held(vm);
> > > +	vm->validation._exec = exec;
> > > +}
> > > +
> > > +/**
> > > + * xe_vm_set_validation_exec() - Accessor to read the drm_exec object
> > > + * @vm: The vm we want to register a drm_exec object with.
> > > + *
> > > + * Return: The drm_exec object used to lock the vm's resv. The value
> > > + * is a valid pointer, %NULL, or one of the special values defined in
> > > + * xe_validation.h.
> > > + */
> > > +static inline struct drm_exec *xe_vm_validation_exec(struct xe_vm *vm)
> > > +{
> > > +	xe_vm_assert_held(vm);
> > > +	return vm->validation._exec;
> > > +}
> > > +
> > >  /**
> > >   * xe_vm_has_valid_gpu_mapping() - Advisory helper to check if VMA or SVM range has
> > >   * a valid GPU mapping
> > > diff --git a/drivers/gpu/drm/xe/xe_vm_types.h b/drivers/gpu/drm/xe/xe_vm_types.h
> > > index 8a07feef503b..2f88808e36bb 100644
> > > --- a/drivers/gpu/drm/xe/xe_vm_types.h
> > > +++ b/drivers/gpu/drm/xe/xe_vm_types.h
> > > @@ -312,19 +312,35 @@ struct xe_vm {
> > >  		bool capture_once;
> > >  	} error_capture;
> > >  
> > > +	/**
> > > +	 * @validation: Validation data only valid with the vm resv held.
> > > +	 * Note: This is really task state of the task holding the vm resv,
> > > +	 * and moving forward we should
> > > +	 * come up with a better way of passing this down the call-
> > > +	 * chain.
> > 
> > 
> > I've already mentioned this, attaching the _exec xe_vma_ops might be
> > good option as xe_vma_ops has lifetime of only existing for the bind
> > (i.e., it is stack variable) so you'd only need to set it (i.e., no
> > clear required).
> > 
> > I think patch largely makes sense.
> > 
> > Matt 
> > 
> > 
> > > +	 */
> > > +	struct {
> > > +		/**
> > > +		 * @validation.validating: The task that is currently making bos resident.
> > > +		 * for this vm.
> > > +		 * Protected by the VM's resv for writing. Opportunistic reading can be done
> > > +		 * using READ_ONCE. Note: This is a workaround for the
> > > +		 * TTM eviction_valuable() callback not being passed a struct
> > > +		 * ttm_operation_context(). Future work might want to address this.
> > > +		 */
> > > +		struct task_struct *validating;
> > > +		/**
> > > +		 *  @validation.exec The drm_exec context used when locking the vm resv.
> > > +		 *  Protected by the vm's resv.
> > > +		 */
> > > +		struct drm_exec *_exec;
> > > +	} validation;
> > > +
> > >  	/**
> > >  	 * @tlb_flush_seqno: Required TLB flush seqno for the next exec.
> > >  	 * protected by the vm resv.
> > >  	 */
> > >  	u64 tlb_flush_seqno;
> > > -	/**
> > > -	 * @validating: The task that is currently making bos resident for this vm.
> > > -	 * Protected by the VM's resv for writing. Opportunistic reading can be done
> > > -	 * using READ_ONCE. Note: This is a workaround for the
> > > -	 * TTM eviction_valuable() callback not being passed a struct
> > > -	 * ttm_operation_context(). Future work might want to address this.
> > > -	 */
> > > -	struct task_struct *validating;
> > >  	/** @batch_invalidate_tlb: Always invalidate TLB before batch start */
> > >  	bool batch_invalidate_tlb;
> > >  	/** @xef: XE file handle for tracking this VM's drm client */
> > > -- 
> > > 2.50.1
> > > 
> >
> 
> 
> 


More information about the Intel-xe mailing list