[PATCH] drm/xe: Reject BO eviction if BO is bound to current VM

Thomas Hellström thomas.hellstrom at linux.intel.com
Fri Jan 10 14:45:00 UTC 2025


Hi, Oak.

On Tue, 2024-12-17 at 18:13 -0500, Oak Zeng wrote:
> This is a follow-up fix for
> https://patchwork.freedesktop.org/patch/msgid/20241203021929.1919730-1-oak.zeng@intel.com
> The overall goal is to fail vm_bind when there is memory pressure. See
> the commit message of the above patch for more details. The above patch
> fixes the issue when the user passes a vm_id parameter during
> gem_create. If the user doesn't pass a vm_id during gem_create, the
> above patch doesn't help.
> 
> This patch further rejects BO eviction (which could be triggered by BO
> validation) if the BO is bound to the current VM. vm_bind could then
> fail due to the eviction failure. The BO-to-VM reverse mapping
> structure is used to determine whether the BO is bound to the VM.
> 
> Suggested-by: Thomas Hellström <thomas.hellstrom at linux.intel.com>
> Signed-off-by: Oak Zeng <oak.zeng at intel.com>
> ---
>  drivers/gpu/drm/xe/xe_bo.c | 13 +++++++++++++
>  1 file changed, 13 insertions(+)
> 
> diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c
> index 283cd02945708..abdeed1c325ea 100644
> --- a/drivers/gpu/drm/xe/xe_bo.c
> +++ b/drivers/gpu/drm/xe/xe_bo.c
> @@ -664,6 +664,7 @@ static int xe_bo_move(struct ttm_buffer_object *ttm_bo, bool evict,
>  	u32 old_mem_type = old_mem ? old_mem->mem_type : XE_PL_SYSTEM;
>  	struct ttm_tt *ttm = ttm_bo->ttm;
>  	struct xe_migrate *migrate = NULL;
> +	struct drm_gpuvm_bo *vm_bo;

Please move this declaration inside the if (evict) { block, since it is
only used there.

>  	struct dma_fence *fence;
>  	bool move_lacks_source;
>  	bool tt_has_data;
> @@ -713,6 +714,18 @@ static int xe_bo_move(struct ttm_buffer_object *ttm_bo, bool evict,
>  		goto out;
>  	}
>  

Could you add a short comment here explaining the check?

> +	if (evict) {

Perhaps if (evict && ctx->resv)? With a NULL ctx->resv the comparison
below can never match.

> +		drm_gem_for_each_gpuvm_bo(vm_bo, &bo->ttm.base) {
> +			struct xe_vm *vm = gpuvm_to_vm(vm_bo->vm);
> +
> +			if (xe_vm_resv(vm) == ctx->resv &&
> +			    xe_vm_in_preempt_fence_mode(vm)) {
> +				ret = -EBUSY;
> +				goto out;
> +			}
> +		}
> +	}
> +

Otherwise LGTM.
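
To illustrate the shape I have in mind, here is a rough userspace sketch
of the check with the three suggestions above applied (declaration moved
into the block, a short comment, and the ctx->resv test). It is only a
model, not kernel code: every mock_* type and mock_bo_move_check() are
hypothetical stand-ins for the xe/TTM structures, and the linked list
stands in for what drm_gem_for_each_gpuvm_bo() walks.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the real xe/TTM/GPUVM types. */
struct mock_resv { int unused; };

struct mock_vm {
	struct mock_resv *resv;		/* models xe_vm_resv(vm) */
	bool preempt_fence_mode;	/* models xe_vm_in_preempt_fence_mode(vm) */
};

struct mock_vm_bo {
	struct mock_vm *vm;		/* models vm_bo->vm */
	struct mock_vm_bo *next;	/* models the BO's list of VM bindings */
};

struct mock_bo {
	struct mock_vm_bo *vm_bos;	/* BO to VM reverse mapping */
};

struct mock_ctx {
	struct mock_resv *resv;		/* models ctx->resv, may be NULL */
};

/* Returns -EBUSY if the eviction should be rejected, 0 otherwise. */
static int mock_bo_move_check(const struct mock_bo *bo, bool evict,
			      const struct mock_ctx *ctx)
{
	/*
	 * Don't evict a BO that is bound to the VM we are validating for
	 * (same reservation object as ctx->resv) when that VM is in
	 * preempt-fence mode; let the caller fail instead.
	 */
	if (evict && ctx->resv) {
		const struct mock_vm_bo *vm_bo;	/* only needed in this block */

		for (vm_bo = bo->vm_bos; vm_bo; vm_bo = vm_bo->next) {
			const struct mock_vm *vm = vm_bo->vm;

			if (vm->resv == ctx->resv && vm->preempt_fence_mode)
				return -EBUSY;
		}
	}

	return 0;
}
```

In the real function the early return would of course stay as
ret = -EBUSY; goto out; as in your patch.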

>  	/*
>  	 * Failed multi-hop where the old_mem is still marked as
>  	 * TTM_PL_FLAG_TEMPORARY, should just be a dummy move.


