[PATCH] drm/xe: Reject BO eviction if BO is bound to current VM
Thomas Hellström
thomas.hellstrom at linux.intel.com
Fri Jan 10 14:45:00 UTC 2025
Hi, Oak.
On Tue, 2024-12-17 at 18:13 -0500, Oak Zeng wrote:
> This is a follow-up fix for
> https://patchwork.freedesktop.org/patch/msgid/20241203021929.1919730-1-oak.zeng@intel.com
> The overall goal is to fail vm_bind when there is memory pressure. See
> the commit message of the above patch for more details. The above patch
> fixes the issue when the user passes in a vm_id parameter during
> gem_create. If the user doesn't pass in a vm_id during gem_create, the
> above patch doesn't help.
>
> This patch further rejects BO eviction (which could be triggered by BO
> validation) if the BO is bound to the current VM. vm_bind could fail due
> to the eviction failure. The BO-to-VM reverse mapping structure is used
> to determine whether the BO is bound to the VM.
>
> Suggested-by: Thomas Hellström <thomas.hellstrom at linux.intel.com>
> Signed-off-by: Oak Zeng <oak.zeng at intel.com>
> ---
> drivers/gpu/drm/xe/xe_bo.c | 13 +++++++++++++
> 1 file changed, 13 insertions(+)
>
> diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c
> index 283cd02945708..abdeed1c325ea 100644
> --- a/drivers/gpu/drm/xe/xe_bo.c
> +++ b/drivers/gpu/drm/xe/xe_bo.c
> @@ -664,6 +664,7 @@ static int xe_bo_move(struct ttm_buffer_object *ttm_bo, bool evict,
>  	u32 old_mem_type = old_mem ? old_mem->mem_type : XE_PL_SYSTEM;
>  	struct ttm_tt *ttm = ttm_bo->ttm;
>  	struct xe_migrate *migrate = NULL;
> +	struct drm_gpuvm_bo *vm_bo;
Move this declaration inside if (evict) {
>  	struct dma_fence *fence;
>  	bool move_lacks_source;
>  	bool tt_has_data;
> @@ -713,6 +714,18 @@ static int xe_bo_move(struct ttm_buffer_object *ttm_bo, bool evict,
>  		goto out;
>  	}
>
Short comment here?
> +	if (evict) {
Perhaps if (evict && ctx->resv)
> +		drm_gem_for_each_gpuvm_bo(vm_bo, &bo->ttm.base) {
> +			struct xe_vm *vm = gpuvm_to_vm(vm_bo->vm);
> +
> +			if (xe_vm_resv(vm) == ctx->resv &&
> +			    xe_vm_in_preempt_fence_mode(vm)) {
> +				ret = -EBUSY;
> +				goto out;
> +			}
> +		}
> +	}
> +
Otherwise LGTM.
>  	/*
>  	 * Failed multi-hop where the old_mem is still marked as
>  	 * TTM_PL_FLAG_TEMPORARY, should just be a dummy move.
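
To make the three comments above concrete, here is a rough userspace sketch of the check with the suggestions folded in: the iteration state is declared inside the branch, there is a short comment, and the branch is guarded on both evict and a non-NULL ctx->resv. It is not the real driver code. struct resv, struct mock_vm, struct mock_ctx and mock_check_evict() are hypothetical stand-ins for the reservation object, struct xe_vm, the TTM operation context and the new hunk in xe_bo_move(), and a plain array replaces the drm_gem_for_each_gpuvm_bo() walk.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; these are NOT the real xe/TTM types. */
struct resv { int unused; };

struct mock_vm {
	struct resv *resv;		/* models xe_vm_resv(vm) */
	bool preempt_fence_mode;	/* models xe_vm_in_preempt_fence_mode(vm) */
};

struct mock_ctx {
	struct resv *resv;		/* models ctx->resv */
};

/*
 * Models the new hunk: return -EBUSY when the move is an eviction and the
 * BO is bound to a preempt-fence-mode VM whose reservation object is the
 * one passed in the context. Return 0 otherwise.
 */
static int mock_check_evict(bool evict, const struct mock_ctx *ctx,
			    struct mock_vm *const *bound_vms, size_t n_vms)
{
	/* Don't evict a BO that is bound to the VM we are operating on. */
	if (evict && ctx->resv) {
		size_t i;	/* iteration state scoped to this branch */

		for (i = 0; i < n_vms; i++) {
			const struct mock_vm *vm = bound_vms[i];

			if (vm->resv == ctx->resv && vm->preempt_fence_mode)
				return -EBUSY;
		}
	}

	return 0;
}
```

In this model the ctx->resv test only skips the walk when there is no reservation object to compare against; the result for a NULL ctx->resv is 0 either way, provided no VM reports a NULL reservation object.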