[PATCH] drm/xe: Opportunistically skip TLB invalidation on unbind

Fri Jun 13 08:24:32 UTC 2025

On Thu, 2025-06-12 at 21:36 -0700, Matthew Brost wrote:
> If a range or VMA is invalidated and the scratch page is disabled,
> there is no reason to issue a TLB invalidation on unbind, so skip the
> TLB invalidation if this condition is true. This is an opportunistic
> check as it is done without the notifier lock, thus it is possible for
> the range or VMA to be invalidated after this check is performed.
> 
> This should improve performance of the SVM garbage collector. For
> example, xe_exec_system_allocator --r many-stride-new-prefetch went
> from ~20s to ~9.5s on a BMG.
> 
> Signed-off-by: Matthew Brost <matthew.brost at intel.com>
> ---
>  drivers/gpu/drm/xe/xe_pt.c  | 18 ++++++++++++++++--
>  drivers/gpu/drm/xe/xe_svm.c |  5 ++++-
>  drivers/gpu/drm/xe/xe_vm.c  |  5 ++++-
>  3 files changed, 24 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/xe/xe_pt.c b/drivers/gpu/drm/xe/xe_pt.c
> index f39d5cc9f411..09c3ccc81cca 100644
> --- a/drivers/gpu/drm/xe/xe_pt.c
> +++ b/drivers/gpu/drm/xe/xe_pt.c
> @@ -1988,7 +1988,14 @@ static int unbind_op_prepare(struct xe_tile *tile,
>  					 xe_vma_end(vma));
>  	++pt_update_ops->current_op;
>  	pt_update_ops->needs_userptr_lock |= xe_vma_is_userptr(vma);
> -	pt_update_ops->needs_invalidation = true;
> +
> +	/*
> +	 * Opportunistically suppressing invalidation, READ_ONCE pairs with
> +	 * WRITE_ONCE in MMU notifier or BO move
> +	 */
> +	pt_update_ops->needs_invalidation |= xe_vm_has_scratch(xe_vma_vm(vma)) ||
> +		((vma->tile_present & BIT(tile->id)) &
> +		 ~READ_ONCE(vma->tile_invalidated));
>  
>  	xe_pt_commit_prepare_unbind(vma, pt_op->entries, pt_op->num_entries);
>  
> @@ -2023,7 +2030,14 @@ static int unbind_range_prepare(struct xe_vm *vm,
>  					 range->base.itree.last + 1);
>  	++pt_update_ops->current_op;
>  	pt_update_ops->needs_svm_lock = true;
> -	pt_update_ops->needs_invalidation = true;
> +
> +	/*
> +	 * Opportunistically suppressing invalidation, READ_ONCE pairs with
> +	 * WRITE_ONCE in SVM MMU notifier

To avoid having to document the pairing at every use, perhaps add some
tile_invalidated accessors?
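
A rough sketch of what I mean, written as standalone userspace C so it
compiles outside the tree. The struct, the accessor names and the macro
stand-ins are all made up for illustration; real helpers would operate on
struct xe_vma / struct xe_svm_range and use the kernel's own READ_ONCE and
WRITE_ONCE.

```c
#include <stdint.h>

/* Userspace stand-ins for the kernel macros, for illustration only. */
#define READ_ONCE(x)		(*(volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, v)	(*(volatile __typeof__(x) *)&(x) = (v))

/* Cut-down stand-in for struct xe_vma; only the field of interest. */
struct vma_sketch {
	uint8_t tile_invalidated;
};

/* Hypothetical accessor names, not existing xe functions. */
static inline uint8_t vma_tile_invalidated(struct vma_sketch *vma)
{
	/* Pairs with WRITE_ONCE in vma_set_tile_invalidated() */
	return READ_ONCE(vma->tile_invalidated);
}

static inline void vma_set_tile_invalidated(struct vma_sketch *vma,
					    uint8_t mask)
{
	/* Pairs with READ_ONCE in vma_tile_invalidated() */
	WRITE_ONCE(vma->tile_invalidated, mask);
}
```

With something like this, the pairing comment lives in one place and the
callers only need to say why they are reading or writing the mask.
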

> +	 */
> +	pt_update_ops->needs_invalidation |= xe_vm_has_scratch(vm) ||
> +		((range->tile_present & BIT(tile->id)) &
> +		 ~READ_ONCE(range->tile_invalidated));

Would it be possible to code this repeated pattern as a function?

xe_vm_needs_invalidation(vm, tile, tile_present, tile_invalidated);

Perhaps it doesn't improve readability much. Up to you.
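
For reference, a minimal userspace sketch of such a helper. The name, the
flattened parameters (a bool instead of the vm, a tile id instead of the
tile) and the macro stand-ins are assumptions for illustration, not
existing xe code; the body just mirrors the expression in the two hunks
above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel macros, for illustration only. */
#define READ_ONCE(x)	(*(volatile __typeof__(x) *)&(x))
#define BIT(n)		(1u << (n))

/*
 * Hypothetical helper: returns true when the unbind still needs a TLB
 * invalidation. The read of *tile_invalidated is opportunistic; it is
 * done without the notifier lock, so the mask may change afterwards.
 */
static inline bool needs_invalidation_sketch(bool has_scratch,
					     unsigned int tile_id,
					     uint8_t tile_present,
					     uint8_t *tile_invalidated)
{
	uint8_t invalidated = READ_ONCE(*tile_invalidated);

	/* Scratch enabled: always invalidate. Otherwise only if the
	 * mapping is present on this tile and not yet invalidated. */
	return has_scratch ||
	       ((tile_present & BIT(tile_id)) & ~invalidated);
}
```

Both call sites would then reduce to a single "needs_invalidation |= ..."
line, with the READ_ONCE pairing documented once in the helper.
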

Otherwise LGTM.
Thomas

>  
>  	xe_pt_commit_prepare_unbind(XE_INVALID_VMA, pt_op->entries,
>  				    pt_op->num_entries);
> diff --git a/drivers/gpu/drm/xe/xe_svm.c b/drivers/gpu/drm/xe/xe_svm.c
> index 13abc6049041..5e5bf47293ad 100644
> --- a/drivers/gpu/drm/xe/xe_svm.c
> +++ b/drivers/gpu/drm/xe/xe_svm.c
> @@ -141,7 +141,10 @@ xe_svm_range_notifier_event_begin(struct xe_vm *vm, struct drm_gpusvm_range *r,
>  	for_each_tile(tile, xe, id)
>  		if (xe_pt_zap_ptes_range(tile, vm, range)) {
>  			tile_mask |= BIT(id);
> -			/* Pairs with READ_ONCE in xe_svm_range_is_valid */
> +			/*
> +			 * Pairs with READ_ONCE in xe_svm_range_is_valid or PT
> +			 * code to suppress invalidation on unbind
> +			 */
>  			WRITE_ONCE(range->tile_invalidated,
>  				   range->tile_invalidated | BIT(id));
>  		}
> diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
> index d18807b92b18..b296ac37347b 100644
> --- a/drivers/gpu/drm/xe/xe_vm.c
> +++ b/drivers/gpu/drm/xe/xe_vm.c
> @@ -3924,7 +3924,10 @@ int xe_vm_invalidate_vma(struct xe_vma *vma)
>  	for (id = 0; id < fence_id; ++id)
>  		xe_gt_tlb_invalidation_fence_wait(&fence[id]);
>  
> -	/* WRITE_ONCE pair with READ_ONCE in xe_gt_pagefault.c */
> +	/*
> +	 * WRITE_ONCE pairs with READ_ONCE in xe_gt_pagefault.c or PT code to
> +	 * suppress invalidation on unbind
> +	 */
>  	WRITE_ONCE(vma->tile_invalidated, vma->tile_mask);
>  
>  	return ret;