[PATCH 1/3] drm/xe: Add a function to zap page table by address range

Tue Jan 28 23:38:29 UTC 2025

On Tue, Jan 28, 2025 at 05:21:43PM -0500, Oak Zeng wrote:
> Add a function xe_pt_zap_range. This is similar to xe_pt_zap_ptes
> but is used when we don't have a vma to work with, such as zap
> a range mapped to scratch page where we don't have vma.
> 

See my reply to the following patch. This won't work and isn't needed.

Let me explain why this won't work.

When we set up scratch PTEs on a VM, we create a shell PTE structure
down to the largest page size (1 GB), with each 1 GB entry pointing to a
scratch page.

With 48 bits of address space, this is 1 level down to 1 GB pages; with
57 bits, this is 2 levels down to 1 GB pages.
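
To make the level arithmetic concrete, here is a rough sketch. It is not
driver code: it assumes 4 KiB base pages and 512 entries (9 address bits)
per table, with level 0 being the 4 KiB PTE level, and all names are made
up for illustration.

```c
#include <stdint.h>

/* Assumed layout: 4 KiB base pages, 9 address bits per page-table level. */
#define PAGE_SHIFT_4K   12u
#define BITS_PER_LEVEL  9u
#define LEVEL_1G        2u	/* entries at this level each map 1 GiB */

/* Address shift covered by one entry at 'level' (0 = 4 KiB PTE level). */
static unsigned int entry_shift(unsigned int level)
{
	return PAGE_SHIFT_4K + BITS_PER_LEVEL * level;
}

/* Level of the root table for a given virtual address width. */
static unsigned int root_level(unsigned int va_bits)
{
	return (va_bits - PAGE_SHIFT_4K) / BITS_PER_LEVEL - 1;
}

/* Number of levels between the root table and the table of 1 GiB entries. */
static unsigned int levels_to_1g(unsigned int va_bits)
{
	return root_level(va_bits) - LEVEL_1G;
}
```

Under these assumptions levels_to_1g(48) is 1 and levels_to_1g(57) is 2,
which matches the numbers above.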

Now, let's say we perform a 4K bind from 0x0 to 0x1000 with immediate
clear in fault mode.

How can we invalidate just the 4K range of memory? We can't. We need to
allocate new memory for page tables so that 0x0000–0x1000 points to an
invalidated PTE, while 0x1000–1 GB points to scratch. This is why, in
my reply to the patch, I indicated that the entire bind pipeline needs
to be updated to properly invalidate the PTEs.
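
As a rough illustration of the allocation cost, the sketch below counts
how many new page-table pages are needed before a sub-range of one large
entry can be changed on its own. It is a toy model, not xe code: it
assumes 4 KiB pages, 9 bits per level, 4 KiB-aligned ranges, and the
function names are hypothetical.

```c
#include <stdint.h>

#define PAGE_SHIFT_4K   12u
#define BITS_PER_LEVEL  9u

/* Bytes mapped by one entry at 'level' (0 = 4 KiB, 1 = 2 MiB, 2 = 1 GiB). */
static uint64_t entry_size(unsigned int level)
{
	return UINT64_C(1) << (PAGE_SHIFT_4K + BITS_PER_LEVEL * level);
}

/*
 * Count the new page-table pages needed so that [start, end) can be
 * modified independently of the rest of the entry at 'level' that starts
 * at 'base'. A fully covered entry can be rewritten in place and costs
 * nothing; a partially covered one must be replaced by a child table.
 */
static unsigned int new_tables_needed(uint64_t base, unsigned int level,
				      uint64_t start, uint64_t end)
{
	uint64_t size = entry_size(level);
	uint64_t child_size, first, last;
	unsigned int count;

	/* Clamp the range to this entry. */
	if (start < base)
		start = base;
	if (end > base + size)
		end = base + size;

	if (start >= end)
		return 0;	/* no overlap with this entry */
	if (start == base && end == base + size)
		return 0;	/* whole entry covered: rewrite in place */
	if (level == 0)
		return 0;	/* 4 KiB entries cannot be split further */

	count = 1;		/* child table replacing this entry */
	child_size = entry_size(level - 1);

	/* Only the first and last child entries can be partially covered. */
	first = start & ~(child_size - 1);
	last = (end - 1) & ~(child_size - 1);

	count += new_tables_needed(first, level - 1, start, end);
	if (last != first)
		count += new_tables_needed(last, level - 1, start, end);

	return count;
}
```

In this model, new_tables_needed(0, 2, 0x0, 0x1000) returns 2: one table
of 2 MiB entries and one table of 4 KiB entries, where every entry other
than the one covering 0x0000-0x1000 still has to point to scratch.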

I hope this explanation, along with my reply to the next patch, makes
sense.

Matt

> Signed-off-by: Oak Zeng <oak.zeng at intel.com>
> ---
>  drivers/gpu/drm/xe/xe_pt.c | 28 ++++++++++++++++++++++++++++
>  drivers/gpu/drm/xe/xe_pt.h |  2 ++
>  2 files changed, 30 insertions(+)
> 
> diff --git a/drivers/gpu/drm/xe/xe_pt.c b/drivers/gpu/drm/xe/xe_pt.c
> index 1ddcc7e79a93..2363260da6a6 100644
> --- a/drivers/gpu/drm/xe/xe_pt.c
> +++ b/drivers/gpu/drm/xe/xe_pt.c
> @@ -792,6 +792,34 @@ static const struct xe_pt_walk_ops xe_pt_zap_ptes_ops = {
>  	.pt_entry = xe_pt_zap_ptes_entry,
>  };
>  
> +/**
> + * xe_pt_zap_range() - Zap (zero) gpu ptes of an virtual address range
> + * @tile: The tile we're zapping for.
> + * @vm: The vm we're zapping for.
> + * @start: Start of the virtual address range, inclusive.
> + * @end: End of the virtual address range, exclusive.
> + *
> + * This is similar to xe_pt_zap_ptes() but it's used when we don't have a
> + * vma to work with. This is used for example when we're clearing the scratch
> + * page mapping during vm_bind.
> + *
> + */
> +void xe_pt_zap_range(struct xe_tile *tile, struct xe_vm *vm, u64 start, u64 end)
> +{
> +	struct xe_pt_zap_ptes_walk xe_walk = {
> +		.base = {
> +			.ops = &xe_pt_zap_ptes_ops,
> +			.shifts = xe_normal_pt_shifts,
> +			.max_level = XE_PT_HIGHEST_LEVEL,
> +		},
> +		.tile = tile,
> +	};
> +	struct xe_pt *pt = vm->pt_root[tile->id];
> +
> +	(void)xe_pt_walk_shared(&pt->base, pt->level, start,
> +				end, &xe_walk.base);
> +}
> +
>  /**
>   * xe_pt_zap_ptes() - Zap (zero) gpu ptes of an address range
>   * @tile: The tile we're zapping for.
> diff --git a/drivers/gpu/drm/xe/xe_pt.h b/drivers/gpu/drm/xe/xe_pt.h
> index 9ab386431cad..b166b324f455 100644
> --- a/drivers/gpu/drm/xe/xe_pt.h
> +++ b/drivers/gpu/drm/xe/xe_pt.h
> @@ -43,4 +43,6 @@ void xe_pt_update_ops_abort(struct xe_tile *tile, struct xe_vma_ops *vops);
>  
>  bool xe_pt_zap_ptes(struct xe_tile *tile, struct xe_vma *vma);
>  
> +void xe_pt_zap_range(struct xe_tile *tile, struct xe_vm *vm, u64 start, u64 end);
> +
>  #endif
> -- 
> 2.26.3
>