[Intel-xe] [PATCH] drm/xe/xe_migrate.c: Use DPA offset for page table entries.

Kershner, David david.kershner at intel.com
Fri Sep 29 22:45:08 UTC 2023



> -----Original Message-----
> From: Vishwanathapura, Niranjana <niranjana.vishwanathapura at intel.com>
> Sent: Friday, September 29, 2023 2:59 PM
> To: Ruhl, Michael J <michael.j.ruhl at intel.com>
> Cc: Kershner, David <david.kershner at intel.com>; intel-
> xe at lists.freedesktop.org; Fleck, John <john.fleck at intel.com>; Roper,
> Matthew D <matthew.d.roper at intel.com>; Brost, Matthew
> <matthew.brost at intel.com>; Summers, Stuart <stuart.summers at intel.com>
> Subject: Re: [PATCH] drm/xe/xe_migrate.c: Use DPA offset for page table
> entries.
> 
> On Fri, Sep 29, 2023 at 08:01:47AM -0700, Ruhl, Michael J wrote:
> >>-----Original Message-----
> >>From: Vishwanathapura, Niranjana
> <niranjana.vishwanathapura at intel.com>
> >>Sent: Thursday, September 28, 2023 9:49 PM
> >>To: Kershner, David <david.kershner at intel.com>
> >>Cc: intel-xe at lists.freedesktop.org; Ruhl, Michael J
> >><michael.j.ruhl at intel.com>; Fleck, John <john.fleck at intel.com>; Roper,
> >>Matthew D <matthew.d.roper at intel.com>; Brost, Matthew
> >><matthew.brost at intel.com>; Summers, Stuart
> <stuart.summers at intel.com>
> >>Subject: Re: [PATCH] drm/xe/xe_migrate.c: Use DPA offset for page
> >>table entries.
> >>
> >>On Thu, Sep 28, 2023 at 09:48:16AM -0400, David Kershner wrote:
> >>>Device Physical Address (DPA) is the starting offset of device memory.
> >>>
> >>>Update xe_migrate identity map base PTE entries to start at dpa_base
> >>>instead of 0.
> >>>
> >>>The VM offset value should be 0-relative instead of DPA-relative.
> >>>
> >>>Signed-off-by: David Kershner <david.kershner at intel.com>
> >>>---
> >>> drivers/gpu/drm/xe/xe_migrate.c | 19 +++++++++++--------
> >>> 1 file changed, 11 insertions(+), 8 deletions(-)
> >>>
> >>>diff --git a/drivers/gpu/drm/xe/xe_migrate.c b/drivers/gpu/drm/xe/xe_migrate.c
> >>>index 9438f609d18b..848cdb6af2e6 100644
> >>>--- a/drivers/gpu/drm/xe/xe_migrate.c
> >>>+++ b/drivers/gpu/drm/xe/xe_migrate.c
> >>>@@ -114,8 +114,10 @@ static u64 xe_migrate_vm_addr(u64 slot, u32 level)
> >>> 	return (slot + 1ULL) << xe_pt_shift(level + 1);
> >>> }
> >>>
> >>>-static u64 xe_migrate_vram_ofs(u64 addr)
> >>>+static u64 xe_migrate_vram_ofs(u64 addr, struct xe_tile *tile)
> >>> {
> >>>+	if (tile)
> >>>+		addr -= tile->mem.vram.dpa_base;
> >>
> >>See below...
> >>
> >>> 	return addr + (256ULL << xe_pt_shift(2));
> >>> }
> >>>
> >>>@@ -149,7 +151,7 @@ static int xe_migrate_create_cleared_bo(struct xe_migrate *m, struct xe_vm *vm)
> >>>
> >>> 	xe_map_memset(xe, &m->cleared_bo->vmap, 0, 0x00, cleared_size);
> >>> 	vram_addr = xe_bo_addr(m->cleared_bo, 0, XE_PAGE_SIZE);
> >>>-	m->cleared_vram_ofs = xe_migrate_vram_ofs(vram_addr);
> >>>+	m->cleared_vram_ofs = xe_migrate_vram_ofs(vram_addr, tile);
> >>>
> >>> 	return 0;
> >>> }
> >>>@@ -223,12 +225,12 @@ static int xe_migrate_prepare_vm(struct xe_tile *tile, struct xe_migrate *m,
> >>> 	} else {
> >>> 		u64 batch_addr = xe_bo_addr(batch, 0, XE_PAGE_SIZE);
> >>>
> >>>-		m->batch_base_ofs = xe_migrate_vram_ofs(batch_addr);
> >>>+		m->batch_base_ofs = xe_migrate_vram_ofs(batch_addr, tile);
> >>>
> >>> 		if (xe->info.supports_usm) {
> >>> 			batch = tile->primary_gt->usm.bb_pool->bo;
> >>> 			batch_addr = xe_bo_addr(batch, 0, XE_PAGE_SIZE);
> >>>-			m->usm_batch_base_ofs = xe_migrate_vram_ofs(batch_addr);
> >>>+			m->usm_batch_base_ofs = xe_migrate_vram_ofs(batch_addr, tile);
> >>> 		}
> >>> 	}
> >>>
> >>>@@ -266,7 +268,9 @@ static int xe_migrate_prepare_vm(struct xe_tile *tile, struct xe_migrate *m,
> >>> 		 * Use 1GB pages, it shouldn't matter the physical amount of
> >>> 		 * vram is less, when we don't access it.
> >>> 		 */
> >>>-		for (pos = 0; pos < xe->mem.vram.actual_physical_size; pos += SZ_1G, ofs += 8)
> >>>+		for (pos = tile->mem.vram.dpa_base;
> >>>+		     pos < tile->mem.vram.actual_physical_size + tile->mem.vram.dpa_base;
> >>>+		     pos += SZ_1G, ofs += 8)
> >>> 			xe_map_wr(xe, &bo->vmap, ofs, u64, pos | flags);
> >>
> >>I think this is incorrect.
> >>We need to map the whole device LMEM, not just the tile LMEM, for each tile.
> >>This is because we should be able to move the bo from one tile to another.
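
[Aside: if I follow the suggestion, a device-wide identity map would look
roughly like the sketch below. This assumes xe->mem.vram.actual_physical_size
spans all tiles rather than one; illustrative only, not tested.

	/* Map all of device LMEM with 1G pages, starting at the device
	 * DPA base instead of at 0.
	 */
	for (pos = xe->mem.vram.dpa_base;
	     pos < xe->mem.vram.dpa_base + xe->mem.vram.actual_physical_size;
	     pos += SZ_1G, ofs += 8)
		xe_map_wr(xe, &bo->vmap, ofs, u64, pos | flags);
]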
> >
> >My understanding is that the page table is created per tile (there is
> >one per tile) to migrate data to and from that tile.
> >
> >How is this different from moving from sysmem to vram? Where is the PTE
> >info for the sysmem side of this?
> >
> >If this is a general-purpose page table, should it be created for
> >the device, rather than having two page tables with the same info?
> >
> >I see that the only tests that use this are sysmem -> device and device -> sysmem.
> >
> >Is there a use case test that shows a migration from one tile to the
> >other? I would like to test this.
> >
> 
> We can have the xe_migrate_copy() function called with the same object for
> src_bo and dst_bo, but different lmem regions for src and dst.
> 
> We may not have a good IGT, but I can think of VM prefetch
> (XE_VM_BIND_OP_PREFETCH) and access counters as use cases that can
> trigger migration of a BO between tiles.
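
[Aside: sketching that cross-tile case for clarity. The same BO is passed as
both src_bo and dst_bo, with src and dst resources in different tiles' VRAM;
the xe_migrate_copy() argument list here is approximate, so treat this as a
sketch rather than the real call:

	/* One BO, resident in tile0 VRAM, being blitted into tile1 VRAM:
	 * the copy crosses tiles even though src_bo == dst_bo.
	 */
	fence = xe_migrate_copy(tile->migrate, bo, bo, src_mem, dst_mem);
]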
> 
> >>And the identity map, I think, should start from the device dpa
> >>(xe->mem.vram.dpa_base) instead of being 0-based. The
> >>xe->mem.vram.dpa_base gets updated based on what is programmed into
> >>PKG_ADDR_BASE.
> >
> >This makes sense if there is only one table.  Since the table is created
> >for each tile, it will have some overhead.
> >
> >For instance, the offsets for the VMs currently need the tile offset
> >included.  The new code does not need that.  If we use the identity
> >mapping with the full space, we will have to break the tile offset out
> >as a separate value to keep track of (it will need to be added per
> >tile..).
> >
> 
> I did not fully understand your question. xe_bo_addr() has the dpa_base
> included (not the tile offset).
> 

I'm confused: xe_bo_addr() returns the offset based on vram_region_gpu_offset(),
which returns the tile-based dpa_base, which has the tile offset in it as well.

Should it be returning xe->mem.vram.dpa_base instead?
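
For reference, the distinction I am asking about, sketched against the two
possible bases (the helper body here is paraphrased from memory, not the
actual tree):

	/* vram_region_gpu_offset() today, as I read it (paraphrased):
	 * per-tile base, which already folds in the tile offset.
	 */
	return tile->mem.vram.dpa_base;

	/* vs. what I am asking about: the device-wide base. */
	return xe->mem.vram.dpa_base;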

David Kershner


> Niranjana
> 
> >Thoughts?
> >
> >Mike
> >
> >>Niranjana
> >>
> >>> 	}
> >>>
> >>>@@ -441,8 +445,7 @@ static u32 pte_update_size(struct xe_migrate *m,
> >>> 		cmds += cmd_size;
> >>> 	} else {
> >>> 		/* Offset into identity map. */
> >>>-		*L0_ofs = xe_migrate_vram_ofs(cur->start +
> >>>-					      vram_region_gpu_offset(res));
> >>>+		*L0_ofs = xe_migrate_vram_ofs(cur->start, NULL);
> >>> 		cmds += cmd_size;
> >>> 	}
> >>>
> >>>@@ -1027,7 +1030,7 @@ static void write_pgtable(struct xe_tile *tile, struct xe_bb *bb, u64 ppgtt_ofs,
> >>> 	xe_tile_assert(tile, update->qwords <= 0x1ff);
> >>> 	if (!ppgtt_ofs) {
> >>> 		ppgtt_ofs = xe_migrate_vram_ofs(xe_bo_addr(update->pt_bo, 0,
> >>>-							   XE_PAGE_SIZE));
> >>>+							   XE_PAGE_SIZE), tile);
> >>> 	}
> >>>
> >>> 	do {
> >>>--
> >>>2.35.1
> >>>

