[Intel-xe] [PATCH] drm/xe/xe_migrate.c: Use DPA offset for page table entries.

Niranjana Vishwanathapura niranjana.vishwanathapura at intel.com
Fri Sep 29 18:59:19 UTC 2023


On Fri, Sep 29, 2023 at 08:01:47AM -0700, Ruhl, Michael J wrote:
>>-----Original Message-----
>>From: Vishwanathapura, Niranjana <niranjana.vishwanathapura at intel.com>
>>Sent: Thursday, September 28, 2023 9:49 PM
>>To: Kershner, David <david.kershner at intel.com>
>>Cc: intel-xe at lists.freedesktop.org; Ruhl, Michael J <michael.j.ruhl at intel.com>;
>>Fleck, John <john.fleck at intel.com>; Roper, Matthew D
>><matthew.d.roper at intel.com>; Brost, Matthew <matthew.brost at intel.com>;
>>Summers, Stuart <stuart.summers at intel.com>
>>Subject: Re: [PATCH] drm/xe/xe_migrate.c: Use DPA offset for page table
>>entries.
>>
>>On Thu, Sep 28, 2023 at 09:48:16AM -0400, David Kershner wrote:
>>>Device Physical Address (DPA) is the starting offset of device memory.
>>>
>>>Update xe_migrate identity map base PTE entries to start at dpa_base
>>>instead of 0.
>>>
>>>The VM offset value should be 0 relative instead of DPA relative.
>>>
>>>Signed-off-by: David Kershner <david.kershner at intel.com>
>>>---
>>> drivers/gpu/drm/xe/xe_migrate.c | 19 +++++++++++--------
>>> 1 file changed, 11 insertions(+), 8 deletions(-)
>>>
>>>diff --git a/drivers/gpu/drm/xe/xe_migrate.c b/drivers/gpu/drm/xe/xe_migrate.c
>>>index 9438f609d18b..848cdb6af2e6 100644
>>>--- a/drivers/gpu/drm/xe/xe_migrate.c
>>>+++ b/drivers/gpu/drm/xe/xe_migrate.c
>>>@@ -114,8 +114,10 @@ static u64 xe_migrate_vm_addr(u64 slot, u32 level)
>>> 	return (slot + 1ULL) << xe_pt_shift(level + 1);
>>> }
>>>
>>>-static u64 xe_migrate_vram_ofs(u64 addr)
>>>+static u64 xe_migrate_vram_ofs(u64 addr, struct xe_tile *tile)
>>> {
>>>+	if (tile)
>>>+		addr -= tile->mem.vram.dpa_base;
>>
>>See below...
>>
>>> 	return addr + (256ULL << xe_pt_shift(2));
>>> }
>>>
>>>@@ -149,7 +151,7 @@ static int xe_migrate_create_cleared_bo(struct xe_migrate *m, struct xe_vm *vm)
>>>
>>> 	xe_map_memset(xe, &m->cleared_bo->vmap, 0, 0x00, cleared_size);
>>> 	vram_addr = xe_bo_addr(m->cleared_bo, 0, XE_PAGE_SIZE);
>>>-	m->cleared_vram_ofs = xe_migrate_vram_ofs(vram_addr);
>>>+	m->cleared_vram_ofs = xe_migrate_vram_ofs(vram_addr, tile);
>>>
>>> 	return 0;
>>> }
>>>@@ -223,12 +225,12 @@ static int xe_migrate_prepare_vm(struct xe_tile *tile, struct xe_migrate *m,
>>> 	} else {
>>> 		u64 batch_addr = xe_bo_addr(batch, 0, XE_PAGE_SIZE);
>>>
>>>-		m->batch_base_ofs = xe_migrate_vram_ofs(batch_addr);
>>>+		m->batch_base_ofs = xe_migrate_vram_ofs(batch_addr, tile);
>>>
>>> 		if (xe->info.supports_usm) {
>>> 			batch = tile->primary_gt->usm.bb_pool->bo;
>>> 			batch_addr = xe_bo_addr(batch, 0, XE_PAGE_SIZE);
>>>-			m->usm_batch_base_ofs = xe_migrate_vram_ofs(batch_addr);
>>>+			m->usm_batch_base_ofs = xe_migrate_vram_ofs(batch_addr, tile);
>>> 		}
>>> 	}
>>>
>>>@@ -266,7 +268,9 @@ static int xe_migrate_prepare_vm(struct xe_tile *tile, struct xe_migrate *m,
>>> 		 * Use 1GB pages, it shouldn't matter the physical amount of
>>> 		 * vram is less, when we don't access it.
>>> 		 */
>>>-		for (pos = 0; pos < xe->mem.vram.actual_physical_size; pos += SZ_1G, ofs += 8)
>>>+		for (pos = tile->mem.vram.dpa_base;
>>>+		     pos < tile->mem.vram.actual_physical_size + tile->mem.vram.dpa_base;
>>>+		     pos += SZ_1G, ofs += 8)
>>> 			xe_map_wr(xe, &bo->vmap, ofs, u64, pos | flags);
>>
>>I think this is incorrect.
>>We need to map the whole device LMEM, not just the tile LMEM, for each tile.
>>This is because we should be able to move the bo from one tile to another.
>
>My understanding is that the page table is created per tile (there is one per tile),
>to migrate data to and from that tile.
>
>How is this different from moving from sysmem to vram? Where is the PTE info for
>the sysmem side of this?
>
>If this is for a general purpose page table, should it be created for the device,
>rather than having two page tables with the same info?
>
>I see that the only tests that use this are sysmem -> device and device -> sysmem.
>
>Is there a use case test that shows a migration from one tile to the other?  I would
>like to test this.
>

We can have the xe_migrate_copy() function called with the same object for src_bo
and dst_bo, but with different lmem regions for src and dst.

We may not have a good IGT for it, but I can think of VM prefetch (XE_VM_BIND_OP_PREFETCH)
and access counters as use cases that can trigger migration of a BO between tiles.

>>And the identity map, I think, should start from the device DPA (xe->mem.vram.dpa_base)
>>instead of being 0-based. The xe->mem.vram.dpa_base gets updated based on what is
>>programmed into PKG_ADDR_BASE.
>
>This makes sense if there is only one table.  Since the table is created for each tile, it will
>have some overhead.
>
>For instance, the offsets for the VMs currently need the tile offset included.  The
>new code does not need that.  If we use the identity mapping with the full space, we
>will have to break the tile offset out as a separate value to keep track of (it will
>need to be added per tile).
>

I did not fully understand your question. xe_bo_addr() already has dpa_base included (not
the tile offset).

Niranjana

>Thoughts?
>
>Mike
>
>>Niranjana
>>
>>> 	}
>>>
>>>@@ -441,8 +445,7 @@ static u32 pte_update_size(struct xe_migrate *m,
>>> 		cmds += cmd_size;
>>> 	} else {
>>> 		/* Offset into identity map. */
>>>-		*L0_ofs = xe_migrate_vram_ofs(cur->start +
>>>-					      vram_region_gpu_offset(res));
>>>+		*L0_ofs = xe_migrate_vram_ofs(cur->start, NULL);
>>> 		cmds += cmd_size;
>>> 	}
>>>
>>>@@ -1027,7 +1030,7 @@ static void write_pgtable(struct xe_tile *tile, struct xe_bb *bb, u64 ppgtt_ofs,
>>> 	xe_tile_assert(tile, update->qwords <= 0x1ff);
>>> 	if (!ppgtt_ofs) {
>>> 		ppgtt_ofs = xe_migrate_vram_ofs(xe_bo_addr(update->pt_bo, 0,
>>>-							   XE_PAGE_SIZE));
>>>+							   XE_PAGE_SIZE), tile);
>>> 	}
>>>
>>> 	do {
>>>--
>>>2.35.1
>>>
