[Intel-xe] [PATCH] drm/xe/xe_migrate.c: Use DPA offset for page table entries.

Niranjana Vishwanathapura niranjana.vishwanathapura at intel.com
Sat Sep 30 20:11:28 UTC 2023


On Fri, Sep 29, 2023 at 03:45:08PM -0700, Kershner, David wrote:
>
>
>> -----Original Message-----
>> From: Vishwanathapura, Niranjana <niranjana.vishwanathapura at intel.com>
>> Sent: Friday, September 29, 2023 2:59 PM
>> To: Ruhl, Michael J <michael.j.ruhl at intel.com>
>> Cc: Kershner, David <david.kershner at intel.com>; intel-
>> xe at lists.freedesktop.org; Fleck, John <john.fleck at intel.com>; Roper,
>> Matthew D <matthew.d.roper at intel.com>; Brost, Matthew
>> <matthew.brost at intel.com>; Summers, Stuart <stuart.summers at intel.com>
>> Subject: Re: [PATCH] drm/xe/xe_migrate.c: Use DPA offset for page table
>> entries.
>>
>> On Fri, Sep 29, 2023 at 08:01:47AM -0700, Ruhl, Michael J wrote:
>> >>-----Original Message-----
>> >>From: Vishwanathapura, Niranjana <niranjana.vishwanathapura at intel.com>
>> >>Sent: Thursday, September 28, 2023 9:49 PM
>> >>To: Kershner, David <david.kershner at intel.com>
>> >>Cc: intel-xe at lists.freedesktop.org; Ruhl, Michael J
>> >><michael.j.ruhl at intel.com>; Fleck, John <john.fleck at intel.com>;
>> >>Roper, Matthew D <matthew.d.roper at intel.com>;
>> >>Brost, Matthew <matthew.brost at intel.com>;
>> >>Summers, Stuart <stuart.summers at intel.com>
>> >>Subject: Re: [PATCH] drm/xe/xe_migrate.c: Use DPA offset for page
>> >>table entries.
>> >>
>> >>On Thu, Sep 28, 2023 at 09:48:16AM -0400, David Kershner wrote:
>> >>>Device Physical Address (DPA) is the starting offset of device memory.
>> >>>
>> >>>Update the xe_migrate identity map base PTE entries to start at
>> >>>dpa_base instead of 0.
>> >>>
>> >>>The VM offset value should be 0-relative instead of DPA-relative.
>> >>>
>> >>>Signed-off-by: David Kershner <david.kershner at intel.com>
>> >>>---
>> >>> drivers/gpu/drm/xe/xe_migrate.c | 19 +++++++++++--------
>> >>> 1 file changed, 11 insertions(+), 8 deletions(-)
>> >>>
>> >>>diff --git a/drivers/gpu/drm/xe/xe_migrate.c b/drivers/gpu/drm/xe/xe_migrate.c
>> >>>index 9438f609d18b..848cdb6af2e6 100644
>> >>>--- a/drivers/gpu/drm/xe/xe_migrate.c
>> >>>+++ b/drivers/gpu/drm/xe/xe_migrate.c
>> >>>@@ -114,8 +114,10 @@ static u64 xe_migrate_vm_addr(u64 slot, u32 level)
>> >>> 	return (slot + 1ULL) << xe_pt_shift(level + 1);
>> >>> }
>> >>>
>> >>>-static u64 xe_migrate_vram_ofs(u64 addr)
>> >>>+static u64 xe_migrate_vram_ofs(u64 addr, struct xe_tile *tile)
>> >>> {
>> >>>+	if (tile)
>> >>>+		addr -= tile->mem.vram.dpa_base;
>> >>
>> >>See below...
>> >>
>> >>> 	return addr + (256ULL << xe_pt_shift(2));
>> >>> }
>> >>>
>> >>>@@ -149,7 +151,7 @@ static int xe_migrate_create_cleared_bo(struct xe_migrate *m, struct xe_vm *vm)
>> >>>
>> >>> 	xe_map_memset(xe, &m->cleared_bo->vmap, 0, 0x00, cleared_size);
>> >>> 	vram_addr = xe_bo_addr(m->cleared_bo, 0, XE_PAGE_SIZE);
>> >>>-	m->cleared_vram_ofs = xe_migrate_vram_ofs(vram_addr);
>> >>>+	m->cleared_vram_ofs = xe_migrate_vram_ofs(vram_addr, tile);
>> >>>
>> >>> 	return 0;
>> >>> }
>> >>>@@ -223,12 +225,12 @@ static int xe_migrate_prepare_vm(struct xe_tile *tile, struct xe_migrate *m,
>> >>> 	} else {
>> >>> 		u64 batch_addr = xe_bo_addr(batch, 0, XE_PAGE_SIZE);
>> >>>
>> >>>-		m->batch_base_ofs = xe_migrate_vram_ofs(batch_addr);
>> >>>+		m->batch_base_ofs = xe_migrate_vram_ofs(batch_addr, tile);
>> >>>
>> >>> 		if (xe->info.supports_usm) {
>> >>> 			batch = tile->primary_gt->usm.bb_pool->bo;
>> >>> 			batch_addr = xe_bo_addr(batch, 0, XE_PAGE_SIZE);
>> >>>-			m->usm_batch_base_ofs = xe_migrate_vram_ofs(batch_addr);
>> >>>+			m->usm_batch_base_ofs = xe_migrate_vram_ofs(batch_addr, tile);
>> >>> 		}
>> >>> 	}
>> >>>
>> >>>@@ -266,7 +268,9 @@ static int xe_migrate_prepare_vm(struct xe_tile *tile, struct xe_migrate *m,
>> >>> 		 * Use 1GB pages, it shouldn't matter the physical amount of
>> >>> 		 * vram is less, when we don't access it.
>> >>> 		 */
>> >>>-		for (pos = 0; pos < xe->mem.vram.actual_physical_size; pos += SZ_1G, ofs += 8)
>> >>>+		for (pos = tile->mem.vram.dpa_base;
>> >>>+		     pos < tile->mem.vram.actual_physical_size + tile->mem.vram.dpa_base;
>> >>>+		     pos += SZ_1G, ofs += 8)
>> >>> 			xe_map_wr(xe, &bo->vmap, ofs, u64, pos | flags);
>> >>
>> >>I think this is incorrect.
>> >>We need to map the whole device LMEM, not just the tile LMEM for each tile.
>> >>This is because we should be able to move the bo from one tile to another.
>> >
>> >My understanding is that the page table is created for the tile (there
>> >is one per tile), to migrate data to and from the tile.
>> >
>> >How is this different from moving from sysmem to vram? Where is the PTE
>> >info for the sysmem side of this?
>> >
>> >If this is for a general-purpose page table, should it be created for
>> >the device, rather than having two page tables with the same info?
>> >
>> >I see that the only tests that use this are sysmem -> device and
>> >device -> sysmem.
>> >
>> >Is there a use case test that shows a migration from one tile to the
>> >other? I would like to test this.
>> >
>>
>> We can have the xe_migrate_copy() function called with the same object
>> for src_bo and dst_bo, but with different LMEM regions for src and dst.
>>
>> We may not have a good IGT, but I can think of VM prefetch
>> (XE_VM_BIND_OP_PREFETCH) and access counters as use cases that can
>> trigger migration of a BO between tiles.
>>
>> >>And I think the identity map should start from the device DPA
>> >>(xe->mem.vram.dpa_base) instead of being 0-based. The
>> >>xe->mem.vram.dpa_base gets updated based on what is programmed
>> >>into PKG_ADDR_BASE.
>> >
>> >This makes sense if there is only one table. Since the table is
>> >created for each tile, it will have some overhead.
>> >
>> >For instance, the VM offsets currently need the tile offset included.
>> >The new code does not need that. If we use the identity mapping over
>> >the full space, we will have to break the tile offset out as a
>> >separate value to keep track of (it will need to be added per tile).
>> >
>>
>> I did not fully understand your question. xe_bo_addr() has the dpa_base
>> included (not the tile offset).
>>
>
>I'm confused; xe_bo_addr() returns the offset based on
>vram_region_gpu_offset(), which returns the tile's dpa_base, which has
>the tile offset in it as well.
>
>Should it be returning xe->mem.vram.dpa_base instead?
>

No. I meant that xe_bo_addr() already adds the DPA (not just the offset
of the tile from the device LMEM start).

Niranjana

>David Kershner
>
>
>> Niranjana
>>
>> >Thoughts?
>> >
>> >Mike
>> >
>> >>Niranjana
>> >>
>> >>> 	}
>> >>>
>> >>>@@ -441,8 +445,7 @@ static u32 pte_update_size(struct xe_migrate *m,
>> >>> 		cmds += cmd_size;
>> >>> 	} else {
>> >>> 		/* Offset into identity map. */
>> >>>-		*L0_ofs = xe_migrate_vram_ofs(cur->start +
>> >>>-					      vram_region_gpu_offset(res));
>> >>>+		*L0_ofs = xe_migrate_vram_ofs(cur->start, NULL);
>> >>> 		cmds += cmd_size;
>> >>> 	}
>> >>>
>> >>>@@ -1027,7 +1030,7 @@ static void write_pgtable(struct xe_tile *tile, struct xe_bb *bb, u64 ppgtt_ofs,
>> >>> 	xe_tile_assert(tile, update->qwords <= 0x1ff);
>> >>> 	if (!ppgtt_ofs) {
>> >>> 		ppgtt_ofs = xe_migrate_vram_ofs(xe_bo_addr(update->pt_bo, 0,
>> >>>-							   XE_PAGE_SIZE));
>> >>>+							   XE_PAGE_SIZE), tile);
>> >>> 	}
>> >>>
>> >>> 	do {
>> >>>--
>> >>>2.35.1
>> >>>

