<!DOCTYPE html><html><head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body>
<p><br>
</p>
<div class="moz-cite-prefix">On 01.04.2025 19:24, Michal Wajdeczko
wrote:<br>
</div>
<blockquote type="cite" cite="mid:69797928-bf34-4f1f-b38f-dd68427d9fea@intel.com">
<pre wrap="" class="moz-quote-pre">please use "drm/xe/vf:" in the subject as this patch is still more VF
oriented, then general SRIOV</pre>
</blockquote>
ok.<br>
<blockquote type="cite" cite="mid:69797928-bf34-4f1f-b38f-dd68427d9fea@intel.com">
<pre wrap="" class="moz-quote-pre">On 31.03.2025 15:21, Tomasz Lis wrote:
</pre>
<blockquote type="cite">
<pre wrap="" class="moz-quote-pre">We have only one GGTT for all IOV functions, with each VF having assigned
a range of addresses for its use. After migration, a VF can receive a
different range of addresses than it had initially.
This implements shifting GGTT addresses within drm_mm nodes, so that
VMAs stay valid after migration. This will make the driver use new
addresses when accessing GGTT from the moment the shifting ends.
By taking the ggtt->lock for the period of VMA fixups, this change
also adds constraint on that mutex. Any locks used during the recovery
cannot ever wait for hardware response - because after migration,
the hardware will not do anything until fixups are finished.
v2: Moved some functs to xe_ggtt.c; moved shift computation to just
after querying; improved documentation; switched some warns to asserts;
skipping fixups when GGTT shift eq 0; iterating through tiles (Michal)
v3: Updated kerneldocs, removed unused funct, properly allocate
balloning nodes if non existent
v4: Re-used ballooning functions from VF init, used bool in place of
standard error codes
v5: Renamed one function
Signed-off-by: Tomasz Lis <a class="moz-txt-link-rfc2396E" href="mailto:tomasz.lis@intel.com"><tomasz.lis@intel.com></a>
---
drivers/gpu/drm/xe/xe_ggtt.c | 33 +++++++++
drivers/gpu/drm/xe/xe_ggtt.h | 1 +
drivers/gpu/drm/xe/xe_gt_sriov_vf.c | 83 +++++++++++++++++++++++
drivers/gpu/drm/xe/xe_gt_sriov_vf.h | 1 +
drivers/gpu/drm/xe/xe_gt_sriov_vf_types.h | 2 +
drivers/gpu/drm/xe/xe_sriov_vf.c | 17 +++++
6 files changed, 137 insertions(+)
diff --git a/drivers/gpu/drm/xe/xe_ggtt.c b/drivers/gpu/drm/xe/xe_ggtt.c
index 2d7456e37ef4..b13c4a12393e 100644
--- a/drivers/gpu/drm/xe/xe_ggtt.c
+++ b/drivers/gpu/drm/xe/xe_ggtt.c
@@ -482,6 +482,39 @@ void xe_ggtt_node_remove_balloon(struct xe_ggtt_node *node)
drm_mm_remove_node(&node->base);
}
+/**
+ * xe_ggtt_shift_mm_nodes - Shift GGTT nodes to adjust for a change in usable address range.
</pre>
</blockquote>
<pre wrap="" class="moz-quote-pre">drop "mm" from the function name as it is implementation detail
xe_ggtt_shift_nodes()</pre>
</blockquote>
<p>The fact that nodes are used for Memory Management is an
implementation detail?</p>
<p>I will remove `mm` from the name, because no other function uses
that shortcut. But let's stick</p>
<p>to real arguments, ie. that no other function dealing with nodes
has the _mm_ in name.<br>
</p>
<blockquote type="cite" cite="mid:69797928-bf34-4f1f-b38f-dd68427d9fea@intel.com">
<blockquote type="cite">
<pre wrap="" class="moz-quote-pre">+ * @ggtt: the &xe_ggtt struct instance
+ * @shift: change to the location of area provisioned for current VF
</pre>
</blockquote>
<pre wrap="" class="moz-quote-pre">this function is quite generic and while it is used by the VF only, the
parameter doesn't have to be described as such</pre>
</blockquote>
I just changed the subject line because (from comment above) the
patch is VF oriented.<br>
<blockquote type="cite" cite="mid:69797928-bf34-4f1f-b38f-dd68427d9fea@intel.com">
<pre wrap="" class="moz-quote-pre">you may add here, in the 'longer description' part of the kernel-doc,
after explaining how it works, that function is mostly used by the VF,
and why we believe it will succeed</pre>
</blockquote>
<p>Not "mostly" but "only". There is no other scenario where we
could anticipate this function to be used.</p>
<p>While universalization is generally a good direction, pretending
that this code will ever be used for anything else than VF is not
a direction with any chance for future benefits.</p>
<p>For the comment about errors - will add.<br>
</p>
<blockquote type="cite" cite="mid:69797928-bf34-4f1f-b38f-dd68427d9fea@intel.com">
<blockquote type="cite">
<pre wrap="" class="moz-quote-pre">+ */
+void xe_ggtt_shift_mm_nodes(struct xe_ggtt *ggtt, s64 shift)
+{
+ struct drm_mm_node *node, *tmpn;
+ LIST_HEAD(temp_list_head);
+ int err;
+
+ lockdep_assert_held(&ggtt->lock);
+
+ /*
+ * Move nodes, from range previously assigned to this VF, to a temp list.
</pre>
</blockquote>
<pre wrap="" class="moz-quote-pre">nit: no need for multi-line comment style
also, maybe it could be moved to the 'longer description' part of the
function kernel-doc, after sanitize it from the 'VF' specific wording
</pre>
<blockquote type="cite">
<pre wrap="" class="moz-quote-pre">+ */
+ drm_mm_for_each_node_safe(node, tmpn, &ggtt->mm) {
+ drm_mm_remove_node(node);
+ list_add(&node->node_list, &temp_list_head);
+ }
+
+ /*
+ * Now the GGTT VM contains no nodes. We can re-add all VF nodes with
+ * shifted offsets.
+ */
</pre>
</blockquote>
<pre wrap="" class="moz-quote-pre">also consider to move this comment to function kernel-doc</pre>
</blockquote>
<p>Not sure why we want a detail on implementation in the function
description.</p>
<p>Also not sure why we want detailed description block in this
function, when we already have that in the caller.<br>
</p>
<p>But I don't mind, will move.</p>
<blockquote type="cite" cite="mid:69797928-bf34-4f1f-b38f-dd68427d9fea@intel.com">
<blockquote type="cite">
<pre wrap="" class="moz-quote-pre">+ list_for_each_entry_safe(node, tmpn, &temp_list_head, node_list) {
+ list_del(&node->node_list);
+ node->start += shift;
+ err = drm_mm_reserve_node(&ggtt->mm, node);
+ xe_tile_assert(ggtt->tile, !err);
</pre>
</blockquote>
<pre wrap="" class="moz-quote-pre">while we believe it should be always possible to 'shift' all nodes, as
we just released our balloons, I'm not sure that this assert here alone
is sufficient
maybe before starting any movements, check that 'shift' is valid, and
add asserts for each node that shifted location is within GGTT space?</pre>
</blockquote>
<p>Before starting? How exactly do we check the shift before we have
any nodes? Without nodes, we have nothing to compare to.<br>
</p>
<p>I assume that what you're going for, is just to add more asserts
on each node. Will do.<br>
</p>
<blockquote type="cite" cite="mid:69797928-bf34-4f1f-b38f-dd68427d9fea@intel.com">
<blockquote type="cite">
<pre wrap="" class="moz-quote-pre">+ }
+}
+
/**
* xe_ggtt_node_insert_locked - Locked version to insert a &xe_ggtt_node into the GGTT
* @node: the &xe_ggtt_node to be inserted
diff --git a/drivers/gpu/drm/xe/xe_ggtt.h b/drivers/gpu/drm/xe/xe_ggtt.h
index 27e7d67de004..a07194cd3724 100644
--- a/drivers/gpu/drm/xe/xe_ggtt.h
+++ b/drivers/gpu/drm/xe/xe_ggtt.h
@@ -18,6 +18,7 @@ void xe_ggtt_node_fini(struct xe_ggtt_node *node);
int xe_ggtt_node_insert_balloon(struct xe_ggtt_node *node,
u64 start, u64 size);
void xe_ggtt_node_remove_balloon(struct xe_ggtt_node *node);
+void xe_ggtt_shift_mm_nodes(struct xe_ggtt *ggtt, s64 shift);
int xe_ggtt_node_insert(struct xe_ggtt_node *node, u32 size, u32 align);
int xe_ggtt_node_insert_locked(struct xe_ggtt_node *node,
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_vf.c b/drivers/gpu/drm/xe/xe_gt_sriov_vf.c
index 9edbe34f45f4..e9e7ddeb4254 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_vf.c
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_vf.c
@@ -415,6 +415,7 @@ static int vf_get_ggtt_info(struct xe_gt *gt)
xe_gt_sriov_dbg_verbose(gt, "GGTT %#llx-%#llx = %lluK\n",
start, start + size - 1, size / SZ_1K);
+ config->ggtt_shift = start - (s64)config->ggtt_base;
</pre>
</blockquote>
<pre wrap="" class="moz-quote-pre">on the first probe, shouldn't we store it as '0' ?</pre>
</blockquote>
<pre wrap="" class="moz-quote-pre">`ggtt_shift` is a part of <span style="white-space: normal">`xe_gt`, allocated in `xe_gt_alloc()` using `kzalloc` variant. It is always zero on probe, regardless if first or not.
But even regardless of that, the value of this parameter is valid only when within the post-migration recovery worker. It is not accessed outside, and noone cares about its value outside.
</span></pre>
<blockquote type="cite" cite="mid:69797928-bf34-4f1f-b38f-dd68427d9fea@intel.com">
<blockquote type="cite">
<pre wrap="" class="moz-quote-pre"> config->ggtt_base = start;
config->ggtt_size = size;
@@ -972,6 +973,88 @@ int xe_gt_sriov_vf_query_runtime(struct xe_gt *gt)
return err;
}
</pre>
</blockquote>
<pre wrap="" class="moz-quote-pre">hmm, are you sure that place between
xe_gt_sriov_vf_query_runtime
and
vf_runtime_reg_cmp
is the best place for this function?
maybe at least place it closer to
xe_gt_sriov_vf_migrated_event_handler
that is at least partially related,
but not sure if also not miss-located</pre>
</blockquote>
no problem, will place it there.<br>
<blockquote type="cite" cite="mid:69797928-bf34-4f1f-b38f-dd68427d9fea@intel.com">
<blockquote type="cite">
<pre wrap="" class="moz-quote-pre">
+/**
+ * xe_gt_sriov_vf_fixup_ggtt_nodes - Shift GGTT allocations to match assigned range.
+ * @gt: the &xe_gt struct instance
+ * Return: True if fixups are necessary
+ *
+ * Since Global GTT is not virtualized, each VF has an assigned range
+ * within the global space. This range might have changed during migration,
+ * which requires all memory addresses pointing to GGTT to be shifted.
</pre>
</blockquote>
<pre wrap="" class="moz-quote-pre">This 'longer description' shall be _after_ 'Return' description
see <a class="moz-txt-link-freetext" href="https://docs.kernel.org/doc-guide/kernel-doc.html#function-documentation">https://docs.kernel.org/doc-guide/kernel-doc.html#function-documentation</a></pre>
</blockquote>
I see from the link than you meant `Return:` at the end. But I
agreed to your comments below so the return vanishes completely.<br>
<blockquote type="cite" cite="mid:69797928-bf34-4f1f-b38f-dd68427d9fea@intel.com">
<blockquote type="cite">
<pre wrap="" class="moz-quote-pre">+ */
+bool xe_gt_sriov_vf_fixup_ggtt_nodes(struct xe_gt *gt)
+{
+ struct xe_gt_sriov_vf_selfconfig *config = >->sriov.vf.self_config;
</pre>
</blockquote>
<pre wrap="" class="moz-quote-pre">you should introduce xe_gt_sriov_vf_ggtt_shift() earlier than patch 4/4
and use it here (or better let the caller use it and pass ggtt_shift as
parameter)</pre>
</blockquote>
will move that function earlier.<br>
<blockquote type="cite" cite="mid:69797928-bf34-4f1f-b38f-dd68427d9fea@intel.com">
<blockquote type="cite">
<pre wrap="" class="moz-quote-pre">+ struct xe_tile *tile = gt_to_tile(gt);
+ struct xe_ggtt *ggtt = tile->mem.ggtt;
+ s64 ggtt_shift;
+
</pre>
</blockquote>
<pre wrap="" class="moz-quote-pre">assert that gt is not a media one</pre>
</blockquote>
ok<br>
<blockquote type="cite" cite="mid:69797928-bf34-4f1f-b38f-dd68427d9fea@intel.com">
<blockquote type="cite">
<pre wrap="" class="moz-quote-pre">+ mutex_lock(&ggtt->lock);
+ ggtt_shift = config->ggtt_shift;
+ /*
+ * Move nodes, including balloons, from range previously assigned to this VF,
+ * into newly provisioned area.
+ *
+ * The balloon nodes are there to eliminate unavailable ranges from use: one
+ * reserves the GGTT area below the range for current VF, and another one
+ * reserves area above.
+ *
+ * Below is a GGTT layout of example VF, with a certain address range assigned to
+ * said VF, and inaccessible areas above and below:
+ *
+ * 0 4GiB
+ * |<--------------------------- Total GGTT size ----------------------------->|
+ * WOPCM GUC_TOP
+ * |<-------------- Area mappable by xe_ggtt instance ---------------->|
+ *
+ * +---+---------------------------------+----------+----------------------+---+
+ * |\\\|/////////////////////////////////| VF mem |//////////////////////|\\\|
+ * +---+---------------------------------+----------+----------------------+---+
+ *
+ * Hardware enforced access rules before migration:
+ *
+ * |<------- inaccessible for VF ------->|<VF owned>|<-- inaccessible for VF ->|
+ *
+ * drm_mm nodes used for tracking allocations:
</pre>
</blockquote>
<pre wrap="" class="moz-quote-pre">use of drm_mm is implementation detail of the xe_ggtt and it is not
relevant here, just say 'GGTT nodes'</pre>
</blockquote>
<p>We are trying to hide implementation details. Not because these
details can vary/change - this will never happen.</p>
<p>We're just calling some things implementation details and hiding
them because it is doable.<br>
</p>
<p>We've talked about this before, so will disagree but commit.<br>
</p>
<blockquote type="cite" cite="mid:69797928-bf34-4f1f-b38f-dd68427d9fea@intel.com">
<blockquote type="cite">
<pre wrap="" class="moz-quote-pre">+ *
+ * |<----------- balloon ------------>|<- nodes->|<----- balloon ------>|
+ *
+ * After the migration, GGTT area assigned to the VF might have shifted, either
+ * to lower or to higher address. But we expect the total size and extra areas to
+ * be identical, as migration can only happen between matching platforms.
+ * Below is an example of GGTT layout of the VF after migration. Content of the
+ * GGTT for VF has been moved to a new area, and we receive its address from GuC:
+ *
+ * +---+----------------------+----------+---------------------------------+---+
+ * |\\\|//////////////////////| VF mem |/////////////////////////////////|\\\|
+ * +---+----------------------+----------+---------------------------------+---+
+ *
+ * Hardware enforced access rules after migration:
+ *
+ * |<- inaccessible for VF -->|<VF owned>|<------- inaccessible for VF ------->|
+ *
+ * So the VF has a new slice of GGTT assigned, and during migration process, the
+ * memory content was copied to that new area. But the drm_mm nodes within xe kmd
+ * are still tracking allocations using the old addresses. The nodes within VF
+ * owned area have to be shifted, and balloon nodes need to be resized to
+ * properly mask out areas not owned by the VF.
+ *
+ * Fixed drm_mm nodes used for tracking allocations:
+ *
+ * |<------ balloon ------>|<- nodes->|<----------- balloon ----------->|
+ *
+ * Due to use of GPU profiles, we do not expect the old and new GGTT ares to
+ * overlap; but our node shifting will fix addresses properly regardless.
+ */
</pre>
</blockquote>
<pre wrap="" class="moz-quote-pre">this inline comment is now much bigger than actual implementation
why not promote it to the full DOC: as then it could be included in the
master SRIOV documentation</pre>
</blockquote>
<p> will do.</p>
<p>..but for this case only. I will oppose to any tries of having
the architecture docs rewritten as code comments.</p>
<blockquote type="cite" cite="mid:69797928-bf34-4f1f-b38f-dd68427d9fea@intel.com">
<blockquote type="cite">
<pre wrap="" class="moz-quote-pre">+ if (ggtt_shift) {
+ xe_gt_sriov_vf_deballoon_ggtt(gt);
+ xe_ggtt_shift_mm_nodes(ggtt, ggtt_shift);
+ xe_gt_sriov_vf_balloon_ggtt(gt);
+ }
+ mutex_unlock(&ggtt->lock);
+ return ggtt_shift != 0;
+}
+
static int vf_runtime_reg_cmp(const void *a, const void *b)
{
const struct vf_runtime_reg *ra = a;
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_vf.h b/drivers/gpu/drm/xe/xe_gt_sriov_vf.h
index c87b0e9c7ebc..13c04e313aa6 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_vf.h
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_vf.h
@@ -20,6 +20,7 @@ int xe_gt_sriov_vf_query_runtime(struct xe_gt *gt);
int xe_gt_sriov_vf_prepare_ggtt(struct xe_gt *gt);
int xe_gt_sriov_vf_balloon_ggtt(struct xe_gt *gt);
void xe_gt_sriov_vf_deballoon_ggtt(struct xe_gt *gt);
+bool xe_gt_sriov_vf_fixup_ggtt_nodes(struct xe_gt *gt);
int xe_gt_sriov_vf_notify_resfix_done(struct xe_gt *gt);
void xe_gt_sriov_vf_migrated_event_handler(struct xe_gt *gt);
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_vf_types.h b/drivers/gpu/drm/xe/xe_gt_sriov_vf_types.h
index a57f13b5afcd..5ccbdf8d08b6 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_vf_types.h
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_vf_types.h
@@ -40,6 +40,8 @@ struct xe_gt_sriov_vf_selfconfig {
u64 ggtt_base;
/** @ggtt_size: assigned size of the GGTT region. */
u64 ggtt_size;
+ /** @ggtt_shift: difference in ggtt_base on last migration */
+ s64 ggtt_shift;
/** @lmem_size: assigned size of the LMEM. */
u64 lmem_size;
/** @num_ctxs: assigned number of GuC submission context IDs. */
diff --git a/drivers/gpu/drm/xe/xe_sriov_vf.c b/drivers/gpu/drm/xe/xe_sriov_vf.c
index c1275e64aa9c..2eb6b8d8a217 100644
--- a/drivers/gpu/drm/xe/xe_sriov_vf.c
+++ b/drivers/gpu/drm/xe/xe_sriov_vf.c
@@ -7,6 +7,7 @@
#include "xe_assert.h"
#include "xe_device.h"
+#include "xe_gt.h"
#include "xe_gt_sriov_printk.h"
#include "xe_gt_sriov_vf.h"
#include "xe_pm.h"
@@ -170,6 +171,20 @@ static bool vf_post_migration_imminent(struct xe_device *xe)
work_pending(&xe->sriov.vf.migration.worker);
}
+static bool vf_post_migration_fixup_ggtt_nodes(struct xe_device *xe)
+{
+ struct xe_tile *tile;
+ unsigned int id;
+ bool need_fixups = false;
</pre>
</blockquote>
<pre wrap="" class="moz-quote-pre">try to define vars in reverse-xmas-tree order
bool need_fixups = false;
struct xe_tile *tile;
unsigned int id;</pre>
</blockquote>
ok<br>
<blockquote type="cite" cite="mid:69797928-bf34-4f1f-b38f-dd68427d9fea@intel.com">
<blockquote type="cite">
<pre wrap="" class="moz-quote-pre">+
+ for_each_tile(tile, xe, id) {
+ struct xe_gt *gt = tile->primary_gt;
</pre>
</blockquote>
<pre wrap="" class="moz-quote-pre"> shift = xe_gt_sriov_vf_ggtt_shift(gt);
if (shift) {
need_fixups = true;
xe_gt_sriov_vf_fixup_ggtt_nodes(gt, shift);
}</pre>
</blockquote>
<p>So in this specific place we want to make the function look
longer and reveal a detail.</p>
<p>Makes no sense to me, but since you don't seem convinced by
previous review round - will do this.</p>
<p>-Tomasz<br>
</p>
<blockquote type="cite" cite="mid:69797928-bf34-4f1f-b38f-dd68427d9fea@intel.com">
<blockquote type="cite">
<pre wrap="" class="moz-quote-pre">+
+ need_fixups |= xe_gt_sriov_vf_fixup_ggtt_nodes(gt);
+ }
+ return need_fixups;
+}
+
/*
* Notify all GuCs about resource fixups apply finished.
*/
@@ -191,6 +206,7 @@ static void vf_post_migration_notify_resfix_done(struct xe_device *xe)
static void vf_post_migration_recovery(struct xe_device *xe)
{
+ bool need_fixups;
int err;
drm_dbg(&xe->drm, "migration recovery in progress\n");
@@ -201,6 +217,7 @@ static void vf_post_migration_recovery(struct xe_device *xe)
if (unlikely(err))
goto fail;
+ need_fixups = vf_post_migration_fixup_ggtt_nodes(xe);
/* FIXME: add the recovery steps */
vf_post_migration_notify_resfix_done(xe);
xe_pm_runtime_put(xe);
</pre>
</blockquote>
</blockquote>
</body>
</html>