[PATCH v6 6/9] drm/xe/vf: Rebase MEMIRQ structures for all contexts after migration
Michał Winiarski
michal.winiarski at intel.com
Wed Jul 16 18:32:51 UTC 2025
On Fri, Jul 04, 2025 at 11:02:25PM +0200, Tomasz Lis wrote:
> All contexts require an update of state data, as the data includes
> GGTT references to memirq-related buffers.
>
> Default contexts need these references updated as well, because they
> are not refreshed when a new context is created from them.
>
> The way we write to vram requires scratch buffer to be used
> before the whole block is memcopied. Since using kalloc() within
> specific recovery functions would lead to unintended relations
> between locks, we are allocating the buffer earlier, before
> any locks are taken. The same buffer will be used for other steps
> of the recovery.
>
> v2: Update addresses by xe_lrc_write_ctx_reg() rather than
> set_memory_based_intr()
> v3: Renamed parameter, reordered parameters in some functs
> v4: Check if have MEMIRQ, move `xe_gt*` funct to proper file
> v5: Revert back to requiring scratch buffer, but allocate it
> earlier this time
>
> Signed-off-by: Tomasz Lis <tomasz.lis at intel.com>
> Cc: Michal Wajdeczko <michal.wajdeczko at intel.com>
> Cc: Michal Winiarski <michal.winiarski at intel.com>
> Acked-by: Satyanarayana K V P <satyanarayana.k.v.p at intel.com> (v3)
> ---
> drivers/gpu/drm/xe/xe_exec_queue.c | 7 ++--
> drivers/gpu/drm/xe/xe_exec_queue.h | 2 +-
> drivers/gpu/drm/xe/xe_gt_sriov_vf.c | 14 ++++++++
> drivers/gpu/drm/xe/xe_gt_sriov_vf.h | 1 +
> drivers/gpu/drm/xe/xe_guc_submit.c | 5 +--
> drivers/gpu/drm/xe/xe_guc_submit.h | 2 +-
> drivers/gpu/drm/xe/xe_lrc.c | 52 +++++++++++++++++++++++++++--
> drivers/gpu/drm/xe/xe_lrc.h | 4 +++
> drivers/gpu/drm/xe/xe_sriov_vf.c | 20 +++++++++--
> 9 files changed, 97 insertions(+), 10 deletions(-)
>
[...]
> diff --git a/drivers/gpu/drm/xe/xe_sriov_vf.c b/drivers/gpu/drm/xe/xe_sriov_vf.c
> index 54a6218e48b0..43ac73e432d4 100644
> --- a/drivers/gpu/drm/xe/xe_sriov_vf.c
> +++ b/drivers/gpu/drm/xe/xe_sriov_vf.c
> @@ -13,6 +13,7 @@
> #include "xe_guc_ct.h"
> #include "xe_guc_submit.h"
> #include "xe_irq.h"
> +#include "xe_lrc.h"
> #include "xe_pm.h"
> #include "xe_sriov.h"
> #include "xe_sriov_printk.h"
> @@ -244,6 +245,11 @@ static int vf_get_next_migrated_gt_id(struct xe_device *xe)
> return -1;
> }
>
> +static size_t post_migration_scratch_size(struct xe_device *xe)
> +{
> + return xe_lrc_reg_size(xe);
> +}
> +
> /**
> * Perform post-migration fixups on a single GT.
> *
> @@ -260,19 +266,29 @@ static int vf_get_next_migrated_gt_id(struct xe_device *xe)
> static int gt_vf_post_migration_fixups(struct xe_gt *gt)
> {
> s64 shift;
> + void *buf;
> int err;
>
> + buf = kmalloc(post_migration_scratch_size(gt_to_xe(gt)), GFP_KERNEL);
> + if (!buf)
> + return -ENOMEM;
> +
I would suggest allocating this temporary buffer at the place where it's
actually needed, rather than dragging this top-level allocation through
the entire call-stack. Yes, when done this way we can reuse it, no, the
reuse doesn't really give us anything, as the buffer is small, and
kmalloc is relatively fast.
Other than that - LGTM.
Reviewed-by: Michał Winiarski <michal.winiarski at intel.com>
Thanks,
-Michał
> err = xe_gt_sriov_vf_query_config(gt);
> - if (err)
> + if (err) {
> + kfree(buf);
> return err;
> + }
>
> shift = xe_gt_sriov_vf_ggtt_shift(gt);
> if (shift) {
> xe_tile_sriov_vf_fixup_ggtt_nodes(gt_to_tile(gt), shift);
> - xe_guc_contexts_hwsp_rebase(>->uc.guc);
> + xe_gt_sriov_vf_default_lrcs_hwsp_rebase(gt);
> + xe_guc_contexts_hwsp_rebase(>->uc.guc, buf);
> /* FIXME: add the recovery steps */
> xe_guc_ct_fixup_messages_with_ggtt(>->uc.guc.ct, shift);
> }
> +
> + kfree(buf);
> return 0;
> }
>
> --
> 2.25.1
>
More information about the Intel-xe
mailing list