[PATCH v3 6/7] drm/xe/vf: Rebase MEMIRQ structures for all contexts after migration
Lis, Tomasz
tomasz.lis at intel.com
Thu May 29 01:19:55 UTC 2025
On 28.05.2025 12:44, Michał Winiarski wrote:
> On Tue, May 20, 2025 at 01:19:24AM +0200, Tomasz Lis wrote:
>> All contexts require an update of state data, as the data includes
>> GGTT references to memirq-related buffers.
>>
>> Default contexts need these references updated as well, because they
>> are not refreshed when a new context is created from them.
>>
>> v2: Update addresses by xe_lrc_write_ctx_reg() rather than
>> set_memory_based_intr()
>> v3: Renamed parameter, reordered parameters in some functions
>>
>> Signed-off-by: Tomasz Lis <tomasz.lis at intel.com>
>> Cc: Michal Wajdeczko <michal.wajdeczko at intel.com>
>> ---
>> drivers/gpu/drm/xe/xe_exec_queue.c | 4 +++-
>> drivers/gpu/drm/xe/xe_lrc.c | 35 ++++++++++++++++++++++++++++++
>> drivers/gpu/drm/xe/xe_lrc.h | 2 ++
>> drivers/gpu/drm/xe/xe_sriov_vf.c | 13 ++++++++++-
>> 4 files changed, 52 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/xe/xe_exec_queue.c b/drivers/gpu/drm/xe/xe_exec_queue.c
>> index d696c8410a32..9c3e568400e0 100644
>> --- a/drivers/gpu/drm/xe/xe_exec_queue.c
>> +++ b/drivers/gpu/drm/xe/xe_exec_queue.c
>> @@ -1051,6 +1051,8 @@ void xe_exec_queue_contexts_hwsp_rebase(struct xe_exec_queue *q)
>> {
>> int i;
>>
>> - for (i = 0; i < q->width; ++i)
>> + for (i = 0; i < q->width; ++i) {
>> + xe_lrc_update_memirq_regs_with_address(q->lrc[i], q->hwe);
> What if we're not using memirq?
We currently do not support VF migration in the Xe driver on any platform
without memory-based IRQs.
But I will add an `xe_device_uses_memirq()` condition. MMIO IRQs should not
need any additional code within VF recovery (we are already re-enabling
IRQs during kickstart), so that check is the only thing required to
support them.
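
To illustrate the planned condition, here is a minimal user-space sketch, not
the actual driver code. All `mock_*` names are hypothetical stand-ins for the
driver types and helpers, and the way the queue reaches the device (the `xe`
member) is an assumption made only for this example.

```c
#include <assert.h>
#include <stdbool.h>

#define MOCK_MAX_WIDTH 4

/* Hypothetical stand-in for struct xe_device. */
struct mock_device {
	bool uses_memirq;	/* what xe_device_uses_memirq() would report */
};

/* Hypothetical stand-in for struct xe_lrc; counters replace real register writes. */
struct mock_lrc {
	unsigned int memirq_rebases;	/* times memirq pointers were rewritten */
	unsigned int hwctx_rebases;	/* times other GGTT references were rewritten */
};

/* Hypothetical stand-in for struct xe_exec_queue. */
struct mock_exec_queue {
	struct mock_device *xe;
	int width;
	struct mock_lrc *lrc[MOCK_MAX_WIDTH];
};

/* Stand-in for xe_lrc_update_memirq_regs_with_address(). */
static void mock_update_memirq_regs(struct mock_lrc *lrc)
{
	lrc->memirq_rebases++;
}

/* Stand-in for xe_lrc_update_hwctx_regs_with_address(). */
static void mock_update_hwctx_regs(struct mock_lrc *lrc)
{
	lrc->hwctx_rebases++;
}

/* Rebase every LRC of the queue; touch memirq state only when it is in use. */
static void mock_contexts_hwsp_rebase(struct mock_exec_queue *q)
{
	int i;

	assert(q->width <= MOCK_MAX_WIDTH);

	for (i = 0; i < q->width; ++i) {
		if (q->xe->uses_memirq)
			mock_update_memirq_regs(q->lrc[i]);
		mock_update_hwctx_regs(q->lrc[i]);
	}
}
```

With `uses_memirq` cleared, only the regular hwctx rebase runs for each LRC;
with it set, both updates run in the same order as in the patch.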
>
>> xe_lrc_update_hwctx_regs_with_address(q->lrc[i]);
>> + }
>> }
>> diff --git a/drivers/gpu/drm/xe/xe_lrc.c b/drivers/gpu/drm/xe/xe_lrc.c
>> index 525565480aef..959ac9c5d39a 100644
>> --- a/drivers/gpu/drm/xe/xe_lrc.c
>> +++ b/drivers/gpu/drm/xe/xe_lrc.c
>> @@ -898,6 +898,41 @@ static void *empty_lrc_data(struct xe_hw_engine *hwe)
>> return data;
>> }
>>
>> +/**
>> + * xe_default_lrc_update_memirq_regs_with_address - Re-compute GGTT references in default LRC
>> + * of given engine.
>> + * @hwe: the &xe_hw_engine struct instance
>> + */
>> +void xe_default_lrc_update_memirq_regs_with_address(struct xe_hw_engine *hwe)
>> +{
>> + struct xe_gt *gt = hwe->gt;
>> + u32 *regs;
>> +
>> + if (!gt->default_lrc[hwe->class])
>> + return;
>> +
>> + regs = gt->default_lrc[hwe->class] + LRC_PPHWSP_SIZE;
>> + set_memory_based_intr(regs, hwe);
> We're using set_memory_based_intr() for gt->default_lrc, and
> xe_lrc_update_memirq_regs_with_address() for q->lrc.
>
> Why do we need 2 different methods to do that?
Real context state may be in LMEM, while the default LRC is just a CPU-side
buffer.
In Xe, we use struct iosys_map to write to shared memory, and we have
wrappers over the kernel functions to handle writes to and reads from
instances of that struct.
Therefore, we cannot reuse `set_memory_based_intr()` for real LRCs.
Well, we could allocate a CPU-side buffer, copy the data to it, call
`set_memory_based_intr()` and then copy it back. This is actually what I
did originally, but it was problematic because it required `kalloc()`, so
it got changed in the previous round of review.
Going the opposite way, we could use
`struct iosys_map map = IOSYS_MAP_INIT_VADDR(regs);` and use
`xe_lrc_update_memirq_regs_with_address()` on both real and default LRCs.
But that would require giving up the reuse of `xe_lrc_write_ctx_reg()`,
and it wouldn't really eliminate any code, since `set_memory_based_intr()`
still has to stay.
So yes, we use two different methods, but only one is implemented in
this patch.
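
The difference between the two methods can be sketched in user space as
follows. This is only an illustration under stated assumptions: `mock_map` is
a simplified stand-in for struct iosys_map, the `MOCK_*` register slots are
made up and do not match the real `CTX_INT_*` indices, and the volatile store
stands in for whatever I/O accessor a driver would use.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical register slots; the real CTX_INT_* indices differ. */
enum {
	MOCK_INT_MASK_ENABLE_PTR = 1,
	MOCK_INT_STATUS_REPORT_PTR = 3,
	MOCK_INT_SRC_REPORT_PTR = 5,
	MOCK_NUM_REGS = 8,
};

/* Simplified stand-in for struct iosys_map. */
struct mock_map {
	void *vaddr;
	bool is_iomem;
};

/* Write one 32-bit value through the map, choosing the access by memory type. */
static void mock_map_wr32(struct mock_map *map, size_t offset, uint32_t val)
{
	uint8_t *dst = (uint8_t *)map->vaddr + offset;

	assert(offset % sizeof(uint32_t) == 0);

	if (map->is_iomem)
		*(volatile uint32_t *)dst = val;	/* a driver would use an I/O accessor */
	else
		memcpy(dst, &val, sizeof(val));
}

/* Method 1: plain pointer writes, valid only for a CPU-side buffer (default LRC). */
static void mock_set_memirq_direct(uint32_t *regs, const uint32_t ptrs[3])
{
	regs[MOCK_INT_MASK_ENABLE_PTR] = ptrs[0];
	regs[MOCK_INT_STATUS_REPORT_PTR] = ptrs[1];
	regs[MOCK_INT_SRC_REPORT_PTR] = ptrs[2];
}

/* Method 2: writes through the map wrapper, usable for system memory or LMEM. */
static void mock_set_memirq_mapped(struct mock_map *map, const uint32_t ptrs[3])
{
	mock_map_wr32(map, MOCK_INT_MASK_ENABLE_PTR * sizeof(uint32_t), ptrs[0]);
	mock_map_wr32(map, MOCK_INT_STATUS_REPORT_PTR * sizeof(uint32_t), ptrs[1]);
	mock_map_wr32(map, MOCK_INT_SRC_REPORT_PTR * sizeof(uint32_t), ptrs[2]);
}
```

Wrapping a CPU-side buffer in a map with `is_iomem` cleared and calling the
mapped variant yields the same buffer contents as the direct variant, which is
the "opposite way" described above.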
>> +}
>> +
>> +/**
>> + * xe_lrc_update_memirq_regs_with_address - Re-compute GGTT references in mem interrupt data
>> + * for given LRC.
>> + * @lrc: the &xe_lrc struct instance
>> + * @hwe: the &xe_hw_engine struct instance
>> + */
>> +void xe_lrc_update_memirq_regs_with_address(struct xe_lrc *lrc, struct xe_hw_engine *hwe)
>> +{
>> + struct xe_memirq *memirq = &gt_to_tile(hwe->gt)->memirq;
>> +
>> + xe_lrc_write_ctx_reg(lrc, CTX_INT_MASK_ENABLE_PTR,
>> + xe_memirq_enable_ptr(memirq));
>> + xe_lrc_write_ctx_reg(lrc, CTX_INT_STATUS_REPORT_PTR,
>> + xe_memirq_status_ptr(memirq, hwe));
>> + xe_lrc_write_ctx_reg(lrc, CTX_INT_SRC_REPORT_PTR,
>> + xe_memirq_source_ptr(memirq, hwe));
>> +}
>> +
>> static void xe_lrc_set_ppgtt(struct xe_lrc *lrc, struct xe_vm *vm)
>> {
>> u64 desc = xe_vm_pdp4_descriptor(vm, gt_to_tile(lrc->gt));
>> diff --git a/drivers/gpu/drm/xe/xe_lrc.h b/drivers/gpu/drm/xe/xe_lrc.h
>> index e7a99cfd0abe..801a6b943f6e 100644
>> --- a/drivers/gpu/drm/xe/xe_lrc.h
>> +++ b/drivers/gpu/drm/xe/xe_lrc.h
>> @@ -89,6 +89,8 @@ u32 xe_lrc_indirect_ring_ggtt_addr(struct xe_lrc *lrc);
>> u32 xe_lrc_ggtt_addr(struct xe_lrc *lrc);
>> u32 *xe_lrc_regs(struct xe_lrc *lrc);
>> void xe_lrc_update_hwctx_regs_with_address(struct xe_lrc *lrc);
>> +void xe_default_lrc_update_memirq_regs_with_address(struct xe_hw_engine *hwe);
>> +void xe_lrc_update_memirq_regs_with_address(struct xe_lrc *lrc, struct xe_hw_engine *hwe);
>>
>> u32 xe_lrc_read_ctx_reg(struct xe_lrc *lrc, int reg_nr);
>> void xe_lrc_write_ctx_reg(struct xe_lrc *lrc, int reg_nr, u32 val);
>> diff --git a/drivers/gpu/drm/xe/xe_sriov_vf.c b/drivers/gpu/drm/xe/xe_sriov_vf.c
>> index 0f0d1a97ae1d..0a9761b6ffb5 100644
>> --- a/drivers/gpu/drm/xe/xe_sriov_vf.c
>> +++ b/drivers/gpu/drm/xe/xe_sriov_vf.c
>> @@ -225,13 +225,24 @@ static int vf_post_migration_requery_guc(struct xe_device *xe)
>> return ret;
>> }
>>
>> +static void xe_gt_default_lrcs_hwsp_rebase(struct xe_gt *gt)
>> +{
>> + struct xe_hw_engine *hwe;
>> + enum xe_hw_engine_id id;
>> +
>> + for_each_hw_engine(hwe, gt, id)
>> + xe_default_lrc_update_memirq_regs_with_address(hwe);
>> +}
> Device-level functions live in xe_sriov_vf.c, GT-level functions live in
> xe_gt_sriov_vf.c
ack
-Tomasz
> Thanks,
> -Michał
>
>> +
>> static void vf_post_migration_fixup_contexts(struct xe_device *xe)
>> {
>> struct xe_gt *gt;
>> unsigned int id;
>>
>> - for_each_gt(gt, xe, id)
>> + for_each_gt(gt, xe, id) {
>> + xe_gt_default_lrcs_hwsp_rebase(gt);
>> xe_guc_contexts_hwsp_rebase(&gt->uc.guc);
>> + }
>> }
>>
>> static void vf_post_migration_fixup_ctb(struct xe_device *xe)
>> --
>> 2.25.1
>>