[PATCH 1/1] drm/xe: serialize store_data and user_interrupt for ufence wait
fei.yang at intel.com
Tue Aug 12 18:28:46 UTC 2025
From: Fei Yang <fei.yang at intel.com>
Quoting the BSpec, MI_STORE_DATA_IMM "simply initiates the write operation with
command execution proceeding normally. Although the write operation is
guaranteed to complete eventually, there is no mechanism to synchronize
command execution with the completion (or even initiation) of these
operations."
The KMD currently emits MI_STORE_DATA_IMM and MI_USER_INTERRUPT back to back
to implement a user fence. According to the BSpec, however, the data write
is not guaranteed to have completed by the time the interrupt is triggered.
When the interrupt arrives first, xe_wait_user_fence_ioctl can still observe
the stale fence value and then waits until the full user-specified timeout is
reached before checking the fence value again. Significant performance
degradation has been observed in the IGT xe_exec_fault_mode test cases due to
this unnecessary wait. In the worst case, if the user sets the timeout to
MAX_INT32, the wait effectively becomes a hang until some unrelated program
triggers a user interrupt that wakes it up.
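Add a polling semaphore wait on each dword of the fence value right after the
data write, so that command execution does not reach MI_USER_INTERRUPT until
the written value is visible in memory.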
Signed-off-by: Fei Yang <fei.yang at intel.com>
---
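Note for reviewers: the dword layout appended after the store can be sketched
in standalone C as below. This is only an illustration, not the driver code.
All SKETCH_* macros and sketch_* functions are hypothetical stand-ins. The
header encoding (opcode in bits 28:23, length field equal to total dwords
minus 2) is an assumption about how __MI_INSTR() and XE_INSTR_NUM_DW() expand.
The poll bit (15) and the compare-op field (14:12, value 4 for SAD_EQ_SDD) are
taken from the definitions added in this patch. The sketch also assumes the
fence address is 8-byte aligned, so that addr and addr + 4 share the same
upper 32 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel macros (encoding is an assumption). */
#define SKETCH_MI_INSTR(op)		((uint32_t)(op) << 23)
#define SKETCH_NUM_DW(n)		((uint32_t)((n) - 2))
#define SKETCH_MI_SEMAPHORE_WAIT	(SKETCH_MI_INSTR(0x1c) | SKETCH_NUM_DW(5))
/* Bit positions below mirror the definitions added by the patch. */
#define SKETCH_MI_SEMAPHORE_POLL	(1u << 15)
#define SKETCH_MI_SEMAPHORE_SAD_EQ_SDD	(4u << 12)

static uint32_t sketch_lower_32(uint64_t v) { return (uint32_t)v; }
static uint32_t sketch_upper_32(uint64_t v) { return (uint32_t)(v >> 32); }

/*
 * Emit one 5-dword polling semaphore wait: the command streamer is expected
 * to stall until the 32-bit value at addr equals data.
 */
static int sketch_emit_sem_wait_eq(uint32_t *dw, int i, uint64_t addr,
				   uint32_t data)
{
	dw[i++] = SKETCH_MI_SEMAPHORE_WAIT |
		  SKETCH_MI_SEMAPHORE_POLL |
		  SKETCH_MI_SEMAPHORE_SAD_EQ_SDD;
	dw[i++] = data;			/* inline data to compare against */
	dw[i++] = sketch_lower_32(addr);
	dw[i++] = sketch_upper_32(addr);
	dw[i++] = 0;			/* fifth dword left at zero, as in the patch */
	return i;
}

/*
 * Wait for both halves of a 64-bit fence value, one semaphore wait per
 * dword: low half at addr, high half at addr + 4.
 */
static int sketch_emit_wait_u64(uint32_t *dw, int i, uint64_t addr,
				uint64_t value)
{
	assert((addr & 7) == 0);	/* sketch assumption: 8-byte aligned fence */
	i = sketch_emit_sem_wait_eq(dw, i, addr, sketch_lower_32(value));
	i = sketch_emit_sem_wait_eq(dw, i, addr + 4, sketch_upper_32(value));
	return i;
}
```

Calling sketch_emit_wait_u64(dw, 0, addr, value) on an empty buffer fills ten
dwords, matching the ten dw[i++] lines added to emit_store_imm_ppgtt_posted().
Two 32-bit waits are used because each wait carries a single data dword to
compare against.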
drivers/gpu/drm/xe/instructions/xe_mi_commands.h | 11 +++++++++++
drivers/gpu/drm/xe/xe_ring_ops.c | 14 ++++++++++++++
2 files changed, 25 insertions(+)
diff --git a/drivers/gpu/drm/xe/instructions/xe_mi_commands.h b/drivers/gpu/drm/xe/instructions/xe_mi_commands.h
index c47b290e0e9f..1c9e7b35c665 100644
--- a/drivers/gpu/drm/xe/instructions/xe_mi_commands.h
+++ b/drivers/gpu/drm/xe/instructions/xe_mi_commands.h
@@ -34,6 +34,17 @@
#define MI_FORCE_WAKEUP __MI_INSTR(0x1D)
#define MI_MATH(n) (__MI_INSTR(0x1A) | XE_INSTR_NUM_DW((n) + 1))
+#define MI_SEMAPHORE_WAIT (__MI_INSTR(0x1c) | XE_INSTR_NUM_DW(5))
+#define MI_SEMAPHORE_REGISTER_POLL REG_BIT(16)
+#define MI_SEMAPHORE_POLL REG_BIT(15)
+#define MI_SEMAPHORE_COMP_OP GENMASK(14, 12)
+#define MI_SEMAPHORE_SAD_GT_SDD REG_FIELD_PREP(MI_SEMAPHORE_COMP_OP, 0)
+#define MI_SEMAPHORE_SAD_GTE_SDD REG_FIELD_PREP(MI_SEMAPHORE_COMP_OP, 1)
+#define MI_SEMAPHORE_SAD_LT_SDD REG_FIELD_PREP(MI_SEMAPHORE_COMP_OP, 2)
+#define MI_SEMAPHORE_SAD_LTE_SDD REG_FIELD_PREP(MI_SEMAPHORE_COMP_OP, 3)
+#define MI_SEMAPHORE_SAD_EQ_SDD REG_FIELD_PREP(MI_SEMAPHORE_COMP_OP, 4)
+#define MI_SEMAPHORE_SAD_NEQ_SDD REG_FIELD_PREP(MI_SEMAPHORE_COMP_OP, 5)
+
#define MI_STORE_DATA_IMM __MI_INSTR(0x20)
#define MI_SDI_GGTT REG_BIT(22)
#define MI_SDI_LEN_DW GENMASK(9, 0)
diff --git a/drivers/gpu/drm/xe/xe_ring_ops.c b/drivers/gpu/drm/xe/xe_ring_ops.c
index 5f15360d14bf..189e764e3914 100644
--- a/drivers/gpu/drm/xe/xe_ring_ops.c
+++ b/drivers/gpu/drm/xe/xe_ring_ops.c
@@ -169,6 +169,20 @@ static int emit_store_imm_ppgtt_posted(u64 addr, u64 value,
dw[i++] = upper_32_bits(addr);
dw[i++] = lower_32_bits(value);
dw[i++] = upper_32_bits(value);
+ dw[i++] = MI_SEMAPHORE_WAIT |
+ MI_SEMAPHORE_POLL |
+ MI_SEMAPHORE_SAD_EQ_SDD;
+ dw[i++] = lower_32_bits(value);
+ dw[i++] = lower_32_bits(addr);
+ dw[i++] = upper_32_bits(addr);
+ dw[i++] = 0;
+ dw[i++] = MI_SEMAPHORE_WAIT |
+ MI_SEMAPHORE_POLL |
+ MI_SEMAPHORE_SAD_EQ_SDD;
+ dw[i++] = upper_32_bits(value);
+ dw[i++] = lower_32_bits(addr + 4);
+ dw[i++] = upper_32_bits(addr);
+ dw[i++] = 0;
return i;
}
--
2.43.0