[PATCH 1/1] drm/xe: serialize store_data and user_interrupt for ufence wait
Matthew Brost
matthew.brost at intel.com
Tue Aug 12 20:07:28 UTC 2025
On Tue, Aug 12, 2025 at 11:28:46AM -0700, fei.yang at intel.com wrote:
> From: Fei Yang <fei.yang at intel.com>
>
> Quote BSpec, MI_STORE_DATA_IMM "simply initiates the write operation with
> command execution proceeding normally. Although the write operation is
> guaranteed to complete eventually, there is no mechanism to synchronize
> command execution with the completion (or even initiation) of these
> operations."
Can we not just use update emit_store_imm_ppgtt_posted to use
MI_FLUSH_IMM + a PPGTT address? I think we can get rid of prior flush
added in 3ad7d18c5dad7?
According to bspec 45725 for MI_FLUSH_IMM:
'Usage note: After this command is completed with a Store DWord enabled,
CPU access to graphics memory willbe coherent (assuming the Render Cache
flush is not inhibited).'
Matt
> The KMD currently emit MI_STORE_DATA_IMM and MI_USER_INTERRUPT consecutively
> to implement user fence. However, according to the BSpec, the data write
> operation is not guaranteed to be completed when triggering the interrupt,
> that would cause the xe_wait_user_fence_ioctl to wait until the full user
> specified timeout is reached before checking the fence value again. Great
> performance degradation has been observed in IGT xe_exec_fault_mode test
> cases due to this unnecessary wait. The worst case is that if user set the
> timeout to MAX_INT32, the wait could end up being a hang until some other
> random program triggers a user interrupt to wake it up.
> A semaphore wait is added right after the data write to avoid the unexpected
> wait.
>
> Signed-off-by: Fei Yang <fei.yang at intel.com>
> ---
> drivers/gpu/drm/xe/instructions/xe_mi_commands.h | 11 +++++++++++
> drivers/gpu/drm/xe/xe_ring_ops.c | 14 ++++++++++++++
> 2 files changed, 25 insertions(+)
>
> diff --git a/drivers/gpu/drm/xe/instructions/xe_mi_commands.h b/drivers/gpu/drm/xe/instructions/xe_mi_commands.h
> index c47b290e0e9f..1c9e7b35c665 100644
> --- a/drivers/gpu/drm/xe/instructions/xe_mi_commands.h
> +++ b/drivers/gpu/drm/xe/instructions/xe_mi_commands.h
> @@ -34,6 +34,17 @@
> #define MI_FORCE_WAKEUP __MI_INSTR(0x1D)
> #define MI_MATH(n) (__MI_INSTR(0x1A) | XE_INSTR_NUM_DW((n) + 1))
>
> +#define MI_SEMAPHORE_WAIT (__MI_INSTR(0x1c) | XE_INSTR_NUM_DW(5))
> +#define MI_SEMAPHORE_REGISTER_POLL REG_BIT(16)
> +#define MI_SEMAPHORE_POLL REG_BIT(15)
> +#define MI_SEMAPHORE_COMP_OP GENMASK(14, 12)
> +#define MI_SEMAPHORE_SAD_GT_SDD REG_FIELD_PREP(MI_SEMAPHORE_COMP_OP, 0)
> +#define MI_SEMAPHORE_SAD_GTE_SDD REG_FIELD_PREP(MI_SEMAPHORE_COMP_OP, 1)
> +#define MI_SEMAPHORE_SAD_LT_SDD REG_FIELD_PREP(MI_SEMAPHORE_COMP_OP, 2)
> +#define MI_SEMAPHORE_SAD_LTE_SDD REG_FIELD_PREP(MI_SEMAPHORE_COMP_OP, 3)
> +#define MI_SEMAPHORE_SAD_EQ_SDD REG_FIELD_PREP(MI_SEMAPHORE_COMP_OP, 4)
> +#define MI_SEMAPHORE_SAD_NEQ_SDD REG_FIELD_PREP(MI_SEMAPHORE_COMP_OP, 5)
> +
> #define MI_STORE_DATA_IMM __MI_INSTR(0x20)
> #define MI_SDI_GGTT REG_BIT(22)
> #define MI_SDI_LEN_DW GENMASK(9, 0)
> diff --git a/drivers/gpu/drm/xe/xe_ring_ops.c b/drivers/gpu/drm/xe/xe_ring_ops.c
> index 5f15360d14bf..189e764e3914 100644
> --- a/drivers/gpu/drm/xe/xe_ring_ops.c
> +++ b/drivers/gpu/drm/xe/xe_ring_ops.c
> @@ -169,6 +169,20 @@ static int emit_store_imm_ppgtt_posted(u64 addr, u64 value,
> dw[i++] = upper_32_bits(addr);
> dw[i++] = lower_32_bits(value);
> dw[i++] = upper_32_bits(value);
> + dw[i++] = MI_SEMAPHORE_WAIT |
> + MI_SEMAPHORE_POLL |
> + MI_SEMAPHORE_SAD_EQ_SDD;
> + dw[i++] = lower_32_bits(value);
> + dw[i++] = lower_32_bits(addr);
> + dw[i++] = upper_32_bits(addr);
> + dw[i++] = 0;
> + dw[i++] = MI_SEMAPHORE_WAIT |
> + MI_SEMAPHORE_POLL |
> + MI_SEMAPHORE_SAD_EQ_SDD;
> + dw[i++] = upper_32_bits(value);
> + dw[i++] = lower_32_bits(addr + 4);
> + dw[i++] = upper_32_bits(addr);
> + dw[i++] = 0;
>
> return i;
> }
> --
> 2.43.0
>
More information about the Intel-xe
mailing list