[Intel-xe] [PATCH 1/6] drm/xe: Make MI_FLUSH_DW immediate size more explicit
Lucas De Marchi
lucas.demarchi at intel.com
Thu Oct 12 19:49:58 UTC 2023
On Wed, Oct 11, 2023 at 04:09:59PM -0700, Matt Roper wrote:
>Despite its name, MI_FLUSH_DW instruction can write an immediate value
>of either dword size or qword size, depending on the 'length' field of
>the instruction. Since "length" excludes the first two dwords of the
>instruction, a value of 2 in the length field implies a dword write and
>a value of 3 implies a qword write. Even in cases where the flush
>instruction's post-sync operation is set to "no write" we're still
>expected to size the overall instruction as if we were doing a dword or
>qword write (i.e., a length of 1 shouldn't be used on modern platforms).
>
>Rather than baking a size of "1" into the #define and then adding
>another unexplained "+ 1" at all the spots where the definition gets
>used, lets just create MI_FLUSH_IMM_DW and MI_FLUSH_IMM_QW definitions
>that should be OR'd into the instruction header to make it more explicit
>what behavior we're requesting.
>
>Bspec: 60229
>Signed-off-by: Matt Roper <matthew.d.roper at intel.com>
>---
> drivers/gpu/drm/xe/regs/xe_gpu_commands.h | 5 ++++-
> drivers/gpu/drm/xe/xe_ring_ops.c | 10 +++++-----
> 2 files changed, 9 insertions(+), 6 deletions(-)
>
>diff --git a/drivers/gpu/drm/xe/regs/xe_gpu_commands.h b/drivers/gpu/drm/xe/regs/xe_gpu_commands.h
>index 21738281bdd0..9432a960346b 100644
>--- a/drivers/gpu/drm/xe/regs/xe_gpu_commands.h
>+++ b/drivers/gpu/drm/xe/regs/xe_gpu_commands.h
>@@ -30,12 +30,15 @@
> #define MI_LRI_MMIO_REMAP_EN REG_BIT(17)
> #define MI_LRI_FORCE_POSTED (1<<12)
>
>-#define MI_FLUSH_DW MI_INSTR(0x26, 1)
>+#define MI_FLUSH_DW MI_INSTR(0x26, 0)
> #define MI_FLUSH_DW_STORE_INDEX (1<<21)
> #define MI_INVALIDATE_TLB (1<<18)
> #define MI_FLUSH_DW_CCS (1<<16)
> #define MI_FLUSH_DW_OP_STOREDW (1<<14)
> #define MI_FLUSH_DW_USE_GTT (1<<2)
>+#define MI_FLUSH_LENGTH GENMASK(5, 0)
>+#define MI_FLUSH_IMM_DW REG_FIELD_PREP(MI_FLUSH_LENGTH, 2)
>+#define MI_FLUSH_IMM_QW REG_FIELD_PREP(MI_FLUSH_LENGTH, 3)
>
> #define MI_BATCH_BUFFER_START MI_INSTR(0x31, 1)
>
>diff --git a/drivers/gpu/drm/xe/xe_ring_ops.c b/drivers/gpu/drm/xe/xe_ring_ops.c
>index 6eec7c7e4bc5..b95cc7713ff9 100644
>--- a/drivers/gpu/drm/xe/xe_ring_ops.c
>+++ b/drivers/gpu/drm/xe/xe_ring_ops.c
>@@ -80,7 +80,7 @@ static int emit_store_imm_ggtt(u32 addr, u32 value, u32 *dw, int i)
> static int emit_flush_imm_ggtt(u32 addr, u32 value, bool invalidate_tlb,
> u32 *dw, int i)
> {
>- dw[i++] = (MI_FLUSH_DW + 1) | MI_FLUSH_DW_OP_STOREDW |
>+ dw[i++] = MI_FLUSH_DW | MI_FLUSH_DW_OP_STOREDW | MI_FLUSH_IMM_DW |
> (invalidate_tlb ? MI_INVALIDATE_TLB : 0);
> dw[i++] = addr | MI_FLUSH_DW_USE_GTT;
> dw[i++] = 0;
>@@ -100,9 +100,9 @@ static int emit_bb_start(u64 batch_addr, u32 ppgtt_flag, u32 *dw, int i)
>
> static int emit_flush_invalidate(u32 flag, u32 *dw, int i)
> {
>- dw[i] = MI_FLUSH_DW + 1;
>+ dw[i] = MI_FLUSH_DW;
> dw[i] |= flag;
>- dw[i++] |= MI_INVALIDATE_TLB | MI_FLUSH_DW_OP_STOREDW |
>+ dw[i++] |= MI_INVALIDATE_TLB | MI_FLUSH_DW_OP_STOREDW | MI_FLUSH_IMM_DW |
I don't understand the logic/convention you used to add the MI_FLUSH_IMM_DW here
rather than a couple lines above or as the last item to account for the
bit position.
Anyway, it seems correct.
Reviewed-by: Lucas De Marchi <lucas.demarchi at intel.com>
Lucas De Marchi
> MI_FLUSH_DW_STORE_INDEX;
>
> dw[i++] = LRC_PPHWSP_SCRATCH_ADDR | MI_FLUSH_DW_USE_GTT;
>@@ -365,8 +365,8 @@ static void emit_migration_job_gen12(struct xe_sched_job *job,
>
> i = emit_bb_start(job->batch_addr[1], BIT(8), dw, i);
>
>- dw[i++] = (MI_FLUSH_DW | MI_INVALIDATE_TLB | job->migrate_flush_flags |
>- MI_FLUSH_DW_OP_STOREDW) + 1;
>+ dw[i++] = MI_FLUSH_DW | MI_INVALIDATE_TLB | job->migrate_flush_flags |
>+ MI_FLUSH_DW_OP_STOREDW | MI_FLUSH_IMM_DW;
> dw[i++] = xe_lrc_seqno_ggtt_addr(lrc) | MI_FLUSH_DW_USE_GTT;
> dw[i++] = 0;
> dw[i++] = seqno; /* value */
>--
>2.41.0
>
More information about the Intel-xe
mailing list