[PATCH V2] drm/xe/xe2: Add Wa_15015404425
Matt Roper
matthew.d.roper at intel.com
Wed Jul 3 23:23:15 UTC 2024
On Wed, Jul 03, 2024 at 12:16:26PM +0530, Tejas Upadhyay wrote:
> Wa_15015404425 applies to xe2 LNL all steppings
Since this isn't a typical "write XX to register YY" workaround, we
should have a brief explanation of what the workaround is asking us to
do here for context. E.g., something along the lines of
"Wa_15015404425 asks us to perform four "dummy" writes to a
non-existent register offset before every real register read.
Although the specific offset of the writes doesn't directly
matter, the workaround suggests offset 0x130030 as a good target
so that these writes will be easy to recognize and filter out in
debugging traces."
>
> V2:
> - Add WA to 8/16/32bit reads also - MattR
> - Corrected dummy reg address - MattR
> - Use for loop to avoid mental pause - JaniN
>
> Signed-off-by: Tejas Upadhyay <tejas.upadhyay at intel.com>
> ---
> drivers/gpu/drm/xe/regs/xe_regs.h | 1 +
> drivers/gpu/drm/xe/xe_mmio.c | 22 ++++++++++++++++++++++
> 2 files changed, 23 insertions(+)
>
> diff --git a/drivers/gpu/drm/xe/regs/xe_regs.h b/drivers/gpu/drm/xe/regs/xe_regs.h
> index 23e33ec84902..56a1ba59cbd3 100644
> --- a/drivers/gpu/drm/xe/regs/xe_regs.h
> +++ b/drivers/gpu/drm/xe/regs/xe_regs.h
> @@ -45,6 +45,7 @@
> #define STOLEN_RESERVED XE_REG(0x1082c0)
> #define WOPCM_SIZE_MASK REG_GENMASK64(9, 7)
>
> +#define MEDIA_DUMMY_REG XE_REG(0x130030)
Since this workaround wound up not actually being media-related at all,
does the "MEDIA_" prefix here make sense? This register isn't actually
in the media IP (according to bspec 60110 it's just in the general
sgunit register range). Honestly, since there's no real register at
this offset in the hardware, this might be a case where we don't even
want/need a definition in xe_regs.h (see below).
> #define MTL_RP_STATE_CAP XE_REG(0x138000)
>
> #define MTL_GT_RPE_FREQUENCY XE_REG(0x13800c)
> diff --git a/drivers/gpu/drm/xe/xe_mmio.c b/drivers/gpu/drm/xe/xe_mmio.c
> index f92faad4b96d..51022a560086 100644
> --- a/drivers/gpu/drm/xe/xe_mmio.c
> +++ b/drivers/gpu/drm/xe/xe_mmio.c
> @@ -121,12 +121,28 @@ int xe_mmio_init(struct xe_device *xe)
> return devm_add_action_or_reset(xe->drm.dev, mmio_fini, xe);
> }
>
> +static void mmio_flush_pending_writes(struct xe_gt *gt)
> +{
> + struct xe_device *xe = gt_to_xe(gt);
> + int i;
> +
> + if (!(xe->info.platform == XE_LUNARLAKE))
> + return;
This works for now, although it may cause issues down the road if we
wind up with refresh/derivative platforms that don't need this
workaround. I think at some point soon we're going to need to separate
out SoC logic into its own dedicated sub-driver (similar to what's
happening with display), and bake a dedicated workaround framework into
that SoC sub-driver to handle cases like this. Although we'll also need
to work with our internal hardware people to improve how workarounds
like this get documented so that it's more obvious up front which ones
are tied to the graphics/media/display IP, and which ones are tied to
the SoC revisions that the IP is integrated into.
> +
> + /* 4 dummy writes */
> + for (i = 0; i < 4; i++)
> + xe_mmio_write32(gt, MEDIA_DUMMY_REG, 0);
This might be a spot where it would actually make sense to do some
direct writel()'s rather than using xe_mmio_write32. That would bypass
the trace_xe_reg_rw(); I'm making the assumption that most people doing
MMIO tracing wouldn't be interested in having these constant dummy
writes cluttering up the trace output (even though the known offset does
make them somewhat easier to filter out). If we do that, we could move
the offset #define down to this file rather than defining a true XE_REG()
in the register header.
Matt
> +}
> +
> u8 xe_mmio_read8(struct xe_gt *gt, struct xe_reg reg)
> {
> struct xe_tile *tile = gt_to_tile(gt);
> u32 addr = xe_mmio_adjusted_addr(gt, reg.addr);
> u8 val;
>
> + /* Wa_15015404425 */
> + mmio_flush_pending_writes(gt);
> +
> val = readb((reg.ext ? tile->mmio_ext.regs : tile->mmio.regs) + addr);
> trace_xe_reg_rw(gt, false, addr, val, sizeof(val));
>
> @@ -139,6 +155,9 @@ u16 xe_mmio_read16(struct xe_gt *gt, struct xe_reg reg)
> u32 addr = xe_mmio_adjusted_addr(gt, reg.addr);
> u16 val;
>
> + /* Wa_15015404425 */
> + mmio_flush_pending_writes(gt);
> +
> val = readw((reg.ext ? tile->mmio_ext.regs : tile->mmio.regs) + addr);
> trace_xe_reg_rw(gt, false, addr, val, sizeof(val));
>
> @@ -160,6 +179,9 @@ u32 xe_mmio_read32(struct xe_gt *gt, struct xe_reg reg)
> u32 addr = xe_mmio_adjusted_addr(gt, reg.addr);
> u32 val;
>
> + /* Wa_15015404425 */
> + mmio_flush_pending_writes(gt);
> +
> if (!reg.vf && IS_SRIOV_VF(gt_to_xe(gt)))
> val = xe_gt_sriov_vf_read32(gt, reg);
> else
> --
> 2.25.1
>
--
Matt Roper
Graphics Software Engineer
Linux GPU Platform Enablement
Intel Corporation
More information about the Intel-xe
mailing list