[PATCH V2] drm/xe/xe2: Add Wa_15015404425
Lucas De Marchi
lucas.demarchi at intel.com
Tue Jul 9 17:46:35 UTC 2024
On Wed, Jul 03, 2024 at 04:23:15PM GMT, Matt Roper wrote:
>On Wed, Jul 03, 2024 at 12:16:26PM +0530, Tejas Upadhyay wrote:
>> Wa_15015404425 applies to xe2 LNL all steppings
>
>Since this isn't a typical "write XX to register YY" workaround, we
>should have a brief explanation of what the workaround is asking us to
>do here for context. E.g., something along the lines of
>
> "Wa_15015404425 asks us to perform four "dummy" writes to a
> non-existent register offset before every real register read.
> Although the specific offset of the writes doesn't directly
> matter, the workaround suggests offset 0x130030 as a good target
> so that these writes will be easy to recognize and filter out in
> debugging traces."
>
>>
>> V2:
>> - Add WA to 8/16/32bit reads also - MattR
>> - Corrected dummy reg address - MattR
>> - Use for loop to avoid mental pause - JaniN
>>
>> Signed-off-by: Tejas Upadhyay <tejas.upadhyay at intel.com>
>> ---
>> drivers/gpu/drm/xe/regs/xe_regs.h | 1 +
>> drivers/gpu/drm/xe/xe_mmio.c | 22 ++++++++++++++++++++++
>> 2 files changed, 23 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/xe/regs/xe_regs.h b/drivers/gpu/drm/xe/regs/xe_regs.h
>> index 23e33ec84902..56a1ba59cbd3 100644
>> --- a/drivers/gpu/drm/xe/regs/xe_regs.h
>> +++ b/drivers/gpu/drm/xe/regs/xe_regs.h
>> @@ -45,6 +45,7 @@
>> #define STOLEN_RESERVED XE_REG(0x1082c0)
>> #define WOPCM_SIZE_MASK REG_GENMASK64(9, 7)
>>
>> +#define MEDIA_DUMMY_REG XE_REG(0x130030)
>
>Since this workaround wound up not actually being media-related at all,
>does the "MEDIA_" prefix here make sense? This register isn't actually
>in the media IP (according to bspec 60110 it's just in the general
>sgunit register range). Honestly, since there's no real register at
>this offset in the hardware, this might be a case where we don't even
>want/need a definition in xe_regs.h (see below).
>
>> #define MTL_RP_STATE_CAP XE_REG(0x138000)
>>
>> #define MTL_GT_RPE_FREQUENCY XE_REG(0x13800c)
>> diff --git a/drivers/gpu/drm/xe/xe_mmio.c b/drivers/gpu/drm/xe/xe_mmio.c
>> index f92faad4b96d..51022a560086 100644
>> --- a/drivers/gpu/drm/xe/xe_mmio.c
>> +++ b/drivers/gpu/drm/xe/xe_mmio.c
>> @@ -121,12 +121,28 @@ int xe_mmio_init(struct xe_device *xe)
>> return devm_add_action_or_reset(xe->drm.dev, mmio_fini, xe);
>> }
>>
>> +static void mmio_flush_pending_writes(struct xe_gt *gt)
>> +{
>> + struct xe_device *xe = gt_to_xe(gt);
>> + int i;
>> +
>> + if (!(xe->info.platform == XE_LUNARLAKE))
>> + return;
>
>This works for now, although it may cause issues down the road if we
>wind up with refresh/derivative platforms that don't need this
>workaround. I think at some point soon we're going to need to separate
>out SoC logic into its own dedicated sub-driver (similar to what's
>happening with display), and bake a dedicated workaround framework into
>that SoC sub-driver to handle cases like this. Although we'll also need
>to work with our internal hardware people to improve how workarounds
>like this get documented so that it's more obvious up front which ones
>are tied to the graphics/media/display IP, and which ones are tied to
>the SoC revisions that the IP is integrated into.
regardless, why is this not using XE_WA() with the rule in
xe_wa_oob.rules being XE_LUNARLAKE?
>
>> +
>> + /* 4 dummy writes */
>> + for (i = 0; i < 4; i++)
>> + xe_mmio_write32(gt, MEDIA_DUMMY_REG, 0);
>
>This might be a spot where it would actually make sense to do some
>direct writel()'s rather than using xe_mmio_write32. That would bypass
>the trace_xe_reg_rw(); I'm making the assumption that most people doing
>MMIO tracing wouldn't be interested in having these constant dummy
>writes cluttering up the trace output (even though the known offset does
>make them somewhat easier to filter out). If we do that, we could move
>the offset #define down to this file rather than defining a true XE_REG()
>in the register header.
then we have N places hiding reads/writes that's difficult to reason
about. On the other hand, if the trace is there, people can simply
filter out by the offset. Remember the issue in i915/display about
unclaimed registers being caused by other reads/writes and not the one
in the report?
4b276ed3c7ac ("drm/i915/uncore: Warn on previous unclaimed accesses")
618f5df1f6a5 ("drm/i915/uncore: Warn only if unclaimed access remains flagged")
IMO even if we replace xe_mmio_write32() with writel() we shouldn't hide
it from traces.
Lucas De Marchi
>
>
>Matt
>
>> +}
>> +
>> u8 xe_mmio_read8(struct xe_gt *gt, struct xe_reg reg)
>> {
>> struct xe_tile *tile = gt_to_tile(gt);
>> u32 addr = xe_mmio_adjusted_addr(gt, reg.addr);
>> u8 val;
>>
>> + /* Wa_15015404425 */
>> + mmio_flush_pending_writes(gt);
>> +
>> val = readb((reg.ext ? tile->mmio_ext.regs : tile->mmio.regs) + addr);
>> trace_xe_reg_rw(gt, false, addr, val, sizeof(val));
>>
>> @@ -139,6 +155,9 @@ u16 xe_mmio_read16(struct xe_gt *gt, struct xe_reg reg)
>> u32 addr = xe_mmio_adjusted_addr(gt, reg.addr);
>> u16 val;
>>
>> + /* Wa_15015404425 */
>> + mmio_flush_pending_writes(gt);
>> +
>> val = readw((reg.ext ? tile->mmio_ext.regs : tile->mmio.regs) + addr);
>> trace_xe_reg_rw(gt, false, addr, val, sizeof(val));
>>
>> @@ -160,6 +179,9 @@ u32 xe_mmio_read32(struct xe_gt *gt, struct xe_reg reg)
>> u32 addr = xe_mmio_adjusted_addr(gt, reg.addr);
>> u32 val;
>>
>> + /* Wa_15015404425 */
>> + mmio_flush_pending_writes(gt);
>> +
>> if (!reg.vf && IS_SRIOV_VF(gt_to_xe(gt)))
>> val = xe_gt_sriov_vf_read32(gt, reg);
>> else
>> --
>> 2.25.1
>>
>
>--
>Matt Roper
>Graphics Software Engineer
>Linux GPU Platform Enablement
>Intel Corporation
More information about the Intel-xe
mailing list