[PATCH V2] drm/xe/xe2: Add Wa_15015404425

Lucas De Marchi lucas.demarchi at intel.com
Tue Jul 9 17:46:35 UTC 2024


On Wed, Jul 03, 2024 at 04:23:15PM GMT, Matt Roper wrote:
>On Wed, Jul 03, 2024 at 12:16:26PM +0530, Tejas Upadhyay wrote:
>> Wa_15015404425 applies to xe2 LNL all steppings
>
>Since this isn't a typical "write XX to register YY" workaround, we
>should have a brief explanation of what the workaround is asking us to
>do here for context.  E.g., something along the lines of
>
>        "Wa_15015404425 asks us to perform four "dummy" writes to a
>        non-existent register offset before every real register read.
>        Although the specific offset of the writes doesn't directly
>        matter, the workaround suggests offset 0x130030 as a good target
>        so that these writes will be easy to recognize and filter out in
>        debugging traces."
>
>>
>> V2:
>>   - Add WA to 8/16/32bit reads also - MattR
>>   - Corrected dummy reg address - MattR
>>   - Use for loop to avoid mental pause - JaniN
>>
>> Signed-off-by: Tejas Upadhyay <tejas.upadhyay at intel.com>
>> ---
>>  drivers/gpu/drm/xe/regs/xe_regs.h |  1 +
>>  drivers/gpu/drm/xe/xe_mmio.c      | 22 ++++++++++++++++++++++
>>  2 files changed, 23 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/xe/regs/xe_regs.h b/drivers/gpu/drm/xe/regs/xe_regs.h
>> index 23e33ec84902..56a1ba59cbd3 100644
>> --- a/drivers/gpu/drm/xe/regs/xe_regs.h
>> +++ b/drivers/gpu/drm/xe/regs/xe_regs.h
>> @@ -45,6 +45,7 @@
>>  #define STOLEN_RESERVED				XE_REG(0x1082c0)
>>  #define   WOPCM_SIZE_MASK			REG_GENMASK64(9, 7)
>>
>> +#define MEDIA_DUMMY_REG				XE_REG(0x130030)
>
>Since this workaround wound up not actually being media-related at all,
>does the "MEDIA_" prefix here make sense?  This register isn't actually
>in the media IP (according to bspec 60110 it's just in the general
>sgunit register range).  Honestly, since there's no real register at
>this offset in the hardware, this might be a case where we don't even
>want/need a definition in xe_regs.h (see below).
>
>>  #define MTL_RP_STATE_CAP			XE_REG(0x138000)
>>
>>  #define MTL_GT_RPE_FREQUENCY			XE_REG(0x13800c)
>> diff --git a/drivers/gpu/drm/xe/xe_mmio.c b/drivers/gpu/drm/xe/xe_mmio.c
>> index f92faad4b96d..51022a560086 100644
>> --- a/drivers/gpu/drm/xe/xe_mmio.c
>> +++ b/drivers/gpu/drm/xe/xe_mmio.c
>> @@ -121,12 +121,28 @@ int xe_mmio_init(struct xe_device *xe)
>>  	return devm_add_action_or_reset(xe->drm.dev, mmio_fini, xe);
>>  }
>>
>> +static void mmio_flush_pending_writes(struct xe_gt *gt)
>> +{
>> +	struct xe_device *xe = gt_to_xe(gt);
>> +	int i;
>> +
>> +	if (!(xe->info.platform == XE_LUNARLAKE))
>> +		return;
>
>This works for now, although it may cause issues down the road if we
>wind up with refresh/derivative platforms that don't need this
>workaround.  I think at some point soon we're going to need to separate
>out SoC logic into its own dedicated sub-driver (similar to what's
>happening with display), and bake a dedicated workaround framework into
>that SoC sub-driver to handle cases like this.  Although we'll also need
>to work with our internal hardware people to improve how workarounds
>like this get documented so that it's more obvious up front which ones
>are tied to the graphics/media/display IP, and which ones are tied to
>the SoC revisions that the IP is integrated into.

regardless, why is this not using XE_WA() with the rule in
xe_wa_oob.rules being XE_LUNARLAKE?


>
>> +
>> +	/* 4 dummy writes */
>> +	for (i = 0; i < 4; i++)
>> +		xe_mmio_write32(gt, MEDIA_DUMMY_REG, 0);
>
>This might be a spot where it would actually make sense to do some
>direct writel()'s rather than using xe_mmio_write32.  That would bypass
>the trace_xe_reg_rw(); I'm making the assumption that most people doing
>MMIO tracing wouldn't be interested in having these constant dummy
>writes cluttering up the trace output (even though the known offset does
>make them somewhat easier to filter out).  If we do that, we could move
>the offset #define down to this file rather than defining a true XE_REG()
>in the register header.

then we have N places hiding reads/writes that's difficult to reason
about. On the other hand, if the trace is there, people can simply
filter out by the offset.  Remember the issue in i915/display about
unclaimed registers being caused by other reads/writes and not the one
in the report?

4b276ed3c7ac ("drm/i915/uncore: Warn on previous unclaimed accesses")
618f5df1f6a5 ("drm/i915/uncore: Warn only if unclaimed access remains flagged")

IMO even if we replace xe_mmio_write32() with writel() we shouldn't hide
it from traces.

Lucas De Marchi

>
>
>Matt
>
>> +}
>> +
>>  u8 xe_mmio_read8(struct xe_gt *gt, struct xe_reg reg)
>>  {
>>  	struct xe_tile *tile = gt_to_tile(gt);
>>  	u32 addr = xe_mmio_adjusted_addr(gt, reg.addr);
>>  	u8 val;
>>
>> +	/* Wa_15015404425 */
>> +	mmio_flush_pending_writes(gt);
>> +
>>  	val = readb((reg.ext ? tile->mmio_ext.regs : tile->mmio.regs) + addr);
>>  	trace_xe_reg_rw(gt, false, addr, val, sizeof(val));
>>
>> @@ -139,6 +155,9 @@ u16 xe_mmio_read16(struct xe_gt *gt, struct xe_reg reg)
>>  	u32 addr = xe_mmio_adjusted_addr(gt, reg.addr);
>>  	u16 val;
>>
>> +	/* Wa_15015404425 */
>> +	mmio_flush_pending_writes(gt);
>> +
>>  	val = readw((reg.ext ? tile->mmio_ext.regs : tile->mmio.regs) + addr);
>>  	trace_xe_reg_rw(gt, false, addr, val, sizeof(val));
>>
>> @@ -160,6 +179,9 @@ u32 xe_mmio_read32(struct xe_gt *gt, struct xe_reg reg)
>>  	u32 addr = xe_mmio_adjusted_addr(gt, reg.addr);
>>  	u32 val;
>>
>> +	/* Wa_15015404425 */
>> +	mmio_flush_pending_writes(gt);
>> +
>>  	if (!reg.vf && IS_SRIOV_VF(gt_to_xe(gt)))
>>  		val = xe_gt_sriov_vf_read32(gt, reg);
>>  	else
>> --
>> 2.25.1
>>
>
>-- 
>Matt Roper
>Graphics Software Engineer
>Linux GPU Platform Enablement
>Intel Corporation


More information about the Intel-xe mailing list