[PATCH 3/3] drm/xe/xe2: Add performance tuning for L3 cache flushing

Gustavo Sousa gustavo.sousa at intel.com
Fri Sep 20 11:53:33 UTC 2024


Quoting Upadhyay, Tejas (2024-09-20 02:17:44-03:00)
>
>
>> -----Original Message-----
>> From: Sousa, Gustavo <gustavo.sousa at intel.com>
>> Sent: Friday, September 20, 2024 1:06 AM
>> To: Upadhyay, Tejas <tejas.upadhyay at intel.com>; intel-
>> xe at lists.freedesktop.org
>> Cc: Roper, Matthew D <matthew.d.roper at intel.com>
>> Subject: RE: [PATCH 3/3] drm/xe/xe2: Add performance tuning for L3 cache
>> flushing
>> 
>> Quoting Gustavo Sousa (2024-09-19 16:24:19-03:00)
>> >Quoting Upadhyay, Tejas (2024-09-19 05:22:50-03:00)
>> >>
>> >>
>> >>> -----Original Message-----
>> >>> From: Intel-xe <intel-xe-bounces at lists.freedesktop.org> On Behalf Of
>> >>> Gustavo Sousa
>> >>> Sent: Thursday, September 19, 2024 2:18 AM
>> >>> To: intel-xe at lists.freedesktop.org
>> >>> Cc: Roper, Matthew D <matthew.d.roper at intel.com>
>> >>> Subject: [PATCH 3/3] drm/xe/xe2: Add performance tuning for L3 cache
>> >>> flushing
>> >>>
>> >>> A recommended performance tuning for LNL related to L3 cache
>> >>> flushing was recently introduced in Bspec. Implement it.
>> >>>
>> >>> Bspec: 70821
>> >>
>> >>Yes bspec needs an update.
>> >
>> >Yep.
>> >
>> >>
>> >>> Signed-off-by: Gustavo Sousa <gustavo.sousa at intel.com>
>> >>> ---
>> >>>  drivers/gpu/drm/xe/regs/xe_gt_regs.h | 5 +++++
>> >>>  drivers/gpu/drm/xe/xe_tuning.c       | 8 ++++++++
>> >>>  2 files changed, 13 insertions(+)
>> >>>
>> >>> diff --git a/drivers/gpu/drm/xe/regs/xe_gt_regs.h
>> >>> b/drivers/gpu/drm/xe/regs/xe_gt_regs.h
>> >>> index 6ec2d2c11d77..ccd18cdd5b50 100644
>> >>> --- a/drivers/gpu/drm/xe/regs/xe_gt_regs.h
>> >>> +++ b/drivers/gpu/drm/xe/regs/xe_gt_regs.h
>> >>> @@ -389,6 +389,9 @@
>> >>>  #define L3SQCREG3                                XE_REG_MCR(0xb108)
>> >>>  #define   COMPPWOVERFETCHEN                        REG_BIT(28)
>> >>>
>> >>> +#define SCRATCH3LBCF
>> >>>         XE_REG_MCR(0xb154)
>> >>
>> >>Please name this SCRATCH3 only as bspec mentions.
>> >
>> >Looking at our register database, it looks like there are other
>> >"SCRATCH" registers from other units. So wouldn't it be better not to
>> >use plain SCRATCH3 here?
>> >
>> >Probably SCRATCH3_LBCF for more readability...
>
>In that case I don’t have strong resistance here, but media also have scratch reg. How about "SCRATCH3_LBCF_GFX". 

Yes, that's why it is being defined with the XE2LPM_ prefix, just like
with the other media registers that have different offset with respect
to graphics.

As an existing example already in the code, look at the definitions for
L3SQCREG5. I'm following the same pattern there.

--
Gustavo Sousa

>
>> >
>> >>
>> >>> +#define   RWFLUSHALLEN                                REG_BIT(17)
>> >>> +
>> >>>  #define XEHP_L3SQCREG5
>> >>>         XE_REG_MCR(0xb158)
>> >>>  #define   L3_PWM_TIMER_INIT_VAL_MASK                REG_GENMASK(9, 0)
>> >>>
>> >>> @@ -406,6 +409,8 @@
>> >>>
>> >>>  #define XE2LPM_L3SQCREG3                        XE_REG_MCR(0xb608)
>> >>>
>> >>> +#define XE2LPM_SCRATCH3LBCF
>> >>>         XE_REG_MCR(0xb654)
>> >>
>> >>Agree with other review that we should name this MEDIA_SCRATCH3 for
>> future use well. Also this does not look to be MCR reg. Please double check.
>> >
>> >I believe the prefix XE2LPM_ is enough to differentiate this from the
>> >primary GT version of this register. It is common to prefix registers
>> >with the name of the IP that introduce the difference with respect to
>> >other versions of the register.
>> >
>> >Please note that the same pattern is used for other registers that have
>> >different offsets on the media GT.
>> 
>> Ah... I forgot to reply also with respect to the MCR-related comment.
>> Bspec 71186 lists range [0x38B600, 0x38B8FF] as multicast.
>
>Oh I did not realise it is different range page for media, I was looking gfx one. Thanks for pointing out. You can retain my r-o-b with that.

Thanks!

Please, let me know if the r-b also stands with SCRATCH3_LBCF without
the _GFX suffix.

--
Gustavo Sousa

>
>Tejas
>> 
>> --
>> Gustavo Sousa
>> 
>> >
>> >--
>> >Gustavo Sousa
>> >
>> >>
>> >>With that addressed,
>> >>Reviewed-by: Tejas Upadhyay <tejas.upadhyay at intel.com>
>> >>
>> >>> +
>> >>>  #define XE2LPM_L3SQCREG5                        XE_REG_MCR(0xb658)
>> >>>
>> >>>  #define XE2_TDF_CTRL                                XE_REG(0xb418)
>> >>> diff --git a/drivers/gpu/drm/xe/xe_tuning.c
>> >>> b/drivers/gpu/drm/xe/xe_tuning.c index f62622f0be85..4dd77b44ac82
>> >>> 100644
>> >>> --- a/drivers/gpu/drm/xe/xe_tuning.c
>> >>> +++ b/drivers/gpu/drm/xe/xe_tuning.c
>> >>> @@ -80,6 +80,14 @@ static const struct xe_rtp_entry_sr gt_tunings[]
>> >>> = {
>> >>>
>> >>> XE_RTP_ACTIONS(FIELD_SET(XELPMP_STATELESS_COMPRESSION_CTRL,
>> >>> UNIFIED_COMPRESSION_FORMAT,
>> >>>
>> >>> REG_FIELD_PREP(UNIFIED_COMPRESSION_FORMAT, 0)))
>> >>>          },
>> >>> +        { XE_RTP_NAME("Tuning: L3 RW flush all Cache"),
>> >>> +          XE_RTP_RULES(GRAPHICS_VERSION_RANGE(2004,
>> >>> XE_RTP_END_VERSION_UNDEFINED)),
>> >>> +          XE_RTP_ACTIONS(SET(SCRATCH3, RWFLUSHALLEN))
>> >>> +        },
>> >>> +        { XE_RTP_NAME("Tuning: L3 RW flush all cache - media"),
>> >>> +          XE_RTP_RULES(MEDIA_VERSION_RANGE(2000,
>> >>> XE_RTP_END_VERSION_UNDEFINED)),
>> >>> +          XE_RTP_ACTIONS(SET(XE2LPM_SCRATCH3LBCF, RWFLUSHALLEN))
>> >>> +        },
>> >>>          {}
>> >>>  };
>> >>>
>> >>> --
>> >>> 2.46.1
>> >>


More information about the Intel-xe mailing list