[PATCH 1/1] drm/xe/xe2: Introduce performance changes
Matt Roper
matthew.d.roper at intel.com
Mon Jul 29 19:30:31 UTC 2024
On Mon, Jul 29, 2024 at 05:25:47AM -0700, Upadhyay, Tejas wrote:
>
>
> > -----Original Message-----
> > From: Intel-xe <intel-xe-bounces at lists.freedesktop.org> On Behalf Of Akshata
> > Jahagirdar
> > Sent: Saturday, July 27, 2024 11:50 AM
> > To: intel-xe at lists.freedesktop.org
> > Cc: Roper, Matthew D <matthew.d.roper at intel.com>; Jahagirdar, Akshata
> > <akshata.jahagirdar at intel.com>
> > Subject: [PATCH 1/1] drm/xe/xe2: Introduce performance changes
> >
> > Add Compression Performance Improvement Changes in Xe2
> >
> > Bspec: 70559, 59928, 59927, 59251
The bspec line should generally be down with the other trailers (e.g.,
immediately before your s-o-b trailer), not floating up here on its own.
Also, the main bspec reference we really want on a patch like this is
page 72161, since that's where the official performance settings are
documented and that's the reference we use for driver changes.
> >
> > v2: Rebase
> >
> > Signed-off-by: Akshata Jahagirdar <akshata.jahagirdar at intel.com>
> > ---
> > drivers/gpu/drm/xe/regs/xe_gt_regs.h | 10 ++++++++++
> > drivers/gpu/drm/xe/xe_tuning.c | 13 +++++++++++++
> > 2 files changed, 23 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/xe/regs/xe_gt_regs.h
> > b/drivers/gpu/drm/xe/regs/xe_gt_regs.h
> > index 8a94a94d2267..28f13362a3df 100644
> > --- a/drivers/gpu/drm/xe/regs/xe_gt_regs.h
> > +++ b/drivers/gpu/drm/xe/regs/xe_gt_regs.h
> > @@ -80,6 +80,9 @@
> > #define LE_CACHEABILITY_MASK REG_GENMASK(1, 0)
> > #define LE_CACHEABILITY(value)
> > REG_FIELD_PREP(LE_CACHEABILITY_MASK, value)
> >
> > +#define STATELESS_COMPRESSION_CTRL
> > XE_REG_MCR(0x4148)
> > +#define UNIFIED_COMPRESSION_FORMAT REG_GENMASK(3, 0)
> > +
> > #define XE2_GAMREQSTRM_CTRL XE_REG(0x4194)
> > #define CG_DIS_CNTLBUS REG_BIT(6)
> >
> > @@ -192,6 +195,7 @@
> > #define GSCPSMI_BASE XE_REG(0x880c)
> >
> > #define CCCHKNREG1 XE_REG_MCR(0x8828)
> > +#define COMPOVFDIS REG_BIT(25)
>
> Is this present on BMG? I could not see. Please help if I am missing something.
>
> > #define ENCOMPPERFFIX REG_BIT(18)
> >
> > /* Fuse readout registers for GT */
> > @@ -366,6 +370,12 @@
> > #define XEHP_L3NODEARBCFG XE_REG_MCR(0xb0b4)
> > #define XEHP_LNESPARE REG_BIT(19)
> >
> > +#define L3SQCREG2 XE_REG_MCR(0xb104)
> > +#define MEMRDNONCOMPOVRFETCHDIS REG_BIT(26)
> > +#define COMPMEMRD256BOVRFETCHEN REG_BIT(20)
> > +#define MEMRDOVRFETCHDIS REG_BIT(17)
> > +#define MEMRD256BOVRFETCHEN REG_BIT(14)
> > +
>
> Please check if indentation is correct here.
As Tejas noted, the indication is off here, and also in a few other
places. Defines of register names have a single space between the
"#define" and the register name. Defines of fields/bits inside a
register should have two extra (three total) spaces between the
"#define" and the field name.
>
> > #define L3SQCREG3 XE_REG_MCR(0xb108)
> > #define COMPPWOVERFETCHEN REG_BIT(28)
> >
> > diff --git a/drivers/gpu/drm/xe/xe_tuning.c b/drivers/gpu/drm/xe/xe_tuning.c
> > index d4e6fa918942..f338be24f859 100644
> > --- a/drivers/gpu/drm/xe/xe_tuning.c
> > +++ b/drivers/gpu/drm/xe/xe_tuning.c
> > @@ -45,6 +45,19 @@ static const struct xe_rtp_entry_sr gt_tunings[] = {
> > XE_RTP_RULES(GRAPHICS_VERSION_RANGE(2001,
> > XE_RTP_END_VERSION_UNDEFINED)),
> > XE_RTP_ACTIONS(SET(L3SQCREG3, COMPPWOVERFETCHEN))
> > },
> > + { XE_RTP_NAME("Tuning: Change Stateless Compression Control
> > register default to R8 (0x0)"),
> > + XE_RTP_RULES(GRAPHICS_VERSION_RANGE(2001,
> > XE_RTP_END_VERSION_UNDEFINED)),
> > + XE_RTP_ACTIONS(CLR(STATELESS_COMPRESSION_CTRL,
> > UNIFIED_COMPRESSION_FORMAT))
> > + },
This setting isn't documented in the bspec. We should hold off on this
one until/unless it becomes part of the documented recommendations in
the spec. You might want to ping the internal ticket where these
settings were first mentioned, to see whether this setting was
intentionally skipped in the bspec update that just landed, or whether
this was an oversight.
> > + { XE_RTP_NAME("Tuning: L2 Overfetch Compressible Only"),
> > + XE_RTP_RULES(GRAPHICS_VERSION_RANGE(2001,
> > XE_RTP_END_VERSION_UNDEFINED)),
> > + XE_RTP_ACTIONS(SET(L3SQCREG2,
> > COMPMEMRD256BOVRFETCHEN),
> > + CLR(L3SQCREG2,
> > + MEMRD256BOVRFETCHEN |
> > + MEMRDOVRFETCHDIS |
> > + MEMRDNONCOMPOVRFETCHDIS),
We generally don't bother (re-)setting bits that already have the proper
hardware default value. In this case, bits 17 and 26 are already
initialized to 0 by the hardware, so it isn't really necessary to
forcefully clear them again.
> > + CLR(CCCHKNREG1, COMPOVFDIS))
Same here; this bit is already 0 so we don't really need to reprogram
it.
Matt
> > + },
> > {}
> > };
> >
> > --
> > 2.34.1
>
--
Matt Roper
Graphics Software Engineer
Linux GPU Platform Enablement
Intel Corporation
More information about the Intel-xe
mailing list