[PATCH] drm/xe/xe2: Extend performance tuning to media GT
Gustavo Sousa
gustavo.sousa at intel.com
Tue Sep 17 17:02:38 UTC 2024
Quoting Gustavo Sousa (2024-09-17 13:53:54-03:00)
>With the exception of "Tuning: L3 cache - media", we are currently applying
>recommended performance tuning settings only for the primary GT. Let's
>also apply them to the media GT when applicable.
>
>According to our spec, media GT registers CCCHKNREG1 and L3SQCREG* exist
>only in Xe2_LPM and their offsets do not match their primary GT
>counterparts. As such, we need to have Xe2_LPM-specific definitions for
>them and apply the setting only for that specific IP.
>
>Both Xe2_HPM and Xe2_LPM contain STATELESS_COMPRESSION_CTRL and the
>offset on the media GT matches the one on the primary GT, so we can use
>the common definition and apply the setting to both IPs.
>
>Bspec: 72161
>Signed-off-by: Gustavo Sousa <gustavo.sousa at intel.com>
>---
> drivers/gpu/drm/xe/regs/xe_gt_regs.h | 6 ++++++
> drivers/gpu/drm/xe/xe_tuning.c | 19 +++++++++++++++++++
> 2 files changed, 25 insertions(+)
>
>diff --git a/drivers/gpu/drm/xe/regs/xe_gt_regs.h b/drivers/gpu/drm/xe/regs/xe_gt_regs.h
>index cf21de3adca6..2e655291a84a 100644
>--- a/drivers/gpu/drm/xe/regs/xe_gt_regs.h
>+++ b/drivers/gpu/drm/xe/regs/xe_gt_regs.h
>@@ -169,6 +169,8 @@
> #define XEHP_SLICE_COMMON_ECO_CHICKEN1 XE_REG_MCR(0x731c, XE_REG_OPTION_MASKED)
> #define MSC_MSAA_REODER_BUF_BYPASS_DISABLE REG_BIT(14)
>
>+#define XE2LPM_CCCHKNREG1 XE_REG_MCR(0x82a8)
>+
> #define VF_PREEMPTION XE_REG(0x83a4, XE_REG_OPTION_MASKED)
> #define PREEMPTION_VERTEX_COUNT REG_GENMASK(15, 0)
>
>@@ -399,6 +401,10 @@
> #define SCRATCH1LPFC XE_REG(0xb474)
> #define EN_L3_RW_CCS_CACHE_FLUSH REG_BIT(0)
>
>+#define XE2LPM_L3SQCREG2 XE_REG_MCR(0xb604)
>+
>+#define XE2LPM_L3SQCREG3 XE_REG_MCR(0xb608)
>+
> #define XE2LPM_L3SQCREG5 XE_REG_MCR(0xb658)
>
> #define XE2_TDF_CTRL XE_REG(0xb418)
>diff --git a/drivers/gpu/drm/xe/xe_tuning.c b/drivers/gpu/drm/xe/xe_tuning.c
>index faa1bf42e50e..ea1444358b4f 100644
>--- a/drivers/gpu/drm/xe/xe_tuning.c
>+++ b/drivers/gpu/drm/xe/xe_tuning.c
>@@ -42,20 +42,39 @@ static const struct xe_rtp_entry_sr gt_tunings[] = {
> XE_RTP_ACTIONS(CLR(CCCHKNREG1, ENCOMPPERFFIX),
> SET(CCCHKNREG1, L3CMPCTRL))
> },
>+ { XE_RTP_NAME("Tuning: Compression Overfetch - media"),
>+ XE_RTP_RULES(MEDIA_VERSION(2000)),
+Matt
I used an exact match on the media version here because that is what
"Tuning: L3 cache - media" already uses, but I wonder if we should make
it MEDIA_VERSION_RANGE(2000, XE_RTP_END_VERSION_UNDEFINED) instead,
similar to what is done for the primary GT.
>+ XE_RTP_ACTIONS(CLR(XE2LPM_CCCHKNREG1, ENCOMPPERFFIX),
>+ SET(XE2LPM_CCCHKNREG1, L3CMPCTRL))
>+ },
> { XE_RTP_NAME("Tuning: Enable compressible partial write overfetch in L3"),
> XE_RTP_RULES(GRAPHICS_VERSION_RANGE(2001, XE_RTP_END_VERSION_UNDEFINED)),
> XE_RTP_ACTIONS(SET(L3SQCREG3, COMPPWOVERFETCHEN))
> },
>+ { XE_RTP_NAME("Tuning: Enable compressible partial write overfetch in L3 - media"),
>+ XE_RTP_RULES(MEDIA_VERSION(2000)),
>+ XE_RTP_ACTIONS(SET(XE2LPM_L3SQCREG3, COMPPWOVERFETCHEN))
>+ },
> { XE_RTP_NAME("Tuning: L2 Overfetch Compressible Only"),
> XE_RTP_RULES(GRAPHICS_VERSION_RANGE(2001, XE_RTP_END_VERSION_UNDEFINED)),
> XE_RTP_ACTIONS(SET(L3SQCREG2,
> COMPMEMRD256BOVRFETCHEN))
> },
>+ { XE_RTP_NAME("Tuning: L2 Overfetch Compressible Only - media"),
>+ XE_RTP_RULES(MEDIA_VERSION(2000)),
>+ XE_RTP_ACTIONS(SET(XE2LPM_L3SQCREG2,
>+ COMPMEMRD256BOVRFETCHEN))
>+ },
> { XE_RTP_NAME("Tuning: Stateless compression control"),
> XE_RTP_RULES(GRAPHICS_VERSION_RANGE(2001, XE_RTP_END_VERSION_UNDEFINED)),
> XE_RTP_ACTIONS(FIELD_SET(STATELESS_COMPRESSION_CTRL, UNIFIED_COMPRESSION_FORMAT,
> REG_FIELD_PREP(UNIFIED_COMPRESSION_FORMAT, 0)))
> },
>+ { XE_RTP_NAME("Tuning: Stateless compression control - media"),
>+ XE_RTP_RULES(MEDIA_VERSION_RANGE(1301, 2000)),
The same question applies here, where we are already using a closed
interval.
--
Gustavo Sousa
>+ XE_RTP_ACTIONS(FIELD_SET(STATELESS_COMPRESSION_CTRL, UNIFIED_COMPRESSION_FORMAT,
>+ REG_FIELD_PREP(UNIFIED_COMPRESSION_FORMAT, 0)))
>+ },
> {}
> };
>
>--
>2.46.1
>
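To make the exact-match vs. open-ended range question above concrete,
here is a minimal standalone sketch. It is not the xe RTP code: the
sketch_* names and SKETCH_END_VERSION_UNDEFINED are hypothetical, and it
assumes XE_RTP_END_VERSION_UNDEFINED behaves as "no upper bound".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for XE_RTP_END_VERSION_UNDEFINED, assumed here
 * to mean "no upper bound" on the version range.
 */
#define SKETCH_END_VERSION_UNDEFINED UINT32_MAX

/* A version rule modelled as a closed interval [first, last]. */
struct sketch_version_rule {
	uint32_t first;
	uint32_t last;
};

/* Exact match, analogous in spirit to MEDIA_VERSION(ver). */
static inline struct sketch_version_rule sketch_exact(uint32_t ver)
{
	return (struct sketch_version_rule){ .first = ver, .last = ver };
}

/* Range match, analogous in spirit to MEDIA_VERSION_RANGE(first, last). */
static inline struct sketch_version_rule sketch_range(uint32_t first,
						      uint32_t last)
{
	assert(first <= last);
	return (struct sketch_version_rule){ .first = first, .last = last };
}

/* Returns true when ver falls inside the rule's closed interval. */
static inline bool sketch_rule_matches(struct sketch_version_rule rule,
				       uint32_t ver)
{
	return ver >= rule.first && ver <= rule.last;
}
```

With these helpers, sketch_exact(2000) matches only media version 2000,
so a later IP would need a new entry, while
sketch_range(2000, SKETCH_END_VERSION_UNDEFINED) would also match a
hypothetical later version such as 3000 without further changes.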