[PATCH] drm/xe/xe2: Extend performance tuning to media GT

Gustavo Sousa gustavo.sousa at intel.com
Tue Sep 17 17:02:38 UTC 2024


Quoting Gustavo Sousa (2024-09-17 13:53:54-03:00)
>With the exception of "Tuning: L3 cache - media", we are currently applying
>recommended performance tuning settings only for the primary GT. Let's
>also apply them to the media GT when applicable.
>
>According to our spec, media GT registers CCCHKNREG1 and L3SQCREG* exist
>only in Xe2_LPM and their offsets do not match their primary GT
>counterparts. As such, we need to have Xe2_LPM-specific definitions for
>them and apply the setting only for that specific IP.
>
>Both Xe2_HPM and Xe2_LPM contain STATELESS_COMPRESSION_CTRL and the
>offset on the media GT matches the one on the primary GT, so we can use
>the common definition and apply the setting to both IPs.
>
>Bspec: 72161
>Signed-off-by: Gustavo Sousa <gustavo.sousa at intel.com>
>---
> drivers/gpu/drm/xe/regs/xe_gt_regs.h |  6 ++++++
> drivers/gpu/drm/xe/xe_tuning.c       | 19 +++++++++++++++++++
> 2 files changed, 25 insertions(+)
>
>diff --git a/drivers/gpu/drm/xe/regs/xe_gt_regs.h b/drivers/gpu/drm/xe/regs/xe_gt_regs.h
>index cf21de3adca6..2e655291a84a 100644
>--- a/drivers/gpu/drm/xe/regs/xe_gt_regs.h
>+++ b/drivers/gpu/drm/xe/regs/xe_gt_regs.h
>@@ -169,6 +169,8 @@
> #define XEHP_SLICE_COMMON_ECO_CHICKEN1                XE_REG_MCR(0x731c, XE_REG_OPTION_MASKED)
> #define   MSC_MSAA_REODER_BUF_BYPASS_DISABLE        REG_BIT(14)
> 
>+#define XE2LPM_CCCHKNREG1                        XE_REG_MCR(0x82a8)
>+
> #define VF_PREEMPTION                                XE_REG(0x83a4, XE_REG_OPTION_MASKED)
> #define   PREEMPTION_VERTEX_COUNT                REG_GENMASK(15, 0)
> 
>@@ -399,6 +401,10 @@
> #define SCRATCH1LPFC                                XE_REG(0xb474)
> #define   EN_L3_RW_CCS_CACHE_FLUSH                REG_BIT(0)
> 
>+#define XE2LPM_L3SQCREG2                        XE_REG_MCR(0xb604)
>+
>+#define XE2LPM_L3SQCREG3                        XE_REG_MCR(0xb608)
>+
> #define XE2LPM_L3SQCREG5                        XE_REG_MCR(0xb658)
> 
> #define XE2_TDF_CTRL                                XE_REG(0xb418)
>diff --git a/drivers/gpu/drm/xe/xe_tuning.c b/drivers/gpu/drm/xe/xe_tuning.c
>index faa1bf42e50e..ea1444358b4f 100644
>--- a/drivers/gpu/drm/xe/xe_tuning.c
>+++ b/drivers/gpu/drm/xe/xe_tuning.c
>@@ -42,20 +42,39 @@ static const struct xe_rtp_entry_sr gt_tunings[] = {
>           XE_RTP_ACTIONS(CLR(CCCHKNREG1, ENCOMPPERFFIX),
>                          SET(CCCHKNREG1, L3CMPCTRL))
>         },
>+        { XE_RTP_NAME("Tuning: Compression Overfetch - media"),
>+          XE_RTP_RULES(MEDIA_VERSION(2000)),

+Matt

I used an exact match on the media version here because that is what
"Tuning: L3 cache - media" already uses, but I wonder if we should make
it MEDIA_VERSION_RANGE(2000, XE_RTP_END_VERSION_UNDEFINED), similar to
what is done for the primary GT.
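
To make the difference concrete, here is a minimal sketch of the two
matching behaviors. It is a simplified model and not the driver's RTP
code: the helper names and the sentinel value are assumptions made up
for illustration, and 3000 stands in for a hypothetical later media
version.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical sentinel standing in for XE_RTP_END_VERSION_UNDEFINED. */
#define END_VERSION_UNDEFINED UINT32_MAX

/* Exact match, modelling a rule like MEDIA_VERSION(2000). */
bool match_exact(uint32_t verx100, uint32_t want)
{
	return verx100 == want;
}

/*
 * Range match, modelling a rule like MEDIA_VERSION_RANGE(start, end).
 * An undefined end leaves the interval open above, so later versions
 * keep matching without touching the table.
 */
bool match_range(uint32_t verx100, uint32_t start, uint32_t end)
{
	if (verx100 < start)
		return false;
	return end == END_VERSION_UNDEFINED || verx100 <= end;
}
```

With the exact match, a later media IP would need a new table entry (or
a rule change) to pick up the tuning; with the open-ended range it would
inherit the tuning automatically, which is only right if the register
still exists at the same offset there.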

>+          XE_RTP_ACTIONS(CLR(XE2LPM_CCCHKNREG1, ENCOMPPERFFIX),
>+                         SET(XE2LPM_CCCHKNREG1, L3CMPCTRL))
>+        },
>         { XE_RTP_NAME("Tuning: Enable compressible partial write overfetch in L3"),
>           XE_RTP_RULES(GRAPHICS_VERSION_RANGE(2001, XE_RTP_END_VERSION_UNDEFINED)),
>           XE_RTP_ACTIONS(SET(L3SQCREG3, COMPPWOVERFETCHEN))
>         },
>+        { XE_RTP_NAME("Tuning: Enable compressible partial write overfetch in L3 - media"),
>+          XE_RTP_RULES(MEDIA_VERSION(2000)),
>+          XE_RTP_ACTIONS(SET(XE2LPM_L3SQCREG3, COMPPWOVERFETCHEN))
>+        },
>         { XE_RTP_NAME("Tuning: L2 Overfetch Compressible Only"),
>           XE_RTP_RULES(GRAPHICS_VERSION_RANGE(2001, XE_RTP_END_VERSION_UNDEFINED)),
>           XE_RTP_ACTIONS(SET(L3SQCREG2,
>                              COMPMEMRD256BOVRFETCHEN))
>         },
>+        { XE_RTP_NAME("Tuning: L2 Overfetch Compressible Only - media"),
>+          XE_RTP_RULES(MEDIA_VERSION(2000)),
>+          XE_RTP_ACTIONS(SET(XE2LPM_L3SQCREG2,
>+                             COMPMEMRD256BOVRFETCHEN))
>+        },
>         { XE_RTP_NAME("Tuning: Stateless compression control"),
>           XE_RTP_RULES(GRAPHICS_VERSION_RANGE(2001, XE_RTP_END_VERSION_UNDEFINED)),
>           XE_RTP_ACTIONS(FIELD_SET(STATELESS_COMPRESSION_CTRL, UNIFIED_COMPRESSION_FORMAT,
>                                    REG_FIELD_PREP(UNIFIED_COMPRESSION_FORMAT, 0)))
>         },
>+        { XE_RTP_NAME("Tuning: Stateless compression control - media"),
>+          XE_RTP_RULES(MEDIA_VERSION_RANGE(1301, 2000)),

The same question applies here, where we are already using a closed
interval.

--
Gustavo Sousa

>+          XE_RTP_ACTIONS(FIELD_SET(STATELESS_COMPRESSION_CTRL, UNIFIED_COMPRESSION_FORMAT,
>+                                   REG_FIELD_PREP(UNIFIED_COMPRESSION_FORMAT, 0)))
>+        },
>         {}
> };
> 
>-- 
>2.46.1
>

