[PATCH 1/3] drm/xe/xe2: Extend performance tuning to media GT

Gustavo Sousa gustavo.sousa at intel.com
Thu Sep 19 18:08:50 UTC 2024


Quoting Upadhyay, Tejas (2024-09-19 05:00:22-03:00)
>
>
>> -----Original Message-----
>> From: Intel-xe <intel-xe-bounces at lists.freedesktop.org> On Behalf Of Gustavo
>> Sousa
>> Sent: Thursday, September 19, 2024 2:17 AM
>> To: intel-xe at lists.freedesktop.org
>> Cc: Roper, Matthew D <matthew.d.roper at intel.com>
>> Subject: [PATCH 1/3] drm/xe/xe2: Extend performance tuning to media GT
>> 
>> With the exception of "Tuning: L3 cache - media", we are currently applying
>> recommended performance tuning settings only for the primary GT. Let's also
>> implement them for the media GT when applicable.
>> 
>> According to our spec, media GT registers CCCHKNREG1 and L3SQCREG* exist
>> only in Xe2_LPM and their offsets do not match their primary GT
>> counterparts. Furthermore, the range where CCCHKNREG1 belongs is not
>> listed as a multicast range on the media GT. As such, we need to have
>> Xe2_LPM-specific definitions for those registers and apply the setting only for
>> that specific IP.
>> 
>> Both Xe2_HPM and Xe2_LPM contain STATELESS_COMPRESSION_CTRL and
>> the offset on the media GT matches the one on the primary one. However,
>> the range that contains that register is not listed as a multicast range, so
>> we need two different entries for media.
>> 
>> v2:
>>   - Fix implementation with respect to multicast vs non-multicast
>>     registers. (Matt)
>>   - Add missing XE2LPM_CCCHKNREG1 on second action of "Tuning:
>>     Compression Overfetch - media".
>> 
>> Bspec: 72161
>> Cc: Matt Roper <matthew.d.roper at intel.com>
>> Signed-off-by: Gustavo Sousa <gustavo.sousa at intel.com>
>> ---
>>  drivers/gpu/drm/xe/regs/xe_gt_regs.h |  7 +++++++
>>  drivers/gpu/drm/xe/xe_tuning.c       | 24 ++++++++++++++++++++++++
>>  2 files changed, 31 insertions(+)
>> 
>> diff --git a/drivers/gpu/drm/xe/regs/xe_gt_regs.h b/drivers/gpu/drm/xe/regs/xe_gt_regs.h
>> index cf21de3adca6..6ec2d2c11d77 100644
>> --- a/drivers/gpu/drm/xe/regs/xe_gt_regs.h
>> +++ b/drivers/gpu/drm/xe/regs/xe_gt_regs.h
>> @@ -80,6 +80,7 @@
>>  #define   LE_CACHEABILITY_MASK                        REG_GENMASK(1, 0)
>>  #define   LE_CACHEABILITY(value)                        REG_FIELD_PREP(LE_CACHEABILITY_MASK, value)
>> 
>> +#define XELPMP_STATELESS_COMPRESSION_CTRL        XE_REG(0x4148)
>
>Were you trying to say XE2LPM_ here? Also, this seems to be an MCR register.

Yeah, you're right on both. I was looking at the steering spec for MTL
media instead of BMG's when adding this, and then used XELPMP_ thinking
that Xe_LPM+ also had that register.

Thanks for catching this. I'll fix it in the next version of this
series.

It looks like we also need to fix the logic around MCR tables in our
driver, since we are selecting Xe_LPM+'s table for Xe2_LPM.
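
To illustrate the kind of selection logic meant here, below is a minimal
sketch, not the driver's actual code. The enum, the function name and the
idea of keying the choice on an exact media version are all hypothetical.
The value 2000 for Xe2_LPM comes from the MEDIA_VERSION(2000) rules in this
patch; 1300 for Xe_LPM+ is an assumption. Which table other media IPs
should use is deliberately left open.

```c
#include <stdint.h>

/* Hypothetical identifiers for per-IP media steering tables. */
enum media_steering_table {
	MEDIA_TABLE_UNKNOWN,
	MEDIA_TABLE_XELPMP,	/* Xe_LPM+ */
	MEDIA_TABLE_XE2LPM,	/* Xe2_LPM */
};

/*
 * Pick a steering table from the media IP version (major * 100 + minor).
 * Assumption: 1300 is Xe_LPM+ and 2000 is Xe2_LPM. Any other version is
 * reported as unknown instead of silently reusing another IP's table.
 */
static enum media_steering_table
pick_media_steering_table(uint32_t media_verx100)
{
	switch (media_verx100) {
	case 1300:
		return MEDIA_TABLE_XELPMP;
	case 2000:
		return MEDIA_TABLE_XE2LPM;
	default:
		return MEDIA_TABLE_UNKNOWN;
	}
}
```

Returning an explicit "unknown" value makes a missing table visible at
probe time, rather than having a newer IP quietly inherit an older one's
ranges.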

>
>>  #define STATELESS_COMPRESSION_CTRL                XE_REG_MCR(0x4148)
>>  #define   UNIFIED_COMPRESSION_FORMAT                REG_GENMASK(3, 0)
>> 
>> @@ -169,6 +170,8 @@
>>  #define XEHP_SLICE_COMMON_ECO_CHICKEN1                XE_REG_MCR(0x731c, XE_REG_OPTION_MASKED)
>>  #define   MSC_MSAA_REODER_BUF_BYPASS_DISABLE        REG_BIT(14)
>> 
>> +#define XE2LPM_CCCHKNREG1                        XE_REG(0x82a8)
>> +
>>  #define VF_PREEMPTION                                XE_REG(0x83a4, XE_REG_OPTION_MASKED)
>>  #define   PREEMPTION_VERTEX_COUNT                REG_GENMASK(15, 0)
>> 
>> @@ -399,6 +402,10 @@
>>  #define SCRATCH1LPFC                                XE_REG(0xb474)
>>  #define   EN_L3_RW_CCS_CACHE_FLUSH                REG_BIT(0)
>> 
>> +#define XE2LPM_L3SQCREG2                        XE_REG_MCR(0xb604)
>> +
>> +#define XE2LPM_L3SQCREG3                        XE_REG_MCR(0xb608)
>> +
>
>These are not marked as MCR in Bspec. Is there something I missed?

I just checked Bspec 71186 again, and the range [0x38B600:0x38B8FF] is
marked as multicast.
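
For reference, the check can be sketched as below. This is only an
illustration under stated assumptions: the struct, table and function names
are hypothetical, the table holds just the one range quoted above (the real
multicast list has more entries), and the 0x380000 offset between
GT-relative and media GT addresses is assumed.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: media GT registers live at GT-relative offset + 0x380000. */
#define MEDIA_GT_GSI_OFFSET	0x380000u

struct mmio_range {
	uint32_t start;
	uint32_t end;	/* inclusive */
};

/* Only the range mentioned in this thread; deliberately incomplete. */
static const struct mmio_range media_mcr_ranges[] = {
	{ 0x38B600, 0x38B8FF },
};

/* Return true if the GT-relative offset lands in a listed multicast range. */
static bool media_reg_is_multicast(uint32_t gt_relative_offset)
{
	uint32_t abs_offset = gt_relative_offset + MEDIA_GT_GSI_OFFSET;
	size_t i;

	for (i = 0; i < sizeof(media_mcr_ranges) / sizeof(media_mcr_ranges[0]); i++) {
		if (abs_offset >= media_mcr_ranges[i].start &&
		    abs_offset <= media_mcr_ranges[i].end)
			return true;
	}

	return false;
}
```

With that single range, 0xb604, 0xb608 and 0xb658 all resolve to addresses
inside it, while 0x82a8 does not.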

--
Gustavo Sousa

>
>>  #define XE2LPM_L3SQCREG5                        XE_REG_MCR(0xb658)
>> 
>>  #define XE2_TDF_CTRL                                XE_REG(0xb418)
>> diff --git a/drivers/gpu/drm/xe/xe_tuning.c b/drivers/gpu/drm/xe/xe_tuning.c
>> index faa1bf42e50e..7a5b852af8d7 100644
>> --- a/drivers/gpu/drm/xe/xe_tuning.c
>> +++ b/drivers/gpu/drm/xe/xe_tuning.c
>> @@ -42,20 +42,44 @@ static const struct xe_rtp_entry_sr gt_tunings[] = {
>>            XE_RTP_ACTIONS(CLR(CCCHKNREG1, ENCOMPPERFFIX),
>>                           SET(CCCHKNREG1, L3CMPCTRL))
>>          },
>> +        { XE_RTP_NAME("Tuning: Compression Overfetch - media"),
>> +          XE_RTP_RULES(MEDIA_VERSION(2000)),
>> +          XE_RTP_ACTIONS(CLR(XE2LPM_CCCHKNREG1, ENCOMPPERFFIX),
>> +                         SET(XE2LPM_CCCHKNREG1, L3CMPCTRL))
>> +        },
>>          { XE_RTP_NAME("Tuning: Enable compressible partial write overfetch in L3"),
>>            XE_RTP_RULES(GRAPHICS_VERSION_RANGE(2001, XE_RTP_END_VERSION_UNDEFINED)),
>>            XE_RTP_ACTIONS(SET(L3SQCREG3, COMPPWOVERFETCHEN))
>>          },
>> +        { XE_RTP_NAME("Tuning: Enable compressible partial write overfetch in L3 - media"),
>> +          XE_RTP_RULES(MEDIA_VERSION(2000)),
>> +          XE_RTP_ACTIONS(SET(XE2LPM_L3SQCREG3, COMPPWOVERFETCHEN))
>> +        },
>>          { XE_RTP_NAME("Tuning: L2 Overfetch Compressible Only"),
>>            XE_RTP_RULES(GRAPHICS_VERSION_RANGE(2001, XE_RTP_END_VERSION_UNDEFINED)),
>>            XE_RTP_ACTIONS(SET(L3SQCREG2,
>>                               COMPMEMRD256BOVRFETCHEN))
>>          },
>> +        { XE_RTP_NAME("Tuning: L2 Overfetch Compressible Only - media"),
>> +          XE_RTP_RULES(MEDIA_VERSION(2000)),
>> +          XE_RTP_ACTIONS(SET(XE2LPM_L3SQCREG2,
>> +                             COMPMEMRD256BOVRFETCHEN))
>> +        },
>>          { XE_RTP_NAME("Tuning: Stateless compression control"),
>>            XE_RTP_RULES(GRAPHICS_VERSION_RANGE(2001, XE_RTP_END_VERSION_UNDEFINED)),
>>            XE_RTP_ACTIONS(FIELD_SET(STATELESS_COMPRESSION_CTRL, UNIFIED_COMPRESSION_FORMAT,
>>                                     REG_FIELD_PREP(UNIFIED_COMPRESSION_FORMAT, 0)))
>>          },
>> +        { XE_RTP_NAME("Tuning: Stateless compression control - media"),
>> +          XE_RTP_RULES(MEDIA_VERSION(2000)),
>> +          XE_RTP_ACTIONS(FIELD_SET(STATELESS_COMPRESSION_CTRL, UNIFIED_COMPRESSION_FORMAT,
>> +                                   REG_FIELD_PREP(UNIFIED_COMPRESSION_FORMAT, 0)))
>> +        },
>> +        { XE_RTP_NAME("Tuning: Stateless compression control - media (Xe2_HPM)"),
>> +          XE_RTP_RULES(MEDIA_VERSION(1301)),
>> +          XE_RTP_ACTIONS(FIELD_SET(XELPMP_STATELESS_COMPRESSION_CTRL, UNIFIED_COMPRESSION_FORMAT,
>> +                                   REG_FIELD_PREP(UNIFIED_COMPRESSION_FORMAT, 0)))
>> +        },
>>          {}
>>  };
>> 
>> --
>> 2.46.1
>

