[Intel-xe] [PATCH 2/2 v3] drm/xe: add gt tuning for indirect state

Matt Roper matthew.d.roper at intel.com
Thu Aug 24 20:35:55 UTC 2023


On Wed, Aug 23, 2023 at 12:55:33PM -0700, Matt Atwood wrote:
> Force indirect state sampler data to only be in the dynamic state pool,
> which is more convenient for the UMD. Behavior change mirrors similar
> change for i915 in commit 16fc9c08f0ec ("drm/i915: disable sampler
> indirect state in bindless heap")
> 
> v2: split out per engine tuning into separate patch, commit message
> (Lucas)
> v3: rebase
> 
> Bspec: 46052

These days there's no real need to put bspec references for register
pages on workarounds like this.  Reviewers with bspec access can already
look up the register directly by name and/or offset, so this isn't
as helpful as it used to be.

> 
> Signed-off-by: Matt Atwood <matthew.s.atwood at intel.com>
> ---
>  drivers/gpu/drm/xe/regs/xe_gt_regs.h | 1 +
>  drivers/gpu/drm/xe/xe_tuning.c       | 5 +++++
>  2 files changed, 6 insertions(+)
> 
> diff --git a/drivers/gpu/drm/xe/regs/xe_gt_regs.h b/drivers/gpu/drm/xe/regs/xe_gt_regs.h
> index aa9d7fad41ee..d039e7afe466 100644
> --- a/drivers/gpu/drm/xe/regs/xe_gt_regs.h
> +++ b/drivers/gpu/drm/xe/regs/xe_gt_regs.h
> @@ -298,6 +298,7 @@
>  #define   ENABLE_SMALLPL			REG_BIT(15)
>  #define   SC_DISABLE_POWER_OPTIMIZATION_EBB	REG_BIT(9)
>  #define   SAMPLER_ENABLE_HEADLESS_MSG		REG_BIT(5)
> +#define   INDIRECT_STATE_BASE_ADDR_OVERRIDE	REG_BIT(0)
>  
>  #define HALF_SLICE_CHICKEN7				XE_REG_MCR(0xe194, XE_REG_OPTION_MASKED)
>  #define   DG2_DISABLE_ROUND_ENABLE_ALLOW_FOR_SSLA	REG_BIT(15)
> diff --git a/drivers/gpu/drm/xe/xe_tuning.c b/drivers/gpu/drm/xe/xe_tuning.c
> index 702cb41dab53..07ffda39e2e4 100644
> --- a/drivers/gpu/drm/xe/xe_tuning.c
> +++ b/drivers/gpu/drm/xe/xe_tuning.c
> @@ -28,6 +28,11 @@ static const struct xe_rtp_entry_sr gt_tunings[] = {
>  };
>  
>  static const struct xe_rtp_entry_sr engine_tunings[] = {
> +	{ XE_RTP_NAME("Tuning: Set Indirect State Override"),
> +	  XE_RTP_RULES(GRAPHICS_VERSION_RANGE(1200, XE_RTP_END_VERSION_UNDEFINED),

This range matches every platform the driver supports, so the version
check isn't really doing any good as is.

However this setting is already the hardware default on Xe2, so 1271
would be a reasonable end version here.

> +		       FUNC(xe_rtp_match_first_render_or_compute)),

This register doesn't exist on platforms like PVC that don't have 3D
functionality (and doesn't really make sense if there's no render).  So
we should probably apply this rule specifically on the render engine
rather than "first render/compute."
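
To illustrate both comments, here is a small standalone sketch of how
the posted rule and the suggested rule would match.  It is only a toy
model: the structs, helpers and the open-ended encoding are made up for
illustration and are not the real xe RTP code.  Only the graphics
versions and the "PVC has no render engine" fact come from the hardware.

```c
#include <stdbool.h>

/*
 * Toy model only: these types and helpers are hypothetical stand-ins,
 * not the real xe RTP implementation.
 */
enum engine_class { ENGINE_RENDER, ENGINE_COMPUTE, ENGINE_COPY };

/* Models an open-ended version range (hypothetical encoding). */
#define END_VERSION_UNDEFINED 0u

struct platform {
	const char *name;
	unsigned int graphics_verx100;	/* e.g. 1255 for graphics version 12.55 */
	bool has_render;		/* false for compute-only parts like PVC */
};

struct rule {
	unsigned int ver_start;		/* inclusive */
	unsigned int ver_end;		/* inclusive, or END_VERSION_UNDEFINED */
	bool render_only;		/* false models "first render or compute" */
};

const struct platform tgl = { "TGL", 1200, true };
const struct platform dg2 = { "DG2", 1255, true };
const struct platform pvc = { "PVC", 1260, false };
const struct platform mtl = { "MTL", 1271, true };
const struct platform lnl = { "LNL (Xe2)", 2004, true };

/* Rule as posted in v3, and the rule with both review comments applied. */
const struct rule posted_rule = { 1200, END_VERSION_UNDEFINED, false };
const struct rule suggested_rule = { 1200, 1271, true };

/* True if the platform's graphics version falls inside the rule's range. */
bool version_matches(const struct rule *r, const struct platform *p)
{
	if (p->graphics_verx100 < r->ver_start)
		return false;
	return r->ver_end == END_VERSION_UNDEFINED ||
	       p->graphics_verx100 <= r->ver_end;
}

/*
 * Render-only rules match the render class.  Otherwise model "first
 * render or compute": instance 0 of the render class if the platform
 * has one, else instance 0 of the compute class.
 */
bool engine_matches(const struct rule *r, const struct platform *p,
		    enum engine_class cls, unsigned int instance)
{
	if (r->render_only)
		return cls == ENGINE_RENDER;
	if (p->has_render)
		return cls == ENGINE_RENDER && instance == 0;
	return cls == ENGINE_COMPUTE && instance == 0;
}

/* A rule applies only when both the version and the engine match. */
bool rule_applies(const struct rule *r, const struct platform *p,
		  enum engine_class cls, unsigned int instance)
{
	return version_matches(r, p) && engine_matches(r, p, cls, instance);
}
```

With this model, posted_rule applies on every platform listed,
including the first compute engine on PVC and the render engine on Xe2,
while suggested_rule only applies to render engines on 12.00 through
12.71.  In the driver's own table syntax that would presumably be
XE_RTP_RULES(GRAPHICS_VERSION_RANGE(1200, 1271), ENGINE_CLASS(RENDER)),
but treat that as an untested sketch.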


Matt

> +	  XE_RTP_ACTIONS(SET(SAMPLER_MODE, INDIRECT_STATE_BASE_ADDR_OVERRIDE))
> +	},
>  	{}
>  };
>  
> -- 
> 2.40.1
> 

-- 
Matt Roper
Graphics Software Engineer
Linux GPU Platform Enablement
Intel Corporation

