[PATCH v2 2/2] drm/i915/gt: Enable only one CCS for compute workload
Matt Roper
matthew.d.roper at intel.com
Tue Feb 20 23:39:18 UTC 2024
On Tue, Feb 20, 2024 at 03:35:26PM +0100, Andi Shyti wrote:
> Enable only one CCS engine by default with all the compute slices
> allocated to it.
>
> While generating the list of UABI engines to be exposed to the
> user, exclude any additional CCS engines beyond the first
> instance.
>
> This change can be tested with igt i915_query.
>
> Fixes: d2eae8e98d59 ("drm/i915/dg2: Drop force_probe requirement")
> Signed-off-by: Andi Shyti <andi.shyti at linux.intel.com>
> Cc: Chris Wilson <chris.p.wilson at linux.intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen at linux.intel.com>
> Cc: Matt Roper <matthew.d.roper at intel.com>
> Cc: <stable at vger.kernel.org> # v6.2+
> ---
> drivers/gpu/drm/i915/gt/intel_engine_user.c | 9 +++++++++
> drivers/gpu/drm/i915/gt/intel_gt.c | 11 +++++++++++
> drivers/gpu/drm/i915/gt/intel_gt_regs.h | 2 ++
> drivers/gpu/drm/i915/i915_query.c | 1 +
> 4 files changed, 23 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_user.c b/drivers/gpu/drm/i915/gt/intel_engine_user.c
> index 833987015b8b..7041acc77810 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine_user.c
> +++ b/drivers/gpu/drm/i915/gt/intel_engine_user.c
> @@ -243,6 +243,15 @@ void intel_engines_driver_register(struct drm_i915_private *i915)
> if (engine->uabi_class == I915_NO_UABI_CLASS)
> continue;
>
> + /*
> + * Do not list and do not count CCS engines other than the first
> + */
> + if (engine->uabi_class == I915_ENGINE_CLASS_COMPUTE &&
> + engine->uabi_instance > 0) {
> + i915->engine_uabi_class_count[engine->uabi_class]--;
> + continue;
> + }
Wouldn't it be simpler to just add a workaround to the end of
engine_mask_apply_compute_fuses() if we want to ensure only a single
compute engine gets exposed? Then both the driver internals and uapi
will agree that there's just one CCS (and on which one it is).
If we want to do something fancy with "hotplugging" a new engine later
on or whatever, that can be handled in a future series (although as
noted on the previous patch, it sounds like these changes might not
actually be aligned with the workaround we were trying to address).
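To illustrate the suggestion above, here is a minimal standalone sketch of
trimming a compute engine mask down to its first present instance. It is not
the driver's code: keep_first_ccs() and the 4-bit mask layout (bit N set means
CCSN survived fusing) are hypothetical names and assumptions made only for
this example.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of i915: given a mask of the CCS
 * instances that survived fusing (bit N == CCSN present), keep only
 * the lowest-numbered one.
 */
static uint32_t keep_first_ccs(uint32_t ccs_mask)
{
	/* Assumption for this sketch: at most four CCS instances. */
	assert((ccs_mask & ~0xFu) == 0);

	/* x & -x isolates the lowest set bit; an empty mask stays empty. */
	return ccs_mask & (~ccs_mask + 1u);
}
```

With a mask trimmed this way at fuse-handling time, nothing later in engine
setup would need a special case for extra compute instances.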
> +
> rb_link_node(&engine->uabi_node, prev, p);
> rb_insert_color(&engine->uabi_node, &i915->uabi_engines);
>
> diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/intel_gt.c
> index a425db5ed3a2..e19df4ef47f6 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gt.c
> +++ b/drivers/gpu/drm/i915/gt/intel_gt.c
> @@ -168,6 +168,14 @@ static void init_unused_rings(struct intel_gt *gt)
> }
> }
>
> +static void intel_gt_apply_ccs_mode(struct intel_gt *gt)
> +{
> + if (!IS_DG2(gt->i915))
> + return;
> +
> + intel_uncore_write(gt->uncore, XEHP_CCS_MODE, 0);
This doesn't look right to me. A value of 0 means every cslice gets
associated with CCS0. On a DG2-G11 platform, that will flat out break
compute since CCS0 is never present (G11 only has a single CCS and it's
always the hardware's CCS1). Even on a G10 or G12 this could also break
things depending on the fusing of your card if the hardware CCS0 happens
to be missing.
Also, the register says that we need a field value of 0x7 for each
cslice that's fused off. By passing 0, we're telling the CCS engine
that it can use cslices that may not actually exist.
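As a rough sketch of the two points above, the value could be built from the
fusing information instead of being hardcoded to 0. Everything here is
illustrative: the function name, the 3-bit field width and the limit of four
cslices are assumptions made for the example, not definitions taken from the
driver or quoted from the register description. Only the 0x7 "fused off"
value comes from the comment above.

```c
#include <assert.h>
#include <stdint.h>

#define CCS_MODE_CSLICE_WIDTH	3	/* assumed bits per cslice field */
#define CCS_MODE_CSLICE_UNUSED	0x7u	/* field value for a fused-off cslice */
#define MAX_CSLICES		4	/* assumed number of cslice fields */

/*
 * Hypothetical helper: route every present cslice to the first present
 * CCS and mark every fused-off cslice as unused.
 */
static uint32_t ccs_mode_single_engine(uint32_t cslice_mask, uint32_t ccs_mask)
{
	uint32_t mode = 0;
	unsigned int ccs = 0, cslice;

	/* At least one CCS must exist; both masks are four bits wide here. */
	assert(ccs_mask != 0 && (ccs_mask & ~0xFu) == 0);
	assert((cslice_mask & ~0xFu) == 0);

	/* Find the lowest-numbered CCS that is actually present. */
	while (!(ccs_mask & (1u << ccs)))
		ccs++;

	for (cslice = 0; cslice < MAX_CSLICES; cslice++) {
		uint32_t field = (cslice_mask & (1u << cslice)) ?
				 ccs : CCS_MODE_CSLICE_UNUSED;

		mode |= field << (cslice * CCS_MODE_CSLICE_WIDTH);
	}

	return mode;
}
```

Under these assumptions, a part with only cslice 0 and only CCS1 present gives
0xff9 (cslice 0 routed to CCS1, the other three fields set to 0x7), and 0 is
produced only when all four cslices and CCS0 are present.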
> +}
> +
> int intel_gt_init_hw(struct intel_gt *gt)
> {
> struct drm_i915_private *i915 = gt->i915;
> @@ -195,6 +203,9 @@ int intel_gt_init_hw(struct intel_gt *gt)
>
> intel_gt_init_swizzling(gt);
>
> + /* Configure CCS mode */
> + intel_gt_apply_ccs_mode(gt);
> +
> /*
> * At least 830 can leave some of the unused rings
> * "active" (ie. head != tail) after resume which
> diff --git a/drivers/gpu/drm/i915/gt/intel_gt_regs.h b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
> index cf709f6c05ae..c148113770ea 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gt_regs.h
> +++ b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
> @@ -1605,6 +1605,8 @@
> #define GEN12_VOLTAGE_MASK REG_GENMASK(10, 0)
> #define GEN12_CAGF_MASK REG_GENMASK(19, 11)
>
> +#define XEHP_CCS_MODE _MMIO(0x14804)
Nitpick: this doesn't seem to be in the proper place and also breaks
the file's convention of using tabs to move over to column 48 for the
definition value.
Matt
> +
> #define GEN11_GT_INTR_DW(x) _MMIO(0x190018 + ((x) * 4))
> #define GEN11_CSME (31)
> #define GEN12_HECI_2 (30)
> diff --git a/drivers/gpu/drm/i915/i915_query.c b/drivers/gpu/drm/i915/i915_query.c
> index 3baa2f54a86e..d5a5143971f5 100644
> --- a/drivers/gpu/drm/i915/i915_query.c
> +++ b/drivers/gpu/drm/i915/i915_query.c
> @@ -124,6 +124,7 @@ static int query_geometry_subslices(struct drm_i915_private *i915,
> return fill_topology_info(sseu, query_item, sseu->geometry_subslice_mask);
> }
>
> +
> static int
> query_engine_info(struct drm_i915_private *i915,
> struct drm_i915_query_item *query_item)
> --
> 2.43.0
>
--
Matt Roper
Graphics Software Engineer
Linux GPU Platform Enablement
Intel Corporation