[PATCH v2 2/2] drm/i915/gt: Enable only one CCS for compute workload
Matt Roper
matthew.d.roper at intel.com
Wed Feb 21 20:51:04 UTC 2024
On Wed, Feb 21, 2024 at 01:12:18AM +0100, Andi Shyti wrote:
> Hi Matt,
>
> thanks a lot for looking into this.
>
> On Tue, Feb 20, 2024 at 03:39:18PM -0800, Matt Roper wrote:
> > On Tue, Feb 20, 2024 at 03:35:26PM +0100, Andi Shyti wrote:
>
> [...]
>
> > > diff --git a/drivers/gpu/drm/i915/gt/intel_engine_user.c b/drivers/gpu/drm/i915/gt/intel_engine_user.c
> > > index 833987015b8b..7041acc77810 100644
> > > --- a/drivers/gpu/drm/i915/gt/intel_engine_user.c
> > > +++ b/drivers/gpu/drm/i915/gt/intel_engine_user.c
> > > @@ -243,6 +243,15 @@ void intel_engines_driver_register(struct drm_i915_private *i915)
> > > if (engine->uabi_class == I915_NO_UABI_CLASS)
> > > continue;
> > >
> > > + /*
> > > + * Do not list and do not count CCS engines other than the first
> > > + */
> > > + if (engine->uabi_class == I915_ENGINE_CLASS_COMPUTE &&
> > > + engine->uabi_instance > 0) {
> > > + i915->engine_uabi_class_count[engine->uabi_class]--;
> > > + continue;
> > > + }
> >
> > Wouldn't it be simpler to just add a workaround to the end of
> > engine_mask_apply_compute_fuses() if we want to ensure only a single
> > compute engine gets exposed? Then both the driver internals and uapi
> > will agree that's there's just one CCS (and on which one there is).
> >
> > If we want to do something fancy with "hotplugging" a new engine later
> > on or whatever, that can be handled in the future series (although as
> > noted on the previous patch, it sounds like these changes might not
> > actually be aligned with the workaround we were trying to address).
>
> The hotplugging capability is one of the features I was looking
> for, actually.
>
> I have done some more refactoring in this piece of code in
> upcoming patches.
>
> Will check, though, if I can do something with compute_fuses(),
> even though, the other cslices are not really fused off (read
> below).
>
> > > +
> > > rb_link_node(&engine->uabi_node, prev, p);
> > > rb_insert_color(&engine->uabi_node, &i915->uabi_engines);
> > >
> > > diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/intel_gt.c
> > > index a425db5ed3a2..e19df4ef47f6 100644
> > > --- a/drivers/gpu/drm/i915/gt/intel_gt.c
> > > +++ b/drivers/gpu/drm/i915/gt/intel_gt.c
> > > @@ -168,6 +168,14 @@ static void init_unused_rings(struct intel_gt *gt)
> > > }
> > > }
> > >
> > > +static void intel_gt_apply_ccs_mode(struct intel_gt *gt)
> > > +{
> > > + if (!IS_DG2(gt->i915))
> > > + return;
> > > +
> > > + intel_uncore_write(gt->uncore, XEHP_CCS_MODE, 0);
> >
> > This doesn't look right to me. A value of 0 means every cslice gets
> > associated with CCS0.
>
> Yes, that's what I'm trying to do. The behavior I'm looking for
> is this one:
>
> /*
> ...
> * With 1 engine (ccs0):
> * slice 0, 1, 2, 3: ccs0
> *
> * With 2 engines (ccs0, ccs1):
> * slice 0, 2: ccs0
> * slice 1, 3: ccs1
> *
> * With 4 engines (ccs0, ccs1, ccs2, ccs3):
> * slice 0: ccs0
> * slice 1: ccs1
> * slice 2: ccs2
> * slice 3: ccs3
> ...
> */
>
> where the user can configure runtime the mode, making sure that
> no client is connected to i915.
>
> But, this needs to be written
>
> As we are now forcing mode '1', then all cslices are connected
> with ccs0.
Right --- and that's what I'm pointing out as illegal. I think that
code comment above was taken out of context from a different RFC series;
that's not an accurate description of the behavior we want here.
First, the above comment is using ccs# to refer to userspace engines,
not hardware engines. As a simple example, DG2-G11 only ever has a
single CCS which userspace sees as "instance 0" but which is actually
CCS1 at the hardware level. If you try to follow the comment above when
programming CCS_MODE, you've assigned all of the cslices to a
non-existent engine and assigned no cslices to the CCS engine that
actually exists. For DG2-G10 (and I think DG2-G12), there are different
combinations of fused-off / not-fused-off engines that will always show
up in userspace as CCS0-CCSn, even if those don't match the hardware
IDs.
Second, the above comment is assuming that you have a part with a
maximum fusing config (i.e., all cslices present). Using DG2-G11 again
as an example, there's also only a single cslice (cslice1), so if you
tell CCS1 that it's allowed to use EUs from non-existent cslice0,
cslice2, and cslice3, you might not get the behavior you were hoping
for.
>
> > On a DG2-G11 platform, that will flat out break
> > compute since CCS0 is never present (G11 only has a single CCS and it's
> > always the hardware's CCS1). Even on a G10 or G12 this could also break
> > things depending on the fusing of your card if the hardware CCS0 happens
> > to be missing.
> >
> > Also, the register says that we need a field value of 0x7 for each
> > cslice that's fused off. By passing 0, we're telling the CCS engine
> > that it can use cslices that may not actually exist.
>
> does it? Or do I need to write the id (0x0-0x3) of the user
> engine? That's how the mode is calculated in this algorithm.
0x0 - 0x3 are how you specify that a specific CCS engine can use the
cslice. If the cslice can't be used at all (i.e., it's fused off), then
you need to program a 0x7 to ensure no engines try to use the
non-existent DSS/EUs.
Matt
>
> > > +}
> > > +
>
> [...]
>
> > > diff --git a/drivers/gpu/drm/i915/gt/intel_gt_regs.h b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
> > > index cf709f6c05ae..c148113770ea 100644
> > > --- a/drivers/gpu/drm/i915/gt/intel_gt_regs.h
> > > +++ b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
> > > @@ -1605,6 +1605,8 @@
> > > #define GEN12_VOLTAGE_MASK REG_GENMASK(10, 0)
> > > #define GEN12_CAGF_MASK REG_GENMASK(19, 11)
> > >
> > > +#define XEHP_CCS_MODE _MMIO(0x14804)
> >
> > Nitpick: this doesn't seem to be in the proper place and also breaks
> > the file's convention of using tabs to move over to column 48 for the
> > definition value.
>
> This was something I actually forgot to fix. Thanks!
--
Matt Roper
Graphics Software Engineer
Linux GPU Platform Enablement
Intel Corporation
More information about the dri-devel
mailing list