[PATCH 1/4] drm/xe: Fix GT "for each engine" workarounds
Matt Roper
matthew.d.roper at intel.com
Thu Feb 27 17:35:01 UTC 2025
On Wed, Feb 26, 2025 at 10:47:52PM -0600, Lucas De Marchi wrote:
> On Wed, Feb 26, 2025 at 01:59:44PM -0800, Matt Roper wrote:
> > On Wed, Feb 26, 2025 at 11:20:16AM -0600, Lucas De Marchi wrote:
> > > On Wed, Feb 26, 2025 at 11:21:03AM +0000, Tvrtko Ursulin wrote:
> > > > Any rules using engine matching are currently broken due RTP processing
> > > > happening too in early init, before the list of hardware engines has been
> > > > initialised.
> > > >
> > > > Fix this by moving workaround processing to later in the driver probe
> > > > sequence, to just before the processed list is used for the first time.
> > > >
> > > > Looking at the debugfs gt0/workarounds on ADL-P we notice 14011060649
> > > > should be present while we see, before:
> > > >
> > > > GT Workarounds
> > > > 14011059788
> > > > 14015795083
> > > >
> > > > And with the patch:
> > > >
> > > > GT Workarounds
> > > > 14011060649
> > > > 14011059788
> > > > 14015795083
> > > >
> > > > Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin at igalia.com>
> > > > Cc: Lucas De Marchi <lucas.demarchi at intel.com>
> > > > Cc: Matt Roper <matthew.d.roper at intel.com>
> > >
> > > Cc: <stable at vger.kernel.org> # v6.11+
> > > Reviewed-by: Lucas De Marchi <lucas.demarchi at intel.com>
> > >
> > > The version in Cc stable is because it's what matters for officially
> > > supported HW by xe.
> > >
> > > We've been using the reset domain to decide if a WA is an engine vs gt
> > > workaround. Although I'm thinking if it wouldn't be better to simply
> > > drop the FOREACH_ENGINE processing and move those workarounds to be
> > > engine workarounds. Matt, any thoughts?
> >
> > That's a common mistake we've made in the past --- it doesn't work. If
> > you try to put a bunch of these registers onto the GuC's save/restore
> > list as happens with engine workarounds, the GuC will throw an assertion
> > and fail to load because it considers them to be outside the very narrow
> > set of register ranges it's allowed to work with.
>
> but that's a GuC issue that we would try to fix? at least for new
> platforms. if the register is in the mmio space for
> that engine, why would GuC do that?
No, that's intentional behavior. Just because something is inside a
register's MMIO space doesn't mean the GuC should be re-programming it
on resets. The proper behavior for re-applying workarounds really comes
down to the register's reset behavior (whether it's wiped by resets or
not) and that's based on the sub-unit of hardware it lives inside, not
where it falls in the MMIO space. If the KMD asks the GuC to reprogram
something that it shouldn't be, that's considered a KMD bug and the
GuC's assertion is there to catch that. It's a safety net to make sure
the GuC isn't accidentally touching parts of hardware it has no need to.
I might be misremembering, but I seem to recall that there may even be
some hardware enforcement to prevent the GuC microcontroller from
touching such register ranges on certain platforms.
Matt
>
> Lucas De Marchi
>
> >
> > So no, unfortunately these do need to remain "GT" workarounds, even
> > though they need to be applied to registers that have per-engine
> > instances. :-(
> >
> >
> > Matt
> >
> > >
> > > thanks
> > > Lucas De Marchi
> > >
> > > > ---
> > > > drivers/gpu/drm/xe/xe_gt.c | 4 ++--
> > > > 1 file changed, 2 insertions(+), 2 deletions(-)
> > > >
> > > > diff --git a/drivers/gpu/drm/xe/xe_gt.c b/drivers/gpu/drm/xe/xe_gt.c
> > > > index 650a0ee56e97..d59c03bc05b7 100644
> > > > --- a/drivers/gpu/drm/xe/xe_gt.c
> > > > +++ b/drivers/gpu/drm/xe/xe_gt.c
> > > > @@ -361,9 +361,7 @@ int xe_gt_init_early(struct xe_gt *gt)
> > > > if (err)
> > > > return err;
> > > >
> > > > - xe_wa_process_gt(gt);
> > > > xe_wa_process_oob(gt);
> > > > - xe_tuning_process_gt(gt);
> > > >
> > > > xe_force_wake_init_gt(gt, gt_to_fw(gt));
> > > > spin_lock_init(>->global_invl_lock);
> > > > @@ -450,6 +448,8 @@ static int all_fw_domain_init(struct xe_gt *gt)
> > > > }
> > > >
> > > > xe_gt_mcr_set_implicit_defaults(gt);
> > > > + xe_wa_process_gt(gt);
> > > > + xe_tuning_process_gt(gt);
> > > > xe_reg_sr_apply_mmio(>->reg_sr, gt);
> > > >
> > > > err = xe_gt_clock_init(gt);
> > > > --
> > > > 2.48.0
> > > >
> >
> > --
> > Matt Roper
> > Graphics Software Engineer
> > Linux GPU Platform Enablement
> > Intel Corporation
--
Matt Roper
Graphics Software Engineer
Linux GPU Platform Enablement
Intel Corporation
More information about the Intel-xe
mailing list