[Intel-xe] [PATCH] drm/xe: Move wa processing to gt_init

Matthew Brost matthew.brost at intel.com
Wed Nov 29 18:24:04 UTC 2023


On Wed, Nov 29, 2023 at 04:57:20PM -0800, Matt Roper wrote:
> On Tue, Nov 28, 2023 at 07:07:39PM -0600, Lucas De Marchi wrote:
> > On Tue, Nov 28, 2023 at 01:55:40PM +0000, Matthew Brost wrote:
> > > On Tue, Nov 28, 2023 at 11:45:36AM -0600, Lucas De Marchi wrote:
> > > > On Tue, Nov 28, 2023 at 08:47:59AM +0000, Matthew Brost wrote:
> > > > > On Tue, Nov 21, 2023 at 08:48:47AM -0600, Lucas De Marchi wrote:
> > > > > > On Tue, Nov 21, 2023 at 04:55:55PM +0530, Tejas Upadhyay wrote:
> > > > > > > Currently WA processing for gt is happening in early init API.
> > > > > > > It also includes processing of WAs which needs filter based on
> > > > > > > hardware engine by looping over number of hardware engines in gt.
> > > > > > >
> > > > > > > However at early init stage hardware engines are not associated
> > > > > > > with gt in software which skips processing such WAs.
> > > > > > >
> > > > > > > Example WAs which will be fixed to apply after this change,
> > > > > > > "14011060649",
> > > > > > > "16010515920",
> > > > > > > "16021867713"
> > > > > > >
> > > > > > > Fixes: 811a9b3d262d8 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
> > > > > > > Signed-off-by: Tejas Upadhyay <tejas.upadhyay at intel.com>
> > > > > > > ---
> > > > > > > drivers/gpu/drm/xe/xe_gt.c | 2 +-
> > > > > > > 1 file changed, 1 insertion(+), 1 deletion(-)
> > > > > > >
> > > > > > > diff --git a/drivers/gpu/drm/xe/xe_gt.c b/drivers/gpu/drm/xe/xe_gt.c
> > > > > > > index 0dddb751c6a4..3d0f7b6f952c 100644
> > > > > > > --- a/drivers/gpu/drm/xe/xe_gt.c
> > > > > > > +++ b/drivers/gpu/drm/xe/xe_gt.c
> > > > > > > @@ -311,7 +311,6 @@ int xe_gt_init_early(struct xe_gt *gt)
> > > > > > > 	if (err)
> > > > > > > 		return err;
> > > > > > >
> > > > > > > -	xe_wa_process_gt(gt);
> > > > > > > 	xe_wa_process_oob(gt);
> > > > > > > 	xe_tuning_process_gt(gt);
> > > > > > >
> > > > > > > @@ -495,6 +494,7 @@ int xe_gt_init(struct xe_gt *gt)
> > > > > > > 	if (err)
> > > > > > > 		return err;
> > > > > > >
> > > > > > > +	xe_wa_process_gt(gt);
> > > > > >
> > > > > > the problem with that is that if we need to touch any register already
> > > > > > touched until this point, the workarounds would not had been applied.
> > > > > > Digging the git history, I had added this commit exactly to be able to
> > > > > > initialize the gt workarounds as soon as possible:
> > > > > >
> > > > > > 	4eb2e8d25ab4 ("drm/xe: Add xe_hw_engines_init_early()")
> > > > > >
> > > > > > And then
> > > > > >
> > > > > > 	816d37907ac6 ("drm/xe: Plug gt WAs to gt init/reset")
> > > > > >
> > > > > > Note that these commits are in the xe branch, not drm-xe-next.
> > > > > > So, it was:
> > > > > >
> > > > > > 	xe_hw_engines_init_early(gt);
> > > > >
> > > > > xe_hw_engines_init_early has be called after xe_uc_init_hwconfig after
> > > > 
> > > > did you mean "has to be" or "has been"?
> > > > 
> > > 
> > > Sorry, "has to be".
> > > 
> > > > > the idea is gt->info.engine_mask would be set by reading the hwconfig
> > > > > table from the GuC rather than from a hardcoded table in the driver.
> > > > 
> > > > like I said, the problem with postponing that is we could have hard to
> > > > debug issues because we will think we have all the workarounds coded,
> > > > not realizing it isn't applied to the early stages of hw initialization.
> > > > 
> > > > I think we should go back push back on the idea of initializing the hw
> > > > engines array after loading guc... it doesn't make much sense from the
> > > > engine initialization pov and we are *not* doing that.
> > > > 
> > > > static int gt_fw_domain_init(struct xe_gt *gt)
> > > > {
> > > > 	...
> > > > 	err = xe_uc_init_hwconfig(&gt->uc);                                   	if
> > > > (err)                                                              		goto
> > > > err_force_wake;
> > > > 
> > > > 	xe_gt_idle_sysfs_init(&gt->gtidle);
> > > > 
> > > > 	/* XXX: Fake that we pull the engine mask from hwconfig blob */
> > > > 	gt->info.engine_mask = gt->info.__engine_mask;
> > > 
> > > Right, we don't currently parse the hwconfig to get the engine_mask but
> > > eventually they should be.
> > > 
> > > > 
> > > > 	...
> > > > }
> > > > 
> > > > The engines available are not hardcoded in the driver. We read the fuses
> > > 
> > > The initial value of info.engine_mask is hardcoded in the driver, the
> > > fuses adjust this. Eventual I think we want to get rid of all platform
> > 
> > but it doesn't matter much. Even if we were to read from the hwconfig,
> > we would still have gt->info.engine_mask with the upper limit and then
> > use the hwconfig to trim it down. How different is it from doing the
> > same thing, but with the fuses? We don't do that today and keep a trim
> > down engine list, but equally we don't use the hwconfig on
> > this part of the initialization. The hwconfig could be useful for
> > something else, but I don't see how it helps here.
> > 
> > > information hardcoded in the driver and dervive info.engine_mask from
> > > the hwconfig. At least that is the goal from my understanding.
> 
> I think this has been mentioned in the past, but there's no expectation
> that we'll ever be able to read engine masks from hwconfig.  In general
> the hwconfig is never supposed to provide device fusing details.  You
> _might_ be able to get counts (e.g., "this SKU should have 3 BCS
> engines") but it will never tell you which specific engine instances are
> the ones that are available.  That information is always expected to
> come from the fuse registers instead.  The same is true for other
> hardware concepts like DSS, EUs, L3 banks, etc.
> 
> The engine masks we define in the device info are the set of engines
> that might ever show up on the platform.  The full set of engines might
> never be available on any SKU of the platform, but the constant engine
> list serves as the starting point which you then pare down by reading
> the appropriate fuse registers.
> 

I think GuC CT channel is needed to read fuses on a VF so for that use
case. So in this case xe_hw_engines_init_early needs to be called after
GuC is loaded with the hwconfig.

Also BTW, maybe we should shelve this patch for now and respin it once
[1][2] is merged as bunch of driver of load is getting reworked by
Michal.

[1] https://patchwork.freedesktop.org/patch/569627/?series=126392&rev=4
[2] https://patchwork.freedesktop.org/series/126392/

> 
> Matt
> 
> > > 
> > > > to know the avilable ones. That is sufficient already to loop through all
> > > > of them. Note that since it's *engine* WAs, we need to pass those to GuC
> > > > via the ads since it's GuC itself that does the save and restore.
> > > > 
> > > 
> > > Yes, but not for hwconfig load. We can load the GuC to read hwconfig and
> > > then properly populate engine WAs list when the GuC is loaded for
> > > submission.
> > > 
> > > > > Likewise for_each_hw_engine should only be called after
> > > > > xe_hw_engines_init_early is called. With that, I think this patch is
> > > > > correct. Does that make sense?
> > > > 
> > > > as per above, no. I think we are overcomplicating it for no real
> > > > benefit.
> > > > 
> > > 
> > > I agree this is complicated but I thought this ordering / reading the
> > > platform information from hwconfig was requirement. Maybe we should
> > > follow up to check if this is indeed a requirement.
> > 
> > What happens when we need a gt workaround to be applied even before the
> > first GuC load? This patch moves all the GT workarounds to happen after
> > the first load so it potentially creates issues if any of the current WAs
> > in gt_was[] are needed before that point.
> >

I'd hope that isn't ever something we have to deal with... If it is,
probably just special case in the hwconfig load.
 
> > The alternative, adding yet another WA type for when we need to decide
> > on the WA by looking at the engines it has, doesn't look pretty and also
> > complicates the decision on where to add new workarounds. When Tejas
> > asked about the possible solutions and I mentioned making the early
> > stuff happen early, I was not thinking about moving the WA processing
> > after the hwconfig, I was thinking about initializing the hw engines
> > array earlier like it was before.
> > 
> > I think the possible things to do are:
> > 
> > 1) move the engine mask to be initialized earlier like it was and stop
> > pretending we are getting it from hwconfig
> > 
> > 2) what this patch does, moving the gt WAs processing after the first
> > guc load
> > 
> > 3) split the gt_was[] - all the WAs that need to check the engine in the
> > GT would need to be applied later
> >

This is probably right option, per the VF use case.

Matt B
 
> > 4) ... something else?
> > 
> > I'd say my preference is (1), (3), (2).
> > 
> > > 
> > > > >
> > > > > I suggested to Tejas that we add an assert to for_each_hw_engine to
> > > > > enforce this is called after xe_hw_engines_init_early to catch issues
> > > > > like this too.
> > > > 
> > > > I think that is already clear. Not sure it deserves an assert. The
> > > > assert would for instance trigger right now in calculate_regset_size(),
> > > > but that is harmless. Also we'd have it spread through the driver.
> > > > 
> > > 
> > > Well an assert like this would caught this bug quite a bit early.
> > 
> > ok, fair enough on this one.... as long as it's behind the debug
> > kconfig, it should be fine.
> > 
> > > 
> > > Matt
> > > 
> > > > Lucas De Marchi
> > > > 
> > > > >
> > > > > > 	...
> > > > > > 	xe_wa_process_gt();
> > > > > >
> > > > > > That gt_fw_domain_init() seems a very poor place to initialize these
> > > > > > things now. It started with commit b892e30c6cd7 ("drm/xe: Move load of
> > > > > > minimal GuC and hwconfig read much earlier") and then people end up
> > > > > > adding random stuff in that function :(
> > > > > >
> > > > > > So I think we need to get back to the original idea of _early()
> > > > > > initializing what's needed. Then we'd probably need a few cleanups in
> > > > > > the initialization sequence so the function names at least partially
> > > > > > matches what's being initialized.
> > > > > >
> > > > >
> > > > > I agree on that the naming of all of this could be better...
> > > > >
> > > > > Matt
> > > > >
> > > > > > Lucas De Marchi
> > > > > >
> > > > > > > 	xe_force_wake_init_engines(gt, gt_to_fw(gt));
> > > > > > >
> > > > > > > 	err = all_fw_domain_init(gt);
> > > > > > > --
> > > > > > > 2.25.1
> > > > > > >
> 
> -- 
> Matt Roper
> Graphics Software Engineer
> Linux GPU Platform Enablement
> Intel Corporation


More information about the Intel-xe mailing list