[PATCH 3/9] drm/xe/xe_reg_sr: Always call xe_force_wake_put/get in pairs
Rodrigo Vivi
rodrigo.vivi at intel.com
Thu Jun 6 17:46:51 UTC 2024
On Wed, Jun 05, 2024 at 12:49:42PM +0200, Nirmoy Das wrote:
>
> On 6/4/2024 11:13 PM, Lucas De Marchi wrote:
> > On Tue, Jun 04, 2024 at 04:21:25PM GMT, Nirmoy Das wrote:
> > > Hi Michal,
> > >
> > > On 6/4/2024 3:12 PM, Michal Wajdeczko wrote:
> > > >
> > > > On 04.06.2024 13:02, Nirmoy Das wrote:
> > > > > xe_force_wake_get() increments the domain ref regardless of success
> > > > > or failure so call xe_force_wake_put() even on failure to keep ref
> > > > > count accurate.
> > > > >
> > > > > Signed-off-by: Nirmoy Das<nirmoy.das at intel.com>
> > > > > ---
> > > > > drivers/gpu/drm/xe/xe_reg_sr.c | 20 ++++++++------------
> > > > > 1 file changed, 8 insertions(+), 12 deletions(-)
> > > > >
> > > > > diff --git a/drivers/gpu/drm/xe/xe_reg_sr.c
> > > > > b/drivers/gpu/drm/xe/xe_reg_sr.c
> > > > > index 440ac572f6e5..8fcc08658d89 100644
> > > > > --- a/drivers/gpu/drm/xe/xe_reg_sr.c
> > > > > +++ b/drivers/gpu/drm/xe/xe_reg_sr.c
> > > > > @@ -201,13 +201,11 @@ void xe_reg_sr_apply_mmio(struct
> > > > > xe_reg_sr *sr, struct xe_gt *gt)
> > > > > xa_for_each(&sr->xa, reg, entry)
> > > > > apply_one_mmio(gt, entry);
> > > > > - err = xe_force_wake_put(>->mmio.fw, XE_FORCEWAKE_ALL);
> > > > > - XE_WARN_ON(err);
> > > > > -
> > > > > - return;
> > > > > -
> > > > > err_force_wake:
> > > > > - xe_gt_err(gt, "Failed to apply, err=%d\n", err);
> > > > > + if (err)
> > > > > + xe_gt_err(gt, "Failed to whitelist %s registers, err=%d\n",
> > > > > + sr->name, err);
> > > > > + XE_WARN_ON(xe_force_wake_put(>->mmio.fw, XE_FORCEWAKE_ALL));
> > > > what's the rule whether to WARN about the failed xe_force_wake_put() or
> > > > not ? should we also WARN when xe_force_wake_get() failed ? does
> > > > it make
> > > > sense to WARN about put() if get() already failed ?
> > > We don't have one yet.
> > > >
> > > > maybe simpler solution would be make function xe_force_wake_put() void
> > > > as it almost nothing that caller can do and move WARN there if needed ?
> > >
> > > We have code that does "return xe_force_wake_put()" So question is
> > > what shall we do
> > >
> > > if xe_force_wake_get() worked but subsequent xe_force_wake_put() fails ?
> >
> > there's not much can be done here. I even don't think there's a reason
> > to wait on ack for force_wake put... in most of the cases.
> > +Rodrigo, what exactly are we protecting against there?
> >
> > If that xe_mmio_wait32() fails, driver is toast. The reason we check (or
> > should check) the return on _get() is because we want to bail out
> > whatever we are starting to do rather than attempting it and handling
> > bogus values of an IP that is not awaken.
> >
> > So, maybe what could be done is
> >
> > a) XE_WARN_ON(xe_force_wake_put()) -> xe_force_wake_put()
> >
> > We cann probably move the WARN_ON() inside xe_force_wake_put()
> > if we want to maintain the behavior in most places
> >
> > Then make xe_force_wake_put() void as suggested here.
yeap, I believe we have a rough consensus here that put should be void function
and perhaps (likely?!) with some warn inside it.
> >
> > b) I'm not sure about the benefit of the weirdness of
> > xe_force_wake_get() incrementing the ref even if we failed to wait for
> > ack. The only thing the caller is going to do is to call
> > xe_force_wake_put() and bail out. Why are we not unwinding then?
>
> With unwinding of ref count and partial woken domains on error, we should be
> able do
>
>
> if (xe_force_wake_get(fw, d)) {
> ...
> xe_force_wake_put(fw, d);
>
> }
>
> I will let Rodrigo suggest how best to handle this.
I believe the benefit of trying to run the code even when ack timed out
is on clean up cases, since we use this force_wake on a broader scope,
rather then only around the specific mmio calls that we really need it.
So, I'm okay with make this proposed flow as the rule, on always
checking the force_wake_get result before doing operations.
But maybe on top of that it would be worth to scrutinize a bit
the callers and see if we can/should reduce the scope of impact.
>
>
> Thanks,
>
> Nirmoy
>
> >
> >
> > Lucas De Marchi
> >
> > >
> > > Regards,
> > >
> > > Nirmoy
> > >
> > > >
> > > > what about making the flow more intuitive like this:
> > > >
> > > > bool xe_force_wake_get(fw, d);
> > > > void xe_force_wake_put(fw, d);
> > > >
> > > > if (xe_force_wake_get(fw, d)) {
> > > > ...
> > > > xe_force_wake_put(fw, d);
> > > > }
> > > >
> > > > > }
> > > > > void xe_reg_sr_apply_whitelist(struct xe_hw_engine *hwe)
> > > > > @@ -253,13 +251,11 @@ void xe_reg_sr_apply_whitelist(struct
> > > > > xe_hw_engine *hwe)
> > > > > xe_mmio_write32(gt,
> > > > > RING_FORCE_TO_NONPRIV(mmio_base, slot), addr);
> > > > > }
> > > > > - err = xe_force_wake_put(>->mmio.fw, XE_FORCEWAKE_ALL);
> > > > > - XE_WARN_ON(err);
> > > > > -
> > > > > - return;
> > > > > -
> > > > > err_force_wake:
> > > > > - drm_err(&xe->drm, "Failed to apply, err=%d\n", err);
> > > > > + if (err)
> > > > > + xe_gt_err(gt, "Failed to whitelist %s registers, err=%d\n",
> > > > > + sr->name, err);
> > > > > + XE_WARN_ON(xe_force_wake_put(>->mmio.fw, XE_FORCEWAKE_ALL));
> > > > > }
> > > > > /**
More information about the Intel-xe
mailing list