[RFC 03/34] drm/xe: Fix display runtime_pm handling

Rodrigo Vivi rodrigo.vivi at intel.com
Thu Feb 15 22:19:05 UTC 2024


On Thu, Feb 15, 2024 at 09:30:08AM +0000, Matthew Auld wrote:
> On 14/02/2024 18:05, Rodrigo Vivi wrote:
> > On Mon, Feb 05, 2024 at 09:11:37AM +0000, Matthew Auld wrote:
> > > On 26/01/2024 20:30, Rodrigo Vivi wrote:
> > > > i915's intel_runtime_pm_get_if_in_use actually calls the
> > > > pm_runtime_get_if_active() with ign_usage_count = false, but Xe
> > > > was erroneously calling it with true because of the mem_access cases.
> > >
> > > Good catch.
> > >
> > > > This can lead to unbalanced references.
> > >
> > > Is there an actual imbalance here though? Is it not just a case of being
> > > overzealous in keeping the device awake when it is not currently "in_use" vs
> > > "if_active"? If the api increments the usage count we will still decrement
> > > it later, regardless of active vs in-use, AFAICT.
> >
> > Indeed.
> >
> > s/This can lead to unbalanced references./This can lead to unnecessary
> > references getting hold here and device never getting into the runtime
> > suspended state./g
> >
> > >
> > > >
> > > > Let's use directly the 'if_in_use' function provided by linux/pm_runtime.
> > > >
> > > > Also, already start this new function protected from the runtime
> > > > recursion, since runtime_pm will need to call for display functions
> > > > for a proper D3Cold flow.
> > > >
> > > > Signed-off-by: Rodrigo Vivi <rodrigo.vivi at intel.com>
> > > > ---
> > > >    .../gpu/drm/xe/compat-i915-headers/i915_drv.h   |  2 +-
> > > >    drivers/gpu/drm/xe/xe_pm.c                      | 17 +++++++++++++++++
> > > >    drivers/gpu/drm/xe/xe_pm.h                      |  1 +
> > > >    3 files changed, 19 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/drivers/gpu/drm/xe/compat-i915-headers/i915_drv.h b/drivers/gpu/drm/xe/compat-i915-headers/i915_drv.h
> > > > index 420eba0e4be0..ad5864d1dd74 100644
> > > > --- a/drivers/gpu/drm/xe/compat-i915-headers/i915_drv.h
> > > > +++ b/drivers/gpu/drm/xe/compat-i915-headers/i915_drv.h
> > > > @@ -177,7 +177,7 @@ static inline intel_wakeref_t intel_runtime_pm_get_if_in_use(struct xe_runtime_p
> > > >    {
> > > >    	struct xe_device *xe = container_of(pm, struct xe_device, runtime_pm);
> > > > -	return xe_pm_runtime_get_if_active(xe);
> > > > +	return xe_pm_runtime_get_if_in_use(xe);
> > > >    }
> > > >    static inline void intel_runtime_pm_put_unchecked(struct xe_runtime_pm *pm)
> > > > diff --git a/drivers/gpu/drm/xe/xe_pm.c b/drivers/gpu/drm/xe/xe_pm.c
> > > > index bd35fe9f6227..19f88cb7715b 100644
> > > > --- a/drivers/gpu/drm/xe/xe_pm.c
> > > > +++ b/drivers/gpu/drm/xe/xe_pm.c
> > > > @@ -417,6 +417,23 @@ int xe_pm_runtime_get_if_active(struct xe_device *xe)
> > > >    	return pm_runtime_get_if_active(xe->drm.dev, true);
> > > >    }
> > > > +/**
> > > > + * xe_pm_runtime_get_if_in_use - Get a runtime_pm reference and resume if needed
> > > > + * @xe: xe device instance
> > > > + *
> > > > + * Returns: True if device is awake and the reference was taken, false otherwise.
> > > > + */
> > > > +bool xe_pm_runtime_get_if_in_use(struct xe_device *xe)
> > > > +{
> > > > +	if (xe_pm_read_callback_task(xe) == current) {
> > > > +		/* The device is awake, grab the ref and move on */
> > > > +		pm_runtime_get_noresume(xe->drm.dev);
> > > > +		return true;
> > > > +	}
> > > > +
> > > > +	return pm_runtime_get_if_in_use(xe->drm.dev) >= 0;
> > >
> > > This is doing atomic_inc_not_zero() underneath for the "in_use" case AFAICT.
> > > If the usage count is zero it doesn't increment it and returns 0. Does that
> > > not lead to an imbalance? Should this rather be > 0?
>
> ^^ did you see the comment here also?

oh, I had missed this comment, I'm sorry!
And good catch as well... maybe this will even solve some sporadic remaining underflow
that I was seeing...

but not just here but also in the if_active version as well.

doc for pm_runtime_get_if_active():
"
 * The caller is responsible for decrementing the runtime PM usage counter of
 * @dev after this function has returned a positive value for it.
"

Thanks a lot,
Rodrigo.

>
> > >
> > > > +}
> > > > +
> > > >    /**
> > > >     * xe_pm_assert_unbounded_bridge - Disable PM on unbounded pcie parent bridge
> > > >     * @xe: xe device instance
> > > > diff --git a/drivers/gpu/drm/xe/xe_pm.h b/drivers/gpu/drm/xe/xe_pm.h
> > > > index 64a97c6726a7..9d372cbf388b 100644
> > > > --- a/drivers/gpu/drm/xe/xe_pm.h
> > > > +++ b/drivers/gpu/drm/xe/xe_pm.h
> > > > @@ -28,6 +28,7 @@ int xe_pm_runtime_resume(struct xe_device *xe);
> > > >    int xe_pm_runtime_get(struct xe_device *xe);
> > > >    int xe_pm_runtime_put(struct xe_device *xe);
> > > >    int xe_pm_runtime_get_if_active(struct xe_device *xe);
> > > > +bool xe_pm_runtime_get_if_in_use(struct xe_device *xe);
> > > >    void xe_pm_assert_unbounded_bridge(struct xe_device *xe);
> > > >    int xe_pm_set_vram_threshold(struct xe_device *xe, u32 threshold);
> > > >    void xe_pm_d3cold_allowed_toggle(struct xe_device *xe);


More information about the Intel-xe mailing list