[Intel-gfx] [PATCH v2 3/4] drm/i915: make assert_device_not_suspended more precise

Imre Deak imre.deak at intel.com
Tue Nov 10 01:47:19 PST 2015


On Mon, 2015-11-09 at 21:44 +0000, Chris Wilson wrote:
> On Mon, Nov 09, 2015 at 09:13:45PM +0200, Imre Deak wrote:
> > Atm, we assert that the device is not suspended after the point when the
> > HW is truly put to a suspended state. This is fine, but we can catch
> > more problems if we check the RPM refcount. After that one drops to zero
> > we shouldn't access the HW any more, although the actual suspend may be
> > delayed. The only complication is that we want to avoid asserts while
> > the suspend handler itself is running, so add a flag to handle this case.
> > 
> > While at it remove the HAS_RUNTIME_PM check, the pm.suspended flag is
> > false and the RPM refcount is non-zero on all platforms that don't
> > support RPM.
> > 
> > This caught additional WARNs from the atomic path, those will be fixed
> > as a follow-up.
> > 
> > v2:
> > - remove the redundant HAS_RUNTIME_PM check (Ville)
> > 
> > Signed-off-by: Imre Deak <imre.deak at intel.com>
> > ---
> > --- a/drivers/gpu/drm/i915/intel_runtime_pm.c
> > +++ b/drivers/gpu/drm/i915/intel_runtime_pm.c
> > @@ -2120,8 +2120,18 @@ void intel_power_domains_init_hw(struct drm_i915_private *dev_priv, bool resume)
> >  
> >  void assert_device_not_suspended(struct drm_i915_private *dev_priv)
> >  {
> > -	WARN_ONCE(HAS_RUNTIME_PM(dev_priv->dev) && dev_priv->pm.suspended,
> > -		  "Device suspended\n");
> > +	int rpm_usage;
> > +
> > +	if (dev_priv->pm.disable_suspended_assert)
> > +		return;
> > +
> > +#ifdef CONFIG_PM
> > +	rpm_usage = atomic_read(&dev_priv->dev->dev->power.usage_count);
> > +#else
> > +	rpm_usage = 1;
> > +#endif
> 
> Whilst this should fix the issue I was worried about, I think for e.g.
> the GGTT PTE access, we should be checking that we have a rpm ref (i.e.
> we have called intel_runtime_pm_get()).

Right, didn't think of that. That way we also don't have to access RPM
internals.
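
To illustrate the idea, here is a minimal userspace sketch of a
driver-local wakeref counter, so that the assert doesn't need to look at
power.usage_count. This is only a model under assumptions, not the
actual patch: struct rpm_model and all model_* names are hypothetical,
and in the driver the get/put would of course wrap the real runtime PM
calls.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the relevant part of drm_i915_private. */
struct rpm_model {
	atomic_int wakeref_count;	/* assumed driver-local debug counter */
	bool suspended;			/* plays the role of pm.suspended */
};

/* Models intel_runtime_pm_get(): the extra atomic inc for debugging. */
static void model_rpm_get(struct rpm_model *rpm)
{
	atomic_fetch_add(&rpm->wakeref_count, 1);
}

/* Models intel_runtime_pm_put(): the matching atomic dec. */
static void model_rpm_put(struct rpm_model *rpm)
{
	int old = atomic_fetch_sub(&rpm->wakeref_count, 1);

	/* A put without a matching get is a bug in the caller. */
	assert(old > 0);
}

/* True if HW access is allowed: not suspended and a ref is held. */
static bool model_wakelock_held(struct rpm_model *rpm)
{
	return !rpm->suspended && atomic_load(&rpm->wakeref_count) > 0;
}

/* Suspend handler model: refuses to set suspended while a ref is held. */
static bool model_try_suspend(struct rpm_model *rpm)
{
	if (atomic_load(&rpm->wakeref_count) != 0)
		return false;

	rpm->suspended = true;
	return true;
}
```

With such a counter the check no longer depends on whether the suspend
is delayed, and the suspend handler itself never holds a ref, so in this
model no separate flag is needed to silence the assert there.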

> Bonus points if we can narrow
> that down to being inside an rpm critical section (made tricky because
> the wakelocks can nest :(. The simplest way does impose an extra atomic
> inc/dec simply for debug purposes, on the other hand it shouldn't then
> need the pm.disable_suspended_assert and you can have an extra assert
> that we cannot set pm.suspended whilst intel_runtime_pm_get() is held.

Ok, this was interesting and made me look at the current users of RPM.
We have two distinct use cases with different semantics. In the first we
take an RPM ref for a prolonged time and don't necessarily release it
in the same context (thread), for example while the GT or any display
outputs are active. The second is for short tasks like programming the
GGTT PTE, or anything else where we wrap something between a get and a
put; here the ref is released from the same context that took it. These
two use cases are worth making explicit in the API imo, both for
documentation and for additional sanity checking. In the prolonged case
we can only check that the refcount doesn't get unbalanced globally
(since the get and put can happen in different contexts). In the
short-term case we can additionally check for the critical section
guarantee, plus that there is no imbalance from each context's POV. All
of these additional checks can be provided by lockdep.
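
A rough userspace model of the two kinds of reference and the checks
each one allows is below. It does not use lockdep; it only imitates the
per-context bookkeeping with a thread-local counter, and every name in
it (rpm_get_longterm(), rpm_get_short(), in_rpm_critical_section(), ...)
is made up for illustration rather than taken from the driver.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

static atomic_int global_refs;			/* all refs, both kinds */
static _Thread_local int ctx_short_refs;	/* short refs held by this thread */

/* Drop one global ref; returns false on an unbalanced put. */
static bool put_global(void)
{
	int old = atomic_load(&global_refs);

	do {
		if (old <= 0)
			return false;
	} while (!atomic_compare_exchange_weak(&global_refs, &old, old - 1));

	return true;
}

/* Prolonged ref: may be released from a different context. */
static void rpm_get_longterm(void)
{
	atomic_fetch_add(&global_refs, 1);
}

/* Only the global balance can be checked for prolonged refs. */
static bool rpm_put_longterm(void)
{
	return put_global();
}

/* Short ref: get and put are expected in the same context. */
static void rpm_get_short(void)
{
	atomic_fetch_add(&global_refs, 1);
	ctx_short_refs++;	/* a count, so nested sections work */
}

/* Checks the per-context balance first, then the global one. */
static bool rpm_put_short(void)
{
	if (ctx_short_refs <= 0)
		return false;	/* this context holds no short ref */

	ctx_short_refs--;
	return put_global();
}

/* True only while this context is between a short get and its put. */
static bool in_rpm_critical_section(void)
{
	return ctx_short_refs > 0;
}
```

Note that a prolonged ref alone does not make in_rpm_critical_section()
true in this model, which is the distinction between the two use cases:
only the short-term variant can give a per-context guarantee.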


> Otherwise, yeah just rename this to imply we aren't just checking that
> the device isn't suspended right now, but cannot be.

Ok, I pushed all the fixes for the review comments so far, with the
lockdep thing on top, to:
https://github.com/ideak/linux/commits/rpm-assert-improvements

I can post these if there are no objections.

--Imre


> -Chris
> 

