[Intel-gfx] [PATCH 07/10] drm/i915: add support for checking if we hold an RPM reference

Imre Deak imre.deak at intel.com
Tue Dec 15 14:52:34 PST 2015


On Tue, 2015-12-15 at 21:07 +0000, Chris Wilson wrote:
> On Tue, Dec 15, 2015 at 08:10:35PM +0200, Imre Deak wrote:
> > Atm, we assert that the device is not suspended until the point
> > when the
> > device is truly put to a suspended state. This is fine, but we can
> > catch
> > more problems if we check that RPM refcount is non-zero. After that
> > one
> > drops to zero we shouldn't access the device any more, even if the
> > actual
> > device suspend may be delayed. Change assert_rpm_wakelock_held()
> > accordingly to check for a non-zero RPM refcount in addition to the
> > current device-not-suspended check.
> > 
> > For the new asserts to work we need to annotate every place
> > explicitly in
> > the code where we expect that the device is powered. The places
> > where we
> > only assume this, but may not hold an RPM reference:
> > - driver load
> >   We assume the device to be powered until we enable RPM. Make this
> >   explicit by taking an RPM reference around the load function.
> > - system and runtime sudpend/resume handlers
> >   These handlers are called when the RPM reference becomes 0 and
> > know the
> >   exact point after which the device can get powered off. Disable
> > the
> >   RPM-reference-held check for their duration.
> > - the IRQ, hangcheck and RPS work handlers
> >   These handlers are flushed in the system/runtime suspend handler
> >   before the device is powered off, so it's guaranteed that they
> > won't
> >   run while the device is powered off even though they don't hold
> > any
> >   RPM reference. Disable the RPM-reference-held check for their
> > duration.
> 
> My current thinking is that the hangcheck/RPS tasks are wrong - and
> that
> we do actually have explicit wakerefs that should cover their
> lifetimes
> (but we fail to actually terminate them when we drop the associated
> wakeref).
> 
> With respect to the current state (cancelling the work in
> rpm_suspend),
> the assert disabling is correct, but I think we should be indicating
> that we papering over a "bug" more strongly.
> 
> i.e. something like DISABLE_RPM_WAKEREF_ASSERT();

But the other cases are still legitimate, so we'd keep the lower case
name for those and define the above macro as an alias simply to
emphasize the difference?

> -Chris
> 


More information about the Intel-gfx mailing list