[Intel-gfx] [RFC 0/6] Haswell runtime PM support + D3

Mon Oct 28 21:02:45 CET 2013

On Mon, 28 Oct 2013 17:10:02 +0100
Daniel Vetter <daniel at ffwll.ch> wrote:

> On Mon, Oct 28, 2013 at 03:09:11PM +0200, Imre Deak wrote:
> > Hi,
> > 
> > On Tue, 2013-10-22 at 17:30 -0200, Paulo Zanoni wrote:
> > > From: Paulo Zanoni <paulo.r.zanoni at intel.com>
> > > 
> > > Hi
> > > 
> > > This RFC series adds runtime PM support on Haswell. The current implementation
> > > puts the device in the PCI D3cold state when we decide to sleep. It uses the
> > > same refcount+timeout idea from the PC8 code, but now through the Kernel runtime
> > > PM infrastructure.
> > > 
> > > I saw Jesse and Imre are still playing with the power wells on Baytrail, so I
> > > thought maybe they would take some time to do the whole D3 + runtime_pm thing,
> > > so I decided to ressurrect some code I had, test it and send anyway.
> > > 
> > > The basic idea of this series is that it adds some functions that should be
> > > reused when we add support for other platforms, but OTOH it completely relies on
> > > the already-implemented PC8 support for saving+restoring everything on Haswell.
> > > So even though the code is ready for Haswell, it's not ready for !Haswell.
> > > 
> > > So we'll only reach the zero refcount after we enable PC8, and we'll increment
> > > the refcount just before disabling PC8. I thought we would need more register
> > > save-restore code, but it seems the current save-restore code from PC8 is enough
> > > to keep everything going. All I needed was to add some more put/get calls to
> > > keep the device awake when needed.
> > > 
> > > Now here's the reason why this is still an RFC: I just added put/get calls to
> > > the places I spotted problems while doing my basic tests. I basically tested
> > > runtime PM while running Gnome and Xfce, and I also tested sysfs and debugfs. I
> > > still didn't write the full test suite I've been promising to write: this will
> > > be my next step. OTOH, the runtime PM support is disabled by default, so merging
> > > these patches shouldn't really hurt.
> > > 
> > > But our driver has so many entry points that I just don't think we'll ever be
> > > confident that we always wake up from D3 when needed, and I don't think we'll be
> > > able to write a single test case that tests everything. So I fear this feature
> > > will be just like the "unclaimed register" checking code: we'll be stuck on a
> > > break-bisect-fix loop forever. And considering most people won't ever run the
> > > runtime PM code, we may stay long periods of time without noticing the broken
> > > state.
> > > 
> > > When we add support for BYT and other platforms we will have to rearrange the
> > > PC8 code so that we move some things out of it to some common PM infrastructure
> > > (i.e, dev_priv->pm). The first things I can think of are the code that counts
> > > the current number of used CRTCs and the code that handles pc8.gpu_idle.
> > > 
> > > The interfaces for configuring the runtime PM support are the standard runtime
> > > PM interfaces. Go to /sys/bus/pci/devices/0000\:00\:02.0/power/ and have some
> > > fun. I suggest you to "echo 5 > autosuspend_delay_ms" and boot with
> > > "i915.pc8_timeout=5" if you really want to test things.
> > > 
> > > Another thing which we could discuss is if we really want to keep PC8 and D3 as
> > > separate features. Today they're separate features, where D3 is a "deeper" step
> > > that can be disabled while PC8 is still enabled. This creates the problem that
> > > we need to separately test/validate PC8-only and PC8+D3 (and possibly D3-only in
> > > the future, depending on how we rearrange code). Since I can't really think of a
> > > case where we would want PC8 but not D3, maybe we should kill the "PC8 without
> > > D3" option.
> > > 
> > > Opinions? Bikesheds? Does this really looks like it can be resued on BYT? Jesse?
> > > Imre?
> > 
> > I read through the patchset, in general it looks good to me. I agree it
> > would make sense to merge already now at least the parts that would
> > enable others to sort out platform specific details like HW
> > state/restore for !HSW.
> > 
> > I'm also wondering where to place the get/put calls, here is my 2 cents
> > about this, please correct me where I'm wrong: With RPM we want to
> > control whether the whole device is in D0 or D3/D3_cold state, which is
> > the coarsest granularity. While in D0 we also need a finer granularity
> > control to gate as many clocks and power domains as we can (while others
> > remain active preventing D3/D3_cold). For this we enable/disable PC8/RC6
> > to let the HW do automatic gating, or we do this manually for example
> > through power domains control.
> > 
> > If this is correct we should try our best to only do fine grain control
> > at the highest level and do RPM get/put only as a consequence. For
> > example modeset code would only enable/disable whatever power domains
> > are needed for the current mode and the power domain framework would do
> > the RPM get/put. Your PC8 changes in your patchset are in accordance
> > with this: at the highest level we only enable/disable PC8 which will do
> > an RPM get/put if needed. Debugfs and other places where we need
> > register access: atm you do only an RPM get/put around these, but this
> > just ensures we are in D0, not that we have the needed power
> > domains/clocks enabled. I guess this works for HSW, but it may not work
> > on other platforms. One solution could be to do the same here as for
> > modeset, we just enable/disable clocks and power domains around these
> > places as needed and adjust the RPM refcount only as a consequence.
> 
> I haven't read through the patches, but overall we should only get/put the
> most fine-grained power domain possible. If there's nesting going on (e.g.
> nested power wells, or different runtime pm states like pc8/D3/...) then
> the fine-grained power domains should in turn grab references for the next
> level up. So if someone grabs the aux display references to do edid reads
> (which on hsw disables pc8), this will also block D3. Same if we need one
> of the display power wells, that should in turn prevent pc8 (again on
> hsw).
> 
> I know that the current code isn't there at all. But long-term this should
> be the direction imo, otherwise we'll have a giant mess between the
> runtime pm handling and the get/put callsites.

Yep, agreed.  Fortunately Paulo has tested this and identified where
the get/puts belong, so now it's mainly a matter of figuring out which
power well needs to be referenced, rather than referencing the whole
device.

On other platforms we may need to add more get/puts if we have
new/different power wells, but let's start with HSW and work from there.

-- 
Jesse Barnes, Intel Open Source Technology Center