[PATCH 1/7] PCI: Restore BARs on runtime resume despite being unbound

Rafael J. Wysocki rjw at rjwysocki.net
Wed Feb 21 12:39:34 UTC 2018


On Wednesday, February 21, 2018 10:57:14 AM CET Rafael J. Wysocki wrote:
> On Tue, Feb 20, 2018 at 10:29 PM, Bjorn Helgaas <helgaas at kernel.org> wrote:
> > On Sun, Feb 18, 2018 at 09:38:32AM +0100, Lukas Wunner wrote:
> >> PCI devices not bound to a driver are supposed to stay in D0 during
> >> runtime suspend.
> >
> > Doesn't "runtime suspend" mean an individual device is suspended while
> > the rest of the system remains active?
> >
> > If so, maybe it would be more direct to say "PCI devices not bound to
> > a driver cannot be runtime suspended"?
> >
> > (It's a separate question not relevant to this patch, but I'm not
> > convinced that "PCI devices without a driver cannot be suspended"
> > should be accepted as a rule.  If it is a rule, we should be able to
> > deduce it from the specs.)
> >
> >> But they may have a parent which is bound and can be
> >> transitioned to D3cold at runtime.  Once the parent goes to D3cold, the
> >> unbound child may go to D3cold as well.  When the child comes out of
> >> D3cold, its BARs are uninitialized and thus inaccessible when a driver
> >> tries to probe.
> >>
> >> One example are recent hybrid graphics laptops which cut power to the
> >> discrete GPU when the root port above it goes to ACPI power state D3.
> >> Users may provoke this by unbinding the GPU driver and allowing runtime
> >> PM on the GPU via sysfs:  The PM core will then treat the GPU as
> >> "suspended", which in turn allows the root port to runtime suspend,
> >> causing the power resources listed in its _PR3 object to be powered off.
> >> The GPU's BARs will be uninitialized when a driver later probes it.
> >>
> >> Another example are hybrid graphics laptops where the GPU itself (rather
> >> than the root port) is capable of runtime suspending to D3cold.  If the
> >> GPU's integrated HDA controller is not bound and the GPU's driver
> >> decides to runtime suspend to D3cold, the HDA controller's BARs will be
> >> uninitialized when a driver later probes it.
> >>
> >> Fix by restoring the BARs on runtime resume if the device is not bound.
> >> This is sufficient to fix the above-mentioned use cases.  Other use
> >> cases might require a full-blown pci_save_state() / pci_restore_state()
> >> or execution of fixups.  We can add that once use cases materialize,
> >> let's not inflate the code unnecessarily.
> >
> > Why would we not do a full-blown restore?  With this patch, I think
> > some configuration done during enumeration, e.g., ASPM and MPS, will
> > be lost.  "lspci -vv" of the HDA before and after the suspend may show
> > different things, which seems counter-intuitive.
> 
> Right.
> 
> > I wouldn't think of a full-blown restore as "inflating the code"; I
> > would think of it as "having fewer special cases", i.e., we always use
> > the same restore path instead of having the main one plus a special
> > one for unbound devices.
> 
> That is a fair point, but if you look at pci_pm_runtime_suspend(), it
> doesn't actually save anything for devices without drivers, so we'd
> just restore the original initial state for them every time.
> 
> If we are to restore the entire state on runtime resume,
> pci_pm_runtime_suspend() should be changed to call pci_save_state()
> before returning 0 in the !dev->driver case AFAICS.
> 
> >> Cc: Bjorn Helgaas <bhelgaas at google.com>
> >> Cc: Rafael J. Wysocki <rafael.j.wysocki at intel.com>
> >> Signed-off-by: Lukas Wunner <lukas at wunner.de>
> >> ---
> >>  drivers/pci/pci-driver.c | 8 ++++++--
> >>  drivers/pci/pci.c        | 2 +-
> >>  drivers/pci/pci.h        | 1 +
> >>  3 files changed, 8 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
> >> index 3bed6beda051..51b11cbd48f6 100644
> >> --- a/drivers/pci/pci-driver.c
> >> +++ b/drivers/pci/pci-driver.c
> >> @@ -1277,10 +1277,14 @@ static int pci_pm_runtime_resume(struct device *dev)
> >>
> >>       /*
> >>        * If pci_dev->driver is not set (unbound), the device should
> >> -      * always remain in D0 regardless of the runtime PM status
> >> +      * always remain in D0 regardless of the runtime PM status.
> >> +      * But if its parent can go to D3cold, this device may have
> >> +      * been in D3cold as well and require restoration of its BARs.
> >>        */
> >> -     if (!pci_dev->driver)
> >> +     if (!pci_dev->driver) {
> >> +             pci_restore_bars(pci_dev);
> >
> > If we do decide not to do a full-blown restore, how do we decide
> > whether to use pci_restore_bars() or pci_restore_config_space()?
> >
> > I'm not sure why we have both.
> 
> Me neither.
> 
> > The pci_restore_bars() path looks a
> > little smarter in that it is more careful when updating 64-bit BARs
> > that can't be updated atomically.
> >
> >>               return 0;
> >> +     }
> >>
> >>       if (!pm || !pm->runtime_resume)
> >>               return -ENOSYS;
> 
> So if pci_pm_runtime_suspend() is modified to call pci_save_state()
> before returning 0 in the !dev->driver case, we can just move the
> pci_restore_standard_config() invocation in pci_pm_runtime_resume() up
> to the very top and check dev->driver later.

I mean something like the patch below, overall (untested).

Tentatively-signed-off-by: Rafael J. Wysocki <rafael.j.wysocki at intel.com>
---
 drivers/pci/pci-driver.c |   16 ++++++++++------
 1 file changed, 10 insertions(+), 6 deletions(-)

Index: linux-pm/drivers/pci/pci-driver.c
===================================================================
--- linux-pm.orig/drivers/pci/pci-driver.c
+++ linux-pm/drivers/pci/pci-driver.c
@@ -1224,11 +1224,14 @@ static int pci_pm_runtime_suspend(struct
 	int error;
 
 	/*
-	 * If pci_dev->driver is not set (unbound), the device should
-	 * always remain in D0 regardless of the runtime PM status
+	 * If pci_dev->driver is not set (unbound), the device may go to D3cold,
+	 * but only if the bridge above it is suspended.  In case that happens,
+	 * save its config space.
 	 */
-	if (!pci_dev->driver)
+	if (!pci_dev->driver) {
+		pci_save_state(pci_dev);
 		return 0;
+	}
 
 	if (!pm || !pm->runtime_suspend)
 		return -ENOSYS;
@@ -1276,16 +1279,17 @@ static int pci_pm_runtime_resume(struct
 	const struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL;
 
 	/*
-	 * If pci_dev->driver is not set (unbound), the device should
-	 * always remain in D0 regardless of the runtime PM status
+	 * Regardless of whether or not the driver is there, the device might
+	 * have been put into D3cold as a result of suspending the bridge above
+	 * it, so restore the config spaces of all devices here.
 	 */
+	pci_restore_standard_config(pci_dev);
 	if (!pci_dev->driver)
 		return 0;
 
 	if (!pm || !pm->runtime_resume)
 		return -ENOSYS;
 
-	pci_restore_standard_config(pci_dev);
 	pci_fixup_device(pci_fixup_resume_early, pci_dev);
 	pci_enable_wake(pci_dev, PCI_D0, false);
 	pci_fixup_device(pci_fixup_resume, pci_dev);



More information about the dri-devel mailing list