[PATCH 4/4] drm/nouveau/acpi: fix lockup with PCIe runtime PM

Mika Westerberg mika.westerberg at linux.intel.com
Mon May 30 09:57:09 UTC 2016


+Rafael

On Fri, May 27, 2016 at 01:10:37PM +0200, Peter Wu wrote:
> On Wed, May 25, 2016 at 04:55:35PM +0300, Mika Westerberg wrote:
> > On Wed, May 25, 2016 at 12:53:01AM +0200, Peter Wu wrote:
> > > Since "PCI: Add runtime PM support for PCIe ports", the parent PCIe port
> > > can be runtime-suspended which disables power resources via ACPI. This
> > > is incompatible with DSM, resulting in a GPU device which is still in D3
> > > and locks up the kernel on resume.
> > > 
> > > Mirror the behavior of Windows 8 and newer[1] (as observed via an AMLi
> > > debugger trace) and stop using the DSM functions for D3cold when power
> > > resources are available on the parent PCIe port.
> > > 
> > >  [1]: https://msdn.microsoft.com/windows/hardware/drivers/bringup/firmware-requirements-for-d3cold
> > > 
> > > Signed-off-by: Peter Wu <peter at lekensteyn.nl>
> > > ---
> > >  drivers/gpu/drm/nouveau/nouveau_acpi.c | 34 ++++++++++++++++++++++++++++++----
> > >  1 file changed, 30 insertions(+), 4 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/nouveau/nouveau_acpi.c b/drivers/gpu/drm/nouveau/nouveau_acpi.c
> > > index df9f73e..e469df7 100644
> > > --- a/drivers/gpu/drm/nouveau/nouveau_acpi.c
> > > +++ b/drivers/gpu/drm/nouveau/nouveau_acpi.c
> > > @@ -46,6 +46,7 @@ static struct nouveau_dsm_priv {
> > >  	bool dsm_detected;
> > >  	bool optimus_detected;
> > >  	bool optimus_flags_detected;
> > > +	bool optimus_skip_dsm;
> > >  	acpi_handle dhandle;
> > >  	acpi_handle rom_handle;
> > >  } nouveau_dsm_priv;
> > > @@ -212,8 +213,26 @@ static const struct vga_switcheroo_handler nouveau_dsm_handler = {
> > >  	.get_client_id = nouveau_dsm_get_client_id,
> > >  };
> > >  
> > > +/* Firmware supporting Windows 8 or later do not use _DSM to put the device into
> > > + * D3cold, they instead rely on disabling power resources on the parent. */
> > > +static bool nouveau_pr3_present(struct pci_dev *pdev)
> > > +{
> > > +	struct pci_dev *parent_pdev = pci_upstream_bridge(pdev);
> > > +	struct acpi_device *ad;
> > 
> > Nit: please call this adev instead of ad.
> 
> Will do.
> 
> > > +
> > > +	if (!parent_pdev)
> > > +		return false;
> > > +
> > > +	ad = ACPI_COMPANION(&parent_pdev->dev);
> > > +	if (!ad)
> > > +		return false;
> > > +
> > > +	return ad->power.flags.power_resources;
> > 
> > Is this sufficient to tell if the parent device has _PR3? I thought it
> > returns true if it has power resources in general, not necessarily _PR3.
> > 
> > Otherwise this looks okay to me.
> 
> It is indeed set whenever there is any _PRx method. I wonder if it is
> appropriate to access fields directly like this, perhaps this would be
> more accurate (based on device_pm.c):
> 
>     /* Check whether the _PR3 method is available. */
>     return adev->power.states[ACPI_STATE_D3_COLD].flags.valid;
> 
> I am also considering adding a check in case the pcieport driver does
> not support D3cold via runtime PM, what do you think of this?
> 
>     if (!parent_pdev)
>         return false;
>     /* If the PCIe port does not support D3cold via runtime PM, allow a
>      * fallback to the Optimus DSM method to put the device in D3cold. */
>     if (parent_pdev->no_d3cold)
>         return false;
> 
> This is needed to avoid the regression reported in the cover letter, but
> also allows pre-2015 systems to (still) have the D3cold possibility.

The _DSM method with 0 as index parameter should return a bit field
telling which functions are supported. Sane BIOS disables that
particular function if it detects Windows 8 and newer. Have you checked
if that's the case?

Then you can call _DSM only if it is supported and otherwise expect the
parent device's power resources to turn off power when runtime
suspended.

> Out of curiosity I looked up an pre-2015 laptop (found Acer V5-573G,
> apparently from November 2013, Windows 8.1) and extracted the ACPI
> tables from the BIOS images. BIOS 2.28 (2014/05/13) introduces support
> for power resources on the parent devicea(\_SB.PCI0.PEG0._PR3 and a
> related NVP3 device) when _OSI("Windows 2013") is true. (This is added
> as alternative for the old DSM interface.)
> 
> Maybe 2014 is also an appropriate cutoff date? I wonder if it is
> feasible to detect firmware use of _OSI("Windows 2013") and use that
> instead of the BIOS year.

Using BIOS year works even if there is no ACPI available.

What comes to the cutoff date, I discussed with Rafael and it was
decided that we use the same year Windows 10 was released to be on the
safe side. Reading the links you provided here:

https://msdn.microsoft.com/fi-fi/windows/hardware/drivers/bringup/device-power-management
https://msdn.microsoft.com/en-us/library/windows/hardware/hh967709(v=vs.85).aspx

it seems that from Windows 8 they started transitioning devices into
D3cold during runtime as well.


More information about the dri-devel mailing list