[PATCH 1/2] drm/amdgpu: fix power state when port pm is unavailable

Peter Wu peter at lekensteyn.nl
Sat Nov 26 14:14:07 UTC 2016


Found the problem.

amdgpu_atpx_pci_probe_handle is used to detect the ATPX handle, but this
does not necessarily have to be on the dGPU. In fact, on your machine it
is located on the Intel iGPU (00:02.0):

[  134.760509] vga_switcheroo: detected switching method \_SB_.PCI0.GFX0.ATPX handle

The parent device of the iGPU (00:00.0) is the root port which has no
PCIe capability, hence bridge_d3 evaluated to false.

Can you try the attached patch?

Kind regards,
Peter

On Sat, Nov 26, 2016 at 01:49:08PM +0000, Mike Lothian wrote:
> I can confirm the patch allows my machine to boot again
> 
> On Sat, 26 Nov 2016 at 11:39 Peter Wu <peter at lekensteyn.nl> wrote:
> 
> > Hi Mike,
> >
> > To grab console output, you could also try to blacklist amdgpu at
> > startup and load it later.
> >
> > Can you provide your acpidump (and lspci -nn, especially if you do not
> > have a dmesg).
> >
> > This is the call chain:
> >
> > + amdgpu_atpx_detect
> >   + amdgpu_atpx_pci_probe_handle
> >     + return if ATPX ACPI handle cannot be found
> >     + set amdgpu_atpx_priv.bridge_pm_usable
> >   + amdgpu_atpx_init
> >     + amdgpu_atpx_validate
> >       + use amdgpu_atpx_priv.bridge_pm_usable
> >
> > I wonder whether newer devices remove ATPX completely. If that is the
> > case (please provide acpidump!), can you try this hack:
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_atpx_handler.c
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_atpx_handler.c
> > index 951addf..5c23382 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_atpx_handler.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_atpx_handler.c
> > @@ -487,6 +487,7 @@ static bool amdgpu_atpx_pci_probe_handle(struct
> > pci_dev *pdev)
> >         status = acpi_get_handle(dhandle, "ATPX", &atpx_handle);
> >         if (ACPI_FAILURE(status)) {
> >                 amdgpu_atpx_priv.other_handle = dhandle;
> > +               amdgpu_atpx_priv.bridge_pm_usable = true;
> >                 return false;
> >         }
> >         amdgpu_atpx_priv.dhandle = dhandle;
> >
> > This is not intended for merging, but just testing a hypothesis.
> >
> > Kind regards,
> > Peter
> >
> > On Sat, Nov 26, 2016 at 11:20:28AM +0000, Mike Lothian wrote:
> > > He's some screen shots of the booting screens, apologies they're not that
> > > clear
> > >
> > > https://imgur.com/a/V7mRA
> > >
> > > On Sat, 26 Nov 2016 at 11:00 Mike Lothian <mike at fireburn.co.uk> wrote:
> > >
> > > > Hi
> > > >
> > > > The previous working kernel, was the patch before this - reverting
> > fixes it
> > > >
> > > > My bios date is quite new 08/05/2016
> > > >
> > > > Most of the messages fly straight off the screen, the journal isn't
> > kept
> > > > and the wifi driver also complains, will see if I can get a slow motion
> > > > capture of the errors and screen shot them
> > > >
> > > > Regards
> > > >
> > > > Mike
> > > >
> > > > On Sat, 26 Nov 2016 at 10:20 Peter Wu <peter at lekensteyn.nl> wrote:
> > > >
> > > > Hi Mike,
> > > >
> > > > On Sat, Nov 26, 2016 at 12:36:27AM +0000, Mike Lothian wrote:
> > > > > This patch is preventing my laptop from booting, I'm getting D3 error
> > > > > messages and atom bios stuck messages
> > > >
> > > > Can you post a dmesg and acpidump? What was the previous working
> > kernel?
> > > > I suppose your BIOS date (/sys/class/dmi/id/bios_date) is pre-2015?
> > > > Does booting with runpm=0 work?
> > > >
> > > > > I mistakenly saw the v2 patch and didn't realise it was for radeon
> > not
> > > > > amdgpu - this branch and Linus's tree are currently not booting for
> > me
> > > >
> > > > What do you mean by this?
> > > >
> > > > Kind regards,
> > > > Peter
> > > >
> > > > > On Wed, 23 Nov 2016 at 17:16 Deucher, Alexander <
> > > > Alexander.Deucher at amd.com>
> > > > > wrote:
> > > > >
> > > > > > > -----Original Message-----
> > > > > > > From: Peter Wu [mailto:peter at lekensteyn.nl]
> > > > > > > Sent: Wednesday, November 23, 2016 12:13 PM
> > > > > > > To: Alex Deucher
> > > > > > > Cc: amd-gfx list; Deucher, Alexander; Nayan Deshmukh
> > > > > > > Subject: Re: [PATCH 1/2] drm/amdgpu: fix power state when port
> > pm is
> > > > > > > unavailable
> > > > > > >
> > > > > > > On Wed, Nov 23, 2016 at 12:01:55PM -0500, Alex Deucher wrote:
> > > > > > > >  On Tue, Nov 22, 2016 at 8:22 PM, Peter Wu <
> > peter at lekensteyn.nl>
> > > > > > wrote:
> > > > > > > > > When PCIe port PM is not enabled (system BIOS is pre-2015 or
> > the
> > > > > > > > > pcie_port_pm=off parameter is set), legacy ATPX PM should
> > still
> > > > be
> > > > > > > > > marked as supported. Otherwise the GPU can fail to power on
> > after
> > > > > > > > > runtime suspend. This affected a Dell Inspiron 5548.
> > > > > > > > >
> > > > > > > > > Ideally the BIOS date in the PCI core is lowered to 2013 (the
> > > > first
> > > > > > year
> > > > > > > > > where hybrid graphics platforms using power resources was
> > > > > > introduced),
> > > > > > > > > but that seems more risky at this point and would not solve
> > the
> > > > > > > > > pcie_port_pm=off issue.
> > > > > > > > >
> > > > > > > > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98505
> > > > > > > > > Reported-and-tested-by: Nayan Deshmukh
> > > > > > > <nayan26deshmukh at gmail.com>
> > > > > > > > > Signed-off-by: Peter Wu <peter at lekensteyn.nl>
> > > > > > > > > ---
> > > > > > > > > Hi,
> > > > > > > > >
> > > > > > > > > This patch is already three weeks old. One alternative idea
> > was
> > > > > > lowering
> > > > > > > BIOS
> > > > > > > > > date in PCI core, but as pcie_port_pm=force did not have the
> > > > desired
> > > > > > > effect, I
> > > > > > > > > do not think that this would help though.
> > > > > > > >
> > > > > > > > Thanks for doing this.
> > > > > > > >
> > > > > > > > >
> > > > > > > > > I have also not contacted linux-pci or Mika about lowering
> > the
> > > > year
> > > > > > due to
> > > > > > > the
> > > > > > > > > lack of a good reason.  Might do it later though once ACPICA
> > bug
> > > > > > 1333 is
> > > > > > > sorted
> > > > > > > > > out such that lowering the year actually has benefits for a
> > > > Nvidia
> > > > > > laptop
> > > > > > > (or if
> > > > > > > > > some amdgpu problem can be solved by this).
> > > > > > > > >
> > > > > > > > > Both patches should probably be Cc stable (4.8+), fixing
> > > > > > 2f5af82eeab2 and
> > > > > > > > > b8c9fd5ad4b4 ("track whether if this is a hybrid graphics
> > > > > > platform"). There
> > > > > > > have
> > > > > > > > > been some ifdef 0's and reverts in between, so I was not
> > sure if
> > > > > > adding
> > > > > > > the
> > > > > > > > > Fixes tag is appropriate.
> > > > > > > >
> > > > > > > > I don't think we need to cc stable.  In kernel 4.8 we don't
> > > > attempt to
> > > > > > > > do d3cold at all.  4.8 and older kernels have:
> > > > > > > > c63695cc5e5f685e924e25a8f9555f6e846f1fc6 (drm/amdgpu: work
> > around
> > > > > > > lack
> > > > > > > > of upstream ACPI support for D3cold)
> > > > > > > > which was reverted for 4.9.
> > > > > > >
> > > > > > > That patch was already reverted in 4.8 as far as I can see:
> > > > > > >
> > > > > > > $ git tag --contains c39b487f195b93235ee76384427467786f7bf29f |
> > grep
> > > > v4.8
> > > > > > > v4.8
> > > > > > >
> > > > > > > Can you double-check? v4.8 was the first kernel with D3cold
> > support
> > > > in
> > > > > > > PCI core.
> > > > > >
> > > > > > You are right.  I was thinking d3cold went upstream in 4.9, so
> > yes, we
> > > > > > should apply this to 4.8.
> > > > > >
> > > > > > Alex
> > > > > >
> > > > > > >
> > > > > > > > Series is:
> > > > > > > > Reviewed-by: Alex Deucher <alexander.deucher at amd.com>
> > > > > > >
> > > > > > > Thanks!
> > > > > > >
> > > > > > > Kind regards,
> > > > > > > Peter
> > > > > > >
> > > > > > > >
> > > > > > > > >
> > > > > > > > > Kind regards,
> > > > > > > > > Peter
> > > > > > > > > ---
> > > > > > > > >  drivers/gpu/drm/amd/amdgpu/amdgpu_atpx_handler.c | 9
> > ++++++++-
> > > > > > > > >  1 file changed, 8 insertions(+), 1 deletion(-)
> > > > > > > > >
> > > > > > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_atpx_handler.c
> > > > > > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_atpx_handler.c
> > > > > > > > > index 10b5ddf..951addf 100644
> > > > > > > > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_atpx_handler.c
> > > > > > > > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_atpx_handler.c
> > > > > > > > > @@ -33,6 +33,7 @@ struct amdgpu_atpx {
> > > > > > > > >
> > > > > > > > >  static struct amdgpu_atpx_priv {
> > > > > > > > >         bool atpx_detected;
> > > > > > > > > +       bool bridge_pm_usable;
> > > > > > > > >         /* handle for device - and atpx */
> > > > > > > > >         acpi_handle dhandle;
> > > > > > > > >         acpi_handle other_handle;
> > > > > > > > > @@ -200,7 +201,11 @@ static int amdgpu_atpx_validate(struct
> > > > > > > amdgpu_atpx *atpx)
> > > > > > > > >         atpx->is_hybrid = false;
> > > > > > > > >         if (valid_bits & ATPX_MS_HYBRID_GFX_SUPPORTED) {
> > > > > > > > >                 printk("ATPX Hybrid Graphics\n");
> > > > > > > > > -               atpx->functions.power_cntl = false;
> > > > > > > > > +               /*
> > > > > > > > > +                * Disable legacy PM methods only when pcie
> > port
> > > > PM
> > > > > > is usable,
> > > > > > > > > +                * otherwise the device might fail to power
> > off
> > > > or
> > > > > > power on.
> > > > > > > > > +                */
> > > > > > > > > +               atpx->functions.power_cntl =
> > > > > > > !amdgpu_atpx_priv.bridge_pm_usable;
> > > > > > > > >                 atpx->is_hybrid = true;
> > > > > > > > >         }
> > > > > > > > >
> > > > > > > > > @@ -471,6 +476,7 @@ static int amdgpu_atpx_power_state(enum
> > > > > > > vga_switcheroo_client_id id,
> > > > > > > > >   */
> > > > > > > > >  static bool amdgpu_atpx_pci_probe_handle(struct pci_dev
> > *pdev)
> > > > > > > > >  {
> > > > > > > > > +       struct pci_dev *parent_pdev =
> > pci_upstream_bridge(pdev);
> > > > > > > > >         acpi_handle dhandle, atpx_handle;
> > > > > > > > >         acpi_status status;
> > > > > > > > >
> > > > > > > > > @@ -485,6 +491,7 @@ static bool
> > > > > > > amdgpu_atpx_pci_probe_handle(struct pci_dev *pdev)
> > > > > > > > >         }
> > > > > > > > >         amdgpu_atpx_priv.dhandle = dhandle;
> > > > > > > > >         amdgpu_atpx_priv.atpx.handle = atpx_handle;
> > > > > > > > > +       amdgpu_atpx_priv.bridge_pm_usable = parent_pdev &&
> > > > > > > parent_pdev->bridge_d3;
> > > > > > > > >         return true;
> > > > > > > > >  }
> > > > > > > > >
> > > > > > > > > --
> > > > > > > > > 2.10.2
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-drm-amdgpu-fix-check-for-port-PM-availability.patch
Type: text/x-diff
Size: 2666 bytes
Desc: not available
URL: <https://lists.freedesktop.org/archives/amd-gfx/attachments/20161126/75b944ad/attachment.patch>


More information about the amd-gfx mailing list