[PATCH 1/2] drm/amdgpu: fix power state when port pm is unavailable

Peter Wu peter at lekensteyn.nl
Tue Nov 29 00:58:06 UTC 2016


On Mon, Nov 28, 2016 at 05:28:00PM -0500, Alex Deucher wrote:
> On Mon, Nov 28, 2016 at 12:48 AM, Mike Lothian <mike at fireburn.co.uk> wrote:
> > The attached patch allows my machine to boot also
> >
> > Thanks
> >
> > Mike
> >
> > On Sat, 26 Nov 2016 at 14:14 Peter Wu <peter at lekensteyn.nl> wrote:
> >>
> >> Found the problem.
> >>
> >> amdgpu_atpx_pci_probe_handle is used to detect the ATPX handle, but this
> >> does not necessarily have to be on the dGPU. In fact, on your machine it
> >> is located on the Intel iGPU (00:02.0):
> >>
> >> [  134.760509] vga_switcheroo: detected switching method
> >> \_SB_.PCI0.GFX0.ATPX handle
> >>
> >> The parent device of the iGPU (00:00.0) is the root port which has no
> >> PCIe capability, hence bridge_d3 evaluated to false.
> >>
> >> Can you try the attached patch?
> 
> Looks good to me.  Patch for radeon included as well.
> 
> Alex

Thanks, the port looks good to me. (If still possible, I think you can
change Reported-by also to Reported-and-tested-by for the amdgpu patch.)

Kind regards,
Peter

> >>
> >> Kind regards,
> >> Peter
> >>
> >> On Sat, Nov 26, 2016 at 01:49:08PM +0000, Mike Lothian wrote:
> >> > I can confirm the patch allows my machine to boot again
> >> >
> >> > On Sat, 26 Nov 2016 at 11:39 Peter Wu <peter at lekensteyn.nl> wrote:
> >> >
> >> > > Hi Mike,
> >> > >
> >> > > To grab console output, you could also try to blacklist amdgpu at
> >> > > startup and load it later.
> >> > >
> >> > > Can you provide your acpidump (and lspci -nn, especially if you do not
> >> > > have a dmesg).
> >> > >
> >> > > This is the call chain:
> >> > >
> >> > > + amdgpu_atpx_detect
> >> > >   + amdgpu_atpx_pci_probe_handle
> >> > >     + return if ATPX ACPI handle cannot be found
> >> > >     + set amdgpu_atpx_priv.bridge_pm_usable
> >> > >   + amdgpu_atpx_init
> >> > >     + amdgpu_atpx_validate
> >> > >       + use amdgpu_atpx_priv.bridge_pm_usable
> >> > >
> >> > > I wonder whether newer devices remove ATPX completely. If that is the
> >> > > case (please provide acpidump!), can you try this hack:
> >> > >
> >> > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_atpx_handler.c
> >> > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_atpx_handler.c
> >> > > index 951addf..5c23382 100644
> >> > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_atpx_handler.c
> >> > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_atpx_handler.c
> >> > > @@ -487,6 +487,7 @@ static bool amdgpu_atpx_pci_probe_handle(struct
> >> > > pci_dev *pdev)
> >> > >         status = acpi_get_handle(dhandle, "ATPX", &atpx_handle);
> >> > >         if (ACPI_FAILURE(status)) {
> >> > >                 amdgpu_atpx_priv.other_handle = dhandle;
> >> > > +               amdgpu_atpx_priv.bridge_pm_usable = true;
> >> > >                 return false;
> >> > >         }
> >> > >         amdgpu_atpx_priv.dhandle = dhandle;
> >> > >
> >> > > This is not intended for merging, but just testing a hypothesis.
> >> > >
> >> > > Kind regards,
> >> > > Peter
> >> > >
> >> > > On Sat, Nov 26, 2016 at 11:20:28AM +0000, Mike Lothian wrote:
> >> > > > He's some screen shots of the booting screens, apologies they're not
> >> > > > that
> >> > > > clear
> >> > > >
> >> > > > https://imgur.com/a/V7mRA
> >> > > >
> >> > > > On Sat, 26 Nov 2016 at 11:00 Mike Lothian <mike at fireburn.co.uk>
> >> > > > wrote:
> >> > > >
> >> > > > > Hi
> >> > > > >
> >> > > > > The previous working kernel, was the patch before this - reverting
> >> > > fixes it
> >> > > > >
> >> > > > > My bios date is quite new 08/05/2016
> >> > > > >
> >> > > > > Most of the messages fly straight off the screen, the journal
> >> > > > > isn't
> >> > > kept
> >> > > > > and the wifi driver also complains, will see if I can get a slow
> >> > > > > motion
> >> > > > > capture of the errors and screen shot them
> >> > > > >
> >> > > > > Regards
> >> > > > >
> >> > > > > Mike
> >> > > > >
> >> > > > > On Sat, 26 Nov 2016 at 10:20 Peter Wu <peter at lekensteyn.nl> wrote:
> >> > > > >
> >> > > > > Hi Mike,
> >> > > > >
> >> > > > > On Sat, Nov 26, 2016 at 12:36:27AM +0000, Mike Lothian wrote:
> >> > > > > > This patch is preventing my laptop from booting, I'm getting D3
> >> > > > > > error
> >> > > > > > messages and atom bios stuck messages
> >> > > > >
> >> > > > > Can you post a dmesg and acpidump? What was the previous working
> >> > > kernel?
> >> > > > > I suppose your BIOS date (/sys/class/dmi/id/bios_date) is
> >> > > > > pre-2015?
> >> > > > > Does booting with runpm=0 work?
> >> > > > >
> >> > > > > > I mistakenly saw the v2 patch and didn't realise it was for
> >> > > > > > radeon
> >> > > not
> >> > > > > > amdgpu - this branch and Linus's tree are currently not booting
> >> > > > > > for
> >> > > me
> >> > > > >
> >> > > > > What do you mean by this?
> >> > > > >
> >> > > > > Kind regards,
> >> > > > > Peter
> >> > > > >
> >> > > > > > On Wed, 23 Nov 2016 at 17:16 Deucher, Alexander <
> >> > > > > Alexander.Deucher at amd.com>
> >> > > > > > wrote:
> >> > > > > >
> >> > > > > > > > -----Original Message-----
> >> > > > > > > > From: Peter Wu [mailto:peter at lekensteyn.nl]
> >> > > > > > > > Sent: Wednesday, November 23, 2016 12:13 PM
> >> > > > > > > > To: Alex Deucher
> >> > > > > > > > Cc: amd-gfx list; Deucher, Alexander; Nayan Deshmukh
> >> > > > > > > > Subject: Re: [PATCH 1/2] drm/amdgpu: fix power state when
> >> > > > > > > > port
> >> > > pm is
> >> > > > > > > > unavailable
> >> > > > > > > >
> >> > > > > > > > On Wed, Nov 23, 2016 at 12:01:55PM -0500, Alex Deucher
> >> > > > > > > > wrote:
> >> > > > > > > > >  On Tue, Nov 22, 2016 at 8:22 PM, Peter Wu <
> >> > > peter at lekensteyn.nl>
> >> > > > > > > wrote:
> >> > > > > > > > > > When PCIe port PM is not enabled (system BIOS is
> >> > > > > > > > > > pre-2015 or
> >> > > the
> >> > > > > > > > > > pcie_port_pm=off parameter is set), legacy ATPX PM
> >> > > > > > > > > > should
> >> > > still
> >> > > > > be
> >> > > > > > > > > > marked as supported. Otherwise the GPU can fail to power
> >> > > > > > > > > > on
> >> > > after
> >> > > > > > > > > > runtime suspend. This affected a Dell Inspiron 5548.
> >> > > > > > > > > >
> >> > > > > > > > > > Ideally the BIOS date in the PCI core is lowered to 2013
> >> > > > > > > > > > (the
> >> > > > > first
> >> > > > > > > year
> >> > > > > > > > > > where hybrid graphics platforms using power resources
> >> > > > > > > > > > was
> >> > > > > > > introduced),
> >> > > > > > > > > > but that seems more risky at this point and would not
> >> > > > > > > > > > solve
> >> > > the
> >> > > > > > > > > > pcie_port_pm=off issue.
> >> > > > > > > > > >
> >> > > > > > > > > > Bugzilla:
> >> > > > > > > > > > https://bugs.freedesktop.org/show_bug.cgi?id=98505
> >> > > > > > > > > > Reported-and-tested-by: Nayan Deshmukh
> >> > > > > > > > <nayan26deshmukh at gmail.com>
> >> > > > > > > > > > Signed-off-by: Peter Wu <peter at lekensteyn.nl>
> >> > > > > > > > > > ---
> >> > > > > > > > > > Hi,
> >> > > > > > > > > >
> >> > > > > > > > > > This patch is already three weeks old. One alternative
> >> > > > > > > > > > idea
> >> > > was
> >> > > > > > > lowering
> >> > > > > > > > BIOS
> >> > > > > > > > > > date in PCI core, but as pcie_port_pm=force did not have
> >> > > > > > > > > > the
> >> > > > > desired
> >> > > > > > > > effect, I
> >> > > > > > > > > > do not think that this would help though.
> >> > > > > > > > >
> >> > > > > > > > > Thanks for doing this.
> >> > > > > > > > >
> >> > > > > > > > > >
> >> > > > > > > > > > I have also not contacted linux-pci or Mika about
> >> > > > > > > > > > lowering
> >> > > the
> >> > > > > year
> >> > > > > > > due to
> >> > > > > > > > the
> >> > > > > > > > > > lack of a good reason.  Might do it later though once
> >> > > > > > > > > > ACPICA
> >> > > bug
> >> > > > > > > 1333 is
> >> > > > > > > > sorted
> >> > > > > > > > > > out such that lowering the year actually has benefits
> >> > > > > > > > > > for a
> >> > > > > Nvidia
> >> > > > > > > laptop
> >> > > > > > > > (or if
> >> > > > > > > > > > some amdgpu problem can be solved by this).
> >> > > > > > > > > >
> >> > > > > > > > > > Both patches should probably be Cc stable (4.8+), fixing
> >> > > > > > > 2f5af82eeab2 and
> >> > > > > > > > > > b8c9fd5ad4b4 ("track whether if this is a hybrid
> >> > > > > > > > > > graphics
> >> > > > > > > platform"). There
> >> > > > > > > > have
> >> > > > > > > > > > been some ifdef 0's and reverts in between, so I was not
> >> > > sure if
> >> > > > > > > adding
> >> > > > > > > > the
> >> > > > > > > > > > Fixes tag is appropriate.
> >> > > > > > > > >
> >> > > > > > > > > I don't think we need to cc stable.  In kernel 4.8 we
> >> > > > > > > > > don't
> >> > > > > attempt to
> >> > > > > > > > > do d3cold at all.  4.8 and older kernels have:
> >> > > > > > > > > c63695cc5e5f685e924e25a8f9555f6e846f1fc6 (drm/amdgpu: work
> >> > > around
> >> > > > > > > > lack
> >> > > > > > > > > of upstream ACPI support for D3cold)
> >> > > > > > > > > which was reverted for 4.9.
> >> > > > > > > >
> >> > > > > > > > That patch was already reverted in 4.8 as far as I can see:
> >> > > > > > > >
> >> > > > > > > > $ git tag --contains
> >> > > > > > > > c39b487f195b93235ee76384427467786f7bf29f |
> >> > > grep
> >> > > > > v4.8
> >> > > > > > > > v4.8
> >> > > > > > > >
> >> > > > > > > > Can you double-check? v4.8 was the first kernel with D3cold
> >> > > support
> >> > > > > in
> >> > > > > > > > PCI core.
> >> > > > > > >
> >> > > > > > > You are right.  I was thinking d3cold went upstream in 4.9, so
> >> > > yes, we
> >> > > > > > > should apply this to 4.8.
> >> > > > > > >
> >> > > > > > > Alex
> >> > > > > > >
> >> > > > > > > >
> >> > > > > > > > > Series is:
> >> > > > > > > > > Reviewed-by: Alex Deucher <alexander.deucher at amd.com>
> >> > > > > > > >
> >> > > > > > > > Thanks!
> >> > > > > > > >
> >> > > > > > > > Kind regards,
> >> > > > > > > > Peter
> >> > > > > > > >
> >> > > > > > > > >
> >> > > > > > > > > >
> >> > > > > > > > > > Kind regards,
> >> > > > > > > > > > Peter
> >> > > > > > > > > > ---
> >> > > > > > > > > >  drivers/gpu/drm/amd/amdgpu/amdgpu_atpx_handler.c | 9
> >> > > ++++++++-
> >> > > > > > > > > >  1 file changed, 8 insertions(+), 1 deletion(-)
> >> > > > > > > > > >
> >> > > > > > > > > > diff --git
> >> > > > > > > > > > a/drivers/gpu/drm/amd/amdgpu/amdgpu_atpx_handler.c
> >> > > > > > > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_atpx_handler.c
> >> > > > > > > > > > index 10b5ddf..951addf 100644
> >> > > > > > > > > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_atpx_handler.c
> >> > > > > > > > > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_atpx_handler.c
> >> > > > > > > > > > @@ -33,6 +33,7 @@ struct amdgpu_atpx {
> >> > > > > > > > > >
> >> > > > > > > > > >  static struct amdgpu_atpx_priv {
> >> > > > > > > > > >         bool atpx_detected;
> >> > > > > > > > > > +       bool bridge_pm_usable;
> >> > > > > > > > > >         /* handle for device - and atpx */
> >> > > > > > > > > >         acpi_handle dhandle;
> >> > > > > > > > > >         acpi_handle other_handle;
> >> > > > > > > > > > @@ -200,7 +201,11 @@ static int
> >> > > > > > > > > > amdgpu_atpx_validate(struct
> >> > > > > > > > amdgpu_atpx *atpx)
> >> > > > > > > > > >         atpx->is_hybrid = false;
> >> > > > > > > > > >         if (valid_bits & ATPX_MS_HYBRID_GFX_SUPPORTED) {
> >> > > > > > > > > >                 printk("ATPX Hybrid Graphics\n");
> >> > > > > > > > > > -               atpx->functions.power_cntl = false;
> >> > > > > > > > > > +               /*
> >> > > > > > > > > > +                * Disable legacy PM methods only when
> >> > > > > > > > > > pcie
> >> > > port
> >> > > > > PM
> >> > > > > > > is usable,
> >> > > > > > > > > > +                * otherwise the device might fail to
> >> > > > > > > > > > power
> >> > > off
> >> > > > > or
> >> > > > > > > power on.
> >> > > > > > > > > > +                */
> >> > > > > > > > > > +               atpx->functions.power_cntl =
> >> > > > > > > > !amdgpu_atpx_priv.bridge_pm_usable;
> >> > > > > > > > > >                 atpx->is_hybrid = true;
> >> > > > > > > > > >         }
> >> > > > > > > > > >
> >> > > > > > > > > > @@ -471,6 +476,7 @@ static int
> >> > > > > > > > > > amdgpu_atpx_power_state(enum
> >> > > > > > > > vga_switcheroo_client_id id,
> >> > > > > > > > > >   */
> >> > > > > > > > > >  static bool amdgpu_atpx_pci_probe_handle(struct pci_dev
> >> > > *pdev)
> >> > > > > > > > > >  {
> >> > > > > > > > > > +       struct pci_dev *parent_pdev =
> >> > > pci_upstream_bridge(pdev);
> >> > > > > > > > > >         acpi_handle dhandle, atpx_handle;
> >> > > > > > > > > >         acpi_status status;
> >> > > > > > > > > >
> >> > > > > > > > > > @@ -485,6 +491,7 @@ static bool
> >> > > > > > > > amdgpu_atpx_pci_probe_handle(struct pci_dev *pdev)
> >> > > > > > > > > >         }
> >> > > > > > > > > >         amdgpu_atpx_priv.dhandle = dhandle;
> >> > > > > > > > > >         amdgpu_atpx_priv.atpx.handle = atpx_handle;
> >> > > > > > > > > > +       amdgpu_atpx_priv.bridge_pm_usable = parent_pdev
> >> > > > > > > > > > &&
> >> > > > > > > > parent_pdev->bridge_d3;
> >> > > > > > > > > >         return true;
> >> > > > > > > > > >  }
> >> > > > > > > > > >
> >> > > > > > > > > > --
> >> > > > > > > > > > 2.10.2

> From cd21b5055cca49b30b0caaf1107a9aaeb60a447f Mon Sep 17 00:00:00 2001
> From: Alex Deucher <alexander.deucher at amd.com>
> Date: Mon, 28 Nov 2016 17:23:40 -0500
> Subject: [PATCH] drm/radeon: fix check for port PM availability
> 
> The ATPX method does not always exist on the dGPU, it may be located at
> the iGPU. The parent device of the iGPU is the root port for which
> bridge_d3 is false. This accidentally enables the legacy PM method which
> conflicts with port PM and prevented the dGPU from powering on.
> 
> Ported from amdgpu commit:
> drm/amdgpu: fix check for port PM availability
> from Peter Wu.
> 
> Fixes: d3ac31f3b4bf9fad (drm/radeon: fix power state when port pm is unavailable (v2))
> Signed-off-by: Alex Deucher <alexander.deucher at amd.com>
> Cc: Peter Wu <peter at lekensteyn.nl>
> Cc: <stable at vger.kernel.org> # 4.8+
> ---
>  drivers/gpu/drm/radeon/radeon_atpx_handler.c | 11 +++++++++--
>  1 file changed, 9 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/radeon/radeon_atpx_handler.c b/drivers/gpu/drm/radeon/radeon_atpx_handler.c
> index 4129b12..0ae13cd2 100644
> --- a/drivers/gpu/drm/radeon/radeon_atpx_handler.c
> +++ b/drivers/gpu/drm/radeon/radeon_atpx_handler.c
> @@ -479,7 +479,6 @@ static int radeon_atpx_power_state(enum vga_switcheroo_client_id id,
>   */
>  static bool radeon_atpx_pci_probe_handle(struct pci_dev *pdev)
>  {
> -	struct pci_dev *parent_pdev = pci_upstream_bridge(pdev);
>  	acpi_handle dhandle, atpx_handle;
>  	acpi_status status;
>  
> @@ -493,7 +492,6 @@ static bool radeon_atpx_pci_probe_handle(struct pci_dev *pdev)
>  
>  	radeon_atpx_priv.dhandle = dhandle;
>  	radeon_atpx_priv.atpx.handle = atpx_handle;
> -	radeon_atpx_priv.bridge_pm_usable = parent_pdev && parent_pdev->bridge_d3;
>  	return true;
>  }
>  
> @@ -555,11 +553,16 @@ static bool radeon_atpx_detect(void)
>  	struct pci_dev *pdev = NULL;
>  	bool has_atpx = false;
>  	int vga_count = 0;
> +	bool d3_supported = false;
> +	struct pci_dev *parent_pdev;
>  
>  	while ((pdev = pci_get_class(PCI_CLASS_DISPLAY_VGA << 8, pdev)) != NULL) {
>  		vga_count++;
>  
>  		has_atpx |= (radeon_atpx_pci_probe_handle(pdev) == true);
> +
> +		parent_pdev = pci_upstream_bridge(pdev);
> +		d3_supported |= parent_pdev && parent_pdev->bridge_d3;
>  	}
>  
>  	/* some newer PX laptops mark the dGPU as a non-VGA display device */
> @@ -567,6 +570,9 @@ static bool radeon_atpx_detect(void)
>  		vga_count++;
>  
>  		has_atpx |= (radeon_atpx_pci_probe_handle(pdev) == true);
> +
> +		parent_pdev = pci_upstream_bridge(pdev);
> +		d3_supported |= parent_pdev && parent_pdev->bridge_d3;
>  	}
>  
>  	if (has_atpx && vga_count == 2) {
> @@ -574,6 +580,7 @@ static bool radeon_atpx_detect(void)
>  		printk(KERN_INFO "vga_switcheroo: detected switching method %s handle\n",
>  		       acpi_method_name);
>  		radeon_atpx_priv.atpx_detected = true;
> +		radeon_atpx_priv.bridge_pm_usable = d3_supported;
>  		radeon_atpx_init();
>  		return true;
>  	}
> -- 
> 2.5.5
> 


More information about the amd-gfx mailing list