4.20.0-rc3 nouveau/Quadro P2000 Mobile: runpm causing ACPI errors, lockups
Mika Westerberg
mika.westerberg at linux.intel.com
Wed Nov 28 15:55:44 UTC 2018
On Wed, Nov 28, 2018 at 10:09:22AM -0500, Michael S. Tsirkin wrote:
> Yea all this is weird, in particular I wonder why does everyone
> using dsm insists on saying Arg4
> when they actually mean Arg3. ACPI numbers arguments from 0.
>
> So it's a bit ugly, and maybe worth fixing but unlikely to be
> an actual issue simply because we end up not using DSM in the end.
I agree.
> Poking at the probing code in nouveau_pr3_present, I started to wonder:
> should I try to hack it to disable d3cold and pr3 and see what
> happens?
I guess it is worth a try. You can do it from sysfs: the graphics PCI
device has an attribute, d3cold_allowed, that controls this.
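A minimal sketch of flipping that attribute from a script. The address
0000:01:00.0 is only a guess at where the GPU sits (check lspci for the real
one), and the helper name and its sysfs_root parameter are made up for
illustration. Only the d3cold_allowed attribute itself comes from the kernel's
PCI sysfs interface. Writing to it needs root.

```python
from pathlib import Path


def set_d3cold_allowed(bdf, allowed, sysfs_root="/sys/bus/pci/devices"):
    """Write 0 or 1 to the device's d3cold_allowed attribute; return its path."""
    attr = Path(sysfs_root) / bdf / "d3cold_allowed"
    attr.write_text("1" if allowed else "0")
    return attr


# Example (hypothetical address, run as root):
#   set_d3cold_allowed("0000:01:00.0", allowed=False)
```

The sysfs_root parameter is there only so the helper can be tried against a
scratch directory before pointing it at the real /sys.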
[snip]
> > > 00:14.3 Network controller: Intel Corporation Wireless-AC 9560 [Jefferson Peak] (rev 10)
> > >
> > > so really shouldn't be affected, but go figure. If driver really is getting
> > > all-ones from the device, it just might try to poke at a wrong b:d.f by mistake
> > > maybe ...
> >
> > Or if the power resource is shared by wifi as well.
>
> Is there a way to find out through e.g. sysfs?
It is not shared; I checked from the acpidump you provided. Possibly the
infinite loop in AML when executing the NVPO method has some effect on
this.
[snip]
> > No need to send, I can read it from the bugzilla just fine. Can you attach
> > acpidump there as well?
>
> Done. lspci -x too just in case.
Looking at the dmesg:
[ 52.917009] No Local Variables are initialized for Method [NVPO]
[ 52.917011] No Arguments are initialized for method [NVPO]
[ 52.917012] ACPI Error: Method parse/execution failed \_SB.PCI0.PEG0.PEGP.NVPO, AE_AML_LOOP_TIMEOUT (20181003/psparse-516)
[ 52.917063] ACPI Error: Method parse/execution failed \_SB.PCI0.PGON, AE_AML_LOOP_TIMEOUT (20181003/psparse-516)
[ 52.917084] ACPI Error: Method parse/execution failed \_SB.PCI0.PEG0.PG00._ON, AE_AML_LOOP_TIMEOUT (20181003/psparse-516)
So what happens here is that Linux turns on power resource
\_SB.PCI0.PEG0.PG00 by calling its _ON method (happens when the root
port is runtime resumed). This ends up calling \_SB.PCI0.PGON, which
calls \_SB.PCI0.PEG0.PEGP.NVPO.
The last method looks like this:
    Method (NVPO, 0, NotSerialized)
    {
        While ((\_SB.PCI0.P0LS < 0x03))
        {
            Sleep (One)
        }
So basically it polls the P0LS register indefinitely for as long as the
returned value is less than 3. I suspect this is the issue, and that it
then makes other devices, like wifi, fail to execute their methods.
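To make the failure mode concrete, here is a rough Python model of that loop
with a deadline added, similar in spirit to the interpreter giving up with
AE_AML_LOOP_TIMEOUT in the log above. It is not how ACPICA implements it. The
function name, the read_p0ls callback and the default timeout are all
assumptions for illustration.

```python
import time


def wait_for_p0ls(read_p0ls, timeout_s=1.0, poll_interval_s=0.001):
    """Poll until read_p0ls() returns >= 3; return False if the deadline passes."""
    deadline = time.monotonic() + timeout_s
    while read_p0ls() < 0x03:
        if time.monotonic() >= deadline:
            # The register never reached 3: the analogue of the AML loop timeout.
            return False
        time.sleep(poll_interval_s)  # mirrors Sleep (One) in the ASL
    return True
```

With a register that never reaches 3, the unbounded ASL version has no exit
of its own, so the only thing that stops it is the interpreter's timeout.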
P0LS comes from this operation region:
    OperationRegion (OPG0, SystemMemory, (XBAS + 0x8000), 0x1000)
    Field (OPG0, AnyAcc, NoLock, Preserve)
    {
        ...
        Offset (0x216),
        P0LS, 4,
This is some host bridge register, but I am not sure which one, because the
XBAS value cannot be determined from the acpidump.
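One way to narrow it down, under an assumption the acpidump does not confirm:
if XBAS is the PCIe ECAM (MMCONFIG) base, then an offset from it splits into
bus, device, function and config-space register using the standard ECAM
layout. The helper below is a sketch with a made-up name.

```python
def decode_ecam_offset(offset):
    """Split an offset from the ECAM base into (bus, device, function, register)."""
    bus = (offset >> 20) & 0xFF
    device = (offset >> 15) & 0x1F
    function = (offset >> 12) & 0x07
    register = offset & 0xFFF
    return bus, device, function, register


# OPG0 starts at XBAS + 0x8000 and P0LS sits at byte 0x216 inside it.
bus, device, function, register = decode_ecam_offset(0x8000 + 0x216)
print(f"{bus:02x}:{device:02x}.{function} register {register:#x}")
# prints "00:01.0 register 0x216"
```

Under that assumption P0LS would be a 4-bit field at offset 0x216 in the
config space of device 00:01.0. If XBAS is something else, this decoding does
not apply.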