[Nouveau] Nouveau: kernel hang on Optimus+Intel+NVidia GeForce 1060m
Tobias Klausmann
tobias.johannes.klausmann at mni.thm.de
Wed Sep 13 12:28:29 UTC 2017
Hi,
the system fails to initialize your vbios using secureboot (i had a rare
chance to on my system to witness it again), for now i traced it to
acr_boot_falcon() in
"linux/drivers/gpu/drm/nouveau/nvkm/falcon/msgqueue_0148cdec.c" where it
throws -110 which is -ETIMEDOUT. You could try to increase the timeout
and see if it helps something, similar to the following:
diff --git a/drivers/gpu/drm/nouveau/nvkm/falcon/msgqueue.c
b/drivers/gpu/drm/nouveau/nvkm/falcon/msgqueue.c
index 77273b53672c..fc0cb187d80d 100644
--- a/drivers/gpu/drm/nouveau/nvkm/falcon/msgqueue.c
+++ b/drivers/gpu/drm/nouveau/nvkm/falcon/msgqueue.c
@@ -326,7 +326,7 @@ nvkm_msgqueue_post(struct nvkm_msgqueue *priv, enum
msgqueue_msg_priority prio,
int ret;
if (wait_init && !wait_for_completion_timeout(&priv->init_done,
- msecs_to_jiffies(1000)))
+ msecs_to_jiffies(5000)))
return -ETIMEDOUT;
queue = priv->func->cmd_queue(priv, prio);
diff --git a/drivers/gpu/drm/nouveau/nvkm/falcon/msgqueue_0137c63d.c
b/drivers/gpu/drm/nouveau/nvkm/falcon/msgqueue_0137c63d.c
index fec0273158f6..c2ae525a0780 100644
--- a/drivers/gpu/drm/nouveau/nvkm/falcon/msgqueue_0137c63d.c
+++ b/drivers/gpu/drm/nouveau/nvkm/falcon/msgqueue_0137c63d.c
@@ -279,6 +279,7 @@ acr_boot_falcon(struct nvkm_msgqueue *priv, enum
nvkm_secboot_falcon falcon)
u32 flags;
u32 falcon_id;
} cmd;
+ const struct nvkm_subdev *subdev = priv->falcon->owner;
memset(&cmd, 0, sizeof(cmd));
@@ -290,7 +291,8 @@ acr_boot_falcon(struct nvkm_msgqueue *priv, enum
nvkm_secboot_falcon falcon)
nvkm_msgqueue_post(priv, MSGQUEUE_MSG_PRIORITY_HIGH, &cmd.hdr,
acr_boot_falcon_callback, &completed, true);
- if (!wait_for_completion_timeout(&completed,
msecs_to_jiffies(1000)))
+ nvkm_error(subdev, "waiting for timeout in acr_boot_falcon
(msgqueue_0137bca5)\n");
+ if (!wait_for_completion_timeout(&completed,
msecs_to_jiffies(5000)))
return -ETIMEDOUT;
return 0;
On 9/13/17 11:37 AM, Nicolas Mercier wrote:
> I am still looking for a solution. I have hacked around in the code
> and found out the following:
> - Nouveau prefers using PCIE power managemet over ACPI Optimus calls.
> I tried to force it to use Optimus ACPI calls, but there was an error
> calling the ACPI method so it bails out and uses PCIE PM anyway.
> - I tried to debug the PCIE pm states which internally uses ACPI to
> turn power on/off. I could print different statuses here and there.
> When the power is switched off, ACPI calls turn the power off then the
> kernel successfully puts the device in state D3Cold (also turning off
> power to the PCI Express port). When waking up, ACPI turns the power
> on, apparently successfully (Device [PEGP] transitioned to D0). But a
> read from the PCI bus to get the power state & other flags return
> 65535 (~0) and the kernel fails to set the device in D0 (although ACPI
> claims it is in D0)
> The call to pci_raw_set_power_state (in drivers/pci/pci.c) seems to
> fail because pci_read_config_word returns "~0" (and does not return
> any error code)
>
> I have tried different things; if I use pcie_port_pm=off, the NVidia
> card goes to state D3Hot (if I am not mistaken, its PCIE port is still
> powered) but that did not fix it. I tried to turn on or off different
> PCI/PCIexpress features such as hotplug, PM and so on. The only thing
> that works is that PM is fully disabled, which equals to the device
> not being powered off, so that would be equivalent to nouveau.runpm=0,
> which is not helping a lot. I have tried to force pcie aspm by
> recompiling the ACPI table, still no luck.
>
> I am still taking a look, but it seems like the problem comes from the
> PCIExpress PM functions and ACPI, not directly from Nouveau
>
> /n
More information about the Nouveau
mailing list