[6.13.6 stable regression?] Nouveau reboot failure in r535_gsp_msg_recv()
Timur Tabi
ttabi at nvidia.com
Mon Apr 7 17:14:35 UTC 2025
On Mon, 2025-04-07 at 18:01 +0100, David Woodhouse wrote:
> Not exactly the same build (I'm on 6.14 now) but:
>
> (gdb) list *(r535_gsp_msgq_wait+0x1c4)
> 0xd24 is in r535_gsp_msgq_wait
> (drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c:117).
> 112 break;
> 113
> 114 usleep_range(1, 2);
> 115 } while (--(*ptime));
> 116
> 117 if (WARN_ON(!*ptime))
> 118 return ERR_PTR(-ETIMEDOUT);
This tells me that GSP-RM has crashed, which explains a lot of the behavior
you're seeing.
What I need now are the GSP-RM logs. In your /etc/modprobe.d, see if there
is a file with "options nouveau". If there isn't, create one, and then add
the "keep-gsp-logging=1" parameter, so it looks something like this:
options nouveau keep-gsp-logging=1
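As a rough sketch of that step (not an official procedure), the check and the edit could be scripted as below. The function name ensure_gsp_logging and the file name nouveau-gsp.conf are made up for illustration; any .conf file name under /etc/modprobe.d should work. It needs to run as root to write there.

```shell
# Sketch: add keep-gsp-logging=1 unless an "options nouveau" line already exists.
# ensure_gsp_logging and nouveau-gsp.conf are hypothetical names, not from the thread.
ensure_gsp_logging() {
    conf_dir="${1:-/etc/modprobe.d}"
    conf_file="$conf_dir/nouveau-gsp.conf"

    # List any .conf files that already carry an "options nouveau" line.
    existing=$(grep -ls '^[[:space:]]*options[[:space:]]\{1,\}nouveau' \
        "$conf_dir"/*.conf 2>/dev/null || true)

    if [ -n "$existing" ]; then
        # Do not guess how to merge; leave the existing line for a manual edit.
        echo "existing 'options nouveau' line found in:"
        echo "$existing"
        echo "append keep-gsp-logging=1 to that line by hand"
        return 1
    fi

    # No existing options line, so create a new file with just this parameter.
    echo "options nouveau keep-gsp-logging=1" > "$conf_file"
}
```

Calling ensure_gsp_logging with no argument targets /etc/modprobe.d; passing a directory is only there so the function can be tried on a scratch copy first.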
Reboot and then tell me if you see anything like this:
# ls -lR /sys/kernel/debug/nouveau/
/sys/kernel/debug/nouveau/:
total 0
drwxr-xr-x 2 root root 0 Apr 7 12:06 0000:65:00.0
'/sys/kernel/debug/nouveau/0000:65:00.0':
total 0
-r--r--r-- 1 root root 65536 Apr 7 12:06 loginit
-r--r--r-- 1 root root 65536 Apr 7 12:06 logintr
-r--r--r-- 1 root root 4096 Apr 7 12:06 logpmu
-r--r--r-- 1 root root 65536 Apr 7 12:06 logrm
If you do, I need the contents of these files. So e.g.:
cp /sys/kernel/debug/nouveau/0000:65:00.0/loginit loginit
cp /sys/kernel/debug/nouveau/0000:65:00.0/logrm logrm
cp /sys/kernel/debug/nouveau/0000:65:00.0/logpmu logpmu
cp /sys/kernel/debug/nouveau/0000:65:00.0/logintr logintr
You may only see some of these files; that's okay.
Zip them up and email them to me.
> Any clues on how to debug the USB-C output, and where to report that?
No, I can't help with that.