[6.13.6 stable regression?] Nouveau reboot failure in r535_gsp_msg_recv()

Timur Tabi ttabi at nvidia.com
Mon Apr 7 17:14:35 UTC 2025


On Mon, 2025-04-07 at 18:01 +0100, David Woodhouse wrote:

> > 
> 
> Not exactly the same build (I'm on 6.14 now) but:
> 
> (gdb) list *(r535_gsp_msgq_wait+0x1c4)
> 0xd24 is in r535_gsp_msgq_wait
> (drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c:117).
> 112				break;
> 113	
> 114			usleep_range(1, 2);
> 115		} while (--(*ptime));
> 116	
> 117		if (WARN_ON(!*ptime))
> 118			return ERR_PTR(-ETIMEDOUT);

This tells me that GSP-RM has crashed, which explains a lot of the behavior
you're seeing.

What I need now are the GSP-RM logs.  In your /etc/modprobe.d, see if there
is a file with "options nouveau".  If there isn't, create one, and then add
the "keep-gsp-logging=1" parameter, so it looks something like this:

	options nouveau keep-gsp-logging=1
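
A minimal sketch of that check-then-create step as a shell function, in case it helps. The function name ensure_gsp_logging and the file name nouveau-gsp.conf are made up for illustration; any file name ending in .conf under /etc/modprobe.d should do. The match accepts both '-' and '_' in the parameter name, since the kernel's parameter parser treats them as equivalent. If an "options nouveau" line already exists without the parameter, the sketch only reports the file and leaves the edit to you.

```shell
#!/bin/sh
# Sketch only: ensure_gsp_logging and nouveau-gsp.conf are hypothetical names.
# Usage: ensure_gsp_logging [CONF_DIR]   (CONF_DIR defaults to /etc/modprobe.d)
ensure_gsp_logging() {
    conf_dir="${1:-/etc/modprobe.d}"

    # Already enabled somewhere? Accept '-' or '_' in the parameter name.
    if grep -Eqs '^[[:space:]]*options[[:space:]]+nouveau[[:space:]].*keep[-_]gsp[-_]logging=1' "$conf_dir"/* 2>/dev/null; then
        echo "keep-gsp-logging=1 is already set"
        return 0
    fi

    # An "options nouveau" line exists without the parameter:
    # report the file so the line can be edited by hand.
    existing=$(grep -Els '^[[:space:]]*options[[:space:]]+nouveau([[:space:]]|$)' "$conf_dir"/* 2>/dev/null)
    if [ -n "$existing" ]; then
        echo "append keep-gsp-logging=1 to the options line in:"
        echo "$existing"
        return 0
    fi

    # No "options nouveau" line anywhere: create a new file with one.
    printf 'options nouveau keep-gsp-logging=1\n' > "$conf_dir/nouveau-gsp.conf"
    echo "created $conf_dir/nouveau-gsp.conf"
}
```

Run as root, a plain "ensure_gsp_logging" call operates on /etc/modprobe.d; passing a scratch directory as the first argument lets you try it without touching the real configuration.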

Reboot and then tell me if you see anything like this:

# ls -lR /sys/kernel/debug/nouveau/
/sys/kernel/debug/nouveau/:
total 0
drwxr-xr-x 2 root root 0 Apr  7 12:06 0000:65:00.0

'/sys/kernel/debug/nouveau/0000:65:00.0':
total 0
-r--r--r-- 1 root root 65536 Apr  7 12:06 loginit
-r--r--r-- 1 root root 65536 Apr  7 12:06 logintr
-r--r--r-- 1 root root  4096 Apr  7 12:06 logpmu
-r--r--r-- 1 root root 65536 Apr  7 12:06 logrm

If you do, I need the contents of these files.  So e.g.:

cp /sys/kernel/debug/nouveau/0000:65:00.0/loginit loginit
cp /sys/kernel/debug/nouveau/0000:65:00.0/logrm logrm
cp /sys/kernel/debug/nouveau/0000:65:00.0/logpmu logpmu
cp /sys/kernel/debug/nouveau/0000:65:00.0/logintr logintr

You may see only some of these files; that's okay.

Zip them up and email them to me.
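
The copy-and-archive steps above can be sketched as one shell function that copies whichever of the four log files exist and skips the rest. The function name collect_gsp_logs and the default output directory gsp-logs are made up for illustration, and the sketch uses tar with gzip instead of zip, on the assumption that tar is more likely to be installed.

```shell
#!/bin/sh
# Sketch only: collect_gsp_logs and the gsp-logs directory are hypothetical names.
# Usage: collect_gsp_logs SRC_DIR [OUT_DIR]
collect_gsp_logs() {
    src="$1"
    out="${2:-gsp-logs}"

    mkdir -p "$out" || return 1

    # Copy only the log files that are actually present and readable.
    for name in loginit logintr logpmu logrm; do
        if [ -r "$src/$name" ]; then
            cp "$src/$name" "$out/$name"
        fi
    done

    # Pack the output directory into OUT_DIR.tar.gz next to it.
    tar -czf "$out.tar.gz" -C "$(dirname "$out")" "$(basename "$out")"
}
```

For the listing shown earlier, that would be "collect_gsp_logs /sys/kernel/debug/nouveau/0000:65:00.0" run as root, which leaves gsp-logs.tar.gz in the current directory. The PCI address will differ on other machines.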


> Any clues on how to debug the USB-C output, and where to report that? 

No, I can't help with that.



