[bug report] drm/amdgpu: amdgpu crash on playing videos, linux 6.10-rc

Wang Yunchen mac-wang at sjtu.edu.cn
Wed May 29 04:05:51 UTC 2024


Hello,

After upgrading to Linux 6.10-rc1 (Mesa left untouched), I found a strange bug that can cause the GPU to
crash and reset while playing videos online with VA-API. The screen first starts to jitter, then flickers once or
twice, and the desktop session cannot be recovered. After a reboot I found the following messages in the system logs:

10:13:05 kernel: gmc_v11_0_process_interrupt: 52 callbacks suppressed
10:13:05 kernel: amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:8 vmid:1 pasid:32777)
10:13:05 kernel: amdgpu 0000:03:00.0: amdgpu:  in process RDD Process pid 2857 thread firefox:cs0 pid 2909)
10:13:05 kernel: amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000800103b05000 from client 18
10:13:05 kernel: amdgpu 0000:03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00103A11
10:13:05 kernel: amdgpu 0000:03:00.0: amdgpu:          Faulty UTCL2 client ID: unknown (0x1d)
10:13:05 kernel: amdgpu 0000:03:00.0: amdgpu:          MORE_FAULTS: 0x1
10:13:05 kernel: amdgpu 0000:03:00.0: amdgpu:          WALKER_ERROR: 0x0
10:13:05 kernel: amdgpu 0000:03:00.0: amdgpu:          PERMISSION_FAULTS: 0x1
10:13:05 kernel: amdgpu 0000:03:00.0: amdgpu:          MAPPING_ERROR: 0x0
10:13:05 kernel: amdgpu 0000:03:00.0: amdgpu:          RW: 0x0
10:13:05 kernel: amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:8 vmid:1 pasid:32777)
10:13:05 kernel: amdgpu 0000:03:00.0: amdgpu:  in process RDD Process pid 2857 thread firefox:cs0 pid 2909)
10:13:05 kernel: amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000800103b00000 from client 18
10:13:05 kernel: amdgpu 0000:03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00000000
10:13:05 kernel: amdgpu 0000:03:00.0: amdgpu:          Faulty UTCL2 client ID: VMC (0x0)
10:13:05 kernel: amdgpu 0000:03:00.0: amdgpu:          MORE_FAULTS: 0x0
10:13:05 kernel: amdgpu 0000:03:00.0: amdgpu:          WALKER_ERROR: 0x0
10:13:05 kernel: amdgpu 0000:03:00.0: amdgpu:          PERMISSION_FAULTS: 0x0
10:13:05 kernel: amdgpu 0000:03:00.0: amdgpu:          MAPPING_ERROR: 0x0
10:13:05 kernel: amdgpu 0000:03:00.0: amdgpu:          RW: 0x0
10:13:05 kernel: amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:8 vmid:1 pasid:32777)
10:13:05 kernel: amdgpu 0000:03:00.0: amdgpu:  in process RDD Process pid 2857 thread firefox:cs0 pid 2909)
10:13:05 kernel: amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000800103be8000 from client 18
10:13:05 kernel: amdgpu 0000:03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00000000
10:13:05 kernel: amdgpu 0000:03:00.0: amdgpu:          Faulty UTCL2 client ID: VMC (0x0)
10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          MORE_FAULTS: 0x0
10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          WALKER_ERROR: 0x0
10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          PERMISSION_FAULTS: 0x0
10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          MAPPING_ERROR: 0x0
10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          RW: 0x0
10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:8 vmid:1 pasid:32777)
10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:  in process RDD Process pid 2857 thread firefox:cs0 pid 2909)
10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000800103be6000 from client 18
10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00000000
10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          Faulty UTCL2 client ID: VMC (0x0)
10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          MORE_FAULTS: 0x0
10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          WALKER_ERROR: 0x0
10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          PERMISSION_FAULTS: 0x0
10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          MAPPING_ERROR: 0x0
10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          RW: 0x0
10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:8 vmid:1 pasid:32777)
10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:  in process RDD Process pid 2857 thread firefox:cs0 pid 2909)
10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000800103be6000 from client 18
10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00103A11
10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          Faulty UTCL2 client ID: unknown (0x1d)
10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          MORE_FAULTS: 0x1
10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          WALKER_ERROR: 0x0
10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          PERMISSION_FAULTS: 0x1
10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          MAPPING_ERROR: 0x0
10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          RW: 0x0
10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:8 vmid:1 pasid:32777)
10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:  in process RDD Process pid 2857 thread firefox:cs0 pid 2909)
10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000800103be8000 from client 18
10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00103A11
10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          Faulty UTCL2 client ID: unknown (0x1d)
10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          MORE_FAULTS: 0x1
10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          WALKER_ERROR: 0x0
10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          PERMISSION_FAULTS: 0x1
10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          MAPPING_ERROR: 0x0
10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          RW: 0x0
10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:8 vmid:1 pasid:32777)
10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:  in process RDD Process pid 2857 thread firefox:cs0 pid 2909)
10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000800103b05000 from client 18
10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00103A11
10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          Faulty UTCL2 client ID: unknown (0x1d)
10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          MORE_FAULTS: 0x1
10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          WALKER_ERROR: 0x0
10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          PERMISSION_FAULTS: 0x1
10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          MAPPING_ERROR: 0x0
10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          RW: 0x0
10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:8 vmid:1 pasid:32777)
10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:  in process RDD Process pid 2857 thread firefox:cs0 pid 2909)
10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000800103b00000 from client 18
10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00000000
10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          Faulty UTCL2 client ID: VMC (0x0)
10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          MORE_FAULTS: 0x0
10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          WALKER_ERROR: 0x0
10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          PERMISSION_FAULTS: 0x0
10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          MAPPING_ERROR: 0x0
10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          RW: 0x0
10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:8 vmid:1 pasid:32777)
10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:  in process RDD Process pid 2857 thread firefox:cs0 pid 2909)
10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000800103be6000 from client 18
10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00000000
10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          Faulty UTCL2 client ID: VMC (0x0)
10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          MORE_FAULTS: 0x0
10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          WALKER_ERROR: 0x0
10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          PERMISSION_FAULTS: 0x0
10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          MAPPING_ERROR: 0x0
10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          RW: 0x0
10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:8 vmid:1 pasid:32777)
10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:  in process RDD Process pid 2857 thread firefox:cs0 pid 2909)
10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000800103b01000 from client 18
10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00000000
10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          Faulty UTCL2 client ID: VMC (0x0)
10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          MORE_FAULTS: 0x0
10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          WALKER_ERROR: 0x0
10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          PERMISSION_FAULTS: 0x0
10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          MAPPING_ERROR: 0x0
10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          RW: 0x0
10:13:10 kernel: gmc_v11_0_process_interrupt: 222971 callbacks suppressed
10:13:10 kernel: amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:8 vmid:1 pasid:32777)
10:13:10 kernel: amdgpu 0000:03:00.0: amdgpu:  in process RDD Process pid 2857 thread firefox:cs0 pid 2909)
10:13:10 kernel: amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000800103b05000 from client 18
10:13:10 kernel: amdgpu 0000:03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00103A11
10:13:10 kernel: amdgpu 0000:03:00.0: amdgpu:          Faulty UTCL2 client ID: unknown (0x1d)
10:13:10 kernel: amdgpu 0000:03:00.0: amdgpu:          MORE_FAULTS: 0x1
10:13:10 kernel: amdgpu 0000:03:00.0: amdgpu:          WALKER_ERROR: 0x0
10:13:10 kernel: amdgpu 0000:03:00.0: amdgpu:          PERMISSION_FAULTS: 0x1
10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          MAPPING_ERROR: 0x0
10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          RW: 0x0
10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:8 vmid:1 pasid:32777)
10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:  in process RDD Process pid 2857 thread firefox:cs0 pid 2909)
10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000800103b00000 from client 18
10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00103A11
10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          Faulty UTCL2 client ID: unknown (0x1d)
10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          MORE_FAULTS: 0x1
10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          WALKER_ERROR: 0x0
10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          PERMISSION_FAULTS: 0x1
10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          MAPPING_ERROR: 0x0
10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          RW: 0x0
10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:8 vmid:1 pasid:32777)
10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:  in process RDD Process pid 2857 thread firefox:cs0 pid 2909)
10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000800103be6000 from client 18
10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00000000
10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          Faulty UTCL2 client ID: VMC (0x0)
10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          MORE_FAULTS: 0x0
10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          WALKER_ERROR: 0x0
10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          PERMISSION_FAULTS: 0x0
10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          MAPPING_ERROR: 0x0
10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          RW: 0x0
10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:8 vmid:1 pasid:32777)
10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:  in process RDD Process pid 2857 thread firefox:cs0 pid 2909)
10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000800103be8000 from client 18
10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00000000
10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          Faulty UTCL2 client ID: VMC (0x0)
10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          MORE_FAULTS: 0x0
10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          WALKER_ERROR: 0x0
10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          PERMISSION_FAULTS: 0x0
10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          MAPPING_ERROR: 0x0
10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          RW: 0x0
10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:8 vmid:1 pasid:32777)
10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:  in process RDD Process pid 2857 thread firefox:cs0 pid 2909)
10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000800103b05000 from client 18
10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00000000
10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          Faulty UTCL2 client ID: VMC (0x0)
10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          MORE_FAULTS: 0x0
10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          WALKER_ERROR: 0x0
10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          PERMISSION_FAULTS: 0x0
10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          MAPPING_ERROR: 0x0
10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          RW: 0x0
10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:8 vmid:1 pasid:32777)
10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:  in process RDD Process pid 2857 thread firefox:cs0 pid 2909)
10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000800103be6000 from client 18
10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00000000
10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          Faulty UTCL2 client ID: VMC (0x0)
10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          MORE_FAULTS: 0x0
10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          WALKER_ERROR: 0x0
10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          PERMISSION_FAULTS: 0x0
10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          MAPPING_ERROR: 0x0
10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          RW: 0x0
10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:8 vmid:1 pasid:32777)
10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:  in process RDD Process pid 2857 thread firefox:cs0 pid 2909)
10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000800103b00000 from client 18
10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00000000
10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          Faulty UTCL2 client ID: VMC (0x0)
10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          MORE_FAULTS: 0x0
10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          WALKER_ERROR: 0x0
10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          PERMISSION_FAULTS: 0x0
10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          MAPPING_ERROR: 0x0
10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          RW: 0x0
10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:8 vmid:1 pasid:32777)
10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:  in process RDD Process pid 2857 thread firefox:cs0 pid 2909)
10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000800103b01000 from client 18
10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00000000
10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          Faulty UTCL2 client ID: VMC (0x0)
10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          MORE_FAULTS: 0x0
10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          WALKER_ERROR: 0x0
10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          PERMISSION_FAULTS: 0x0
10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          MAPPING_ERROR: 0x0
10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          RW: 0x0
10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:8 vmid:1 pasid:32777)
10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:  in process RDD Process pid 2857 thread firefox:cs0 pid 2909)
10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000800103be6000 from client 18
10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00000000
10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          Faulty UTCL2 client ID: VMC (0x0)
10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          MORE_FAULTS: 0x0
10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          WALKER_ERROR: 0x0
10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          PERMISSION_FAULTS: 0x0
10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          MAPPING_ERROR: 0x0
10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          RW: 0x0
10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:8 vmid:1 pasid:32777)
10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:  in process RDD Process pid 2857 thread firefox:cs0 pid 2909)
10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000800103b01000 from client 18
10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00000000
10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          Faulty UTCL2 client ID: VMC (0x0)
10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          MORE_FAULTS: 0x0
10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          WALKER_ERROR: 0x0
10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          PERMISSION_FAULTS: 0x0
10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          MAPPING_ERROR: 0x0
10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          RW: 0x0
10:13:15 kernel: gmc_v11_0_process_interrupt: 236783 callbacks suppressed
10:13:15 kernel: amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:8 vmid:1 pasid:32777)
10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:  in process RDD Process pid 2857 thread firefox:cs0 pid 2909)
10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000800103b05000 from client 18
10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00103A11
10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          Faulty UTCL2 client ID: unknown (0x1d)
10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          MORE_FAULTS: 0x1
10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          WALKER_ERROR: 0x0
10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          PERMISSION_FAULTS: 0x1
10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          MAPPING_ERROR: 0x0
10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          RW: 0x0
10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:8 vmid:1 pasid:32777)
10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:  in process RDD Process pid 2857 thread firefox:cs0 pid 2909)
10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000800103b00000 from client 18
10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00000000
10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          Faulty UTCL2 client ID: VMC (0x0)
10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          MORE_FAULTS: 0x0
10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          WALKER_ERROR: 0x0
10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          PERMISSION_FAULTS: 0x0
10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          MAPPING_ERROR: 0x0
10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          RW: 0x0
10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:8 vmid:1 pasid:32777)
10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:  in process RDD Process pid 2857 thread firefox:cs0 pid 2909)
10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000800103be6000 from client 18
10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00000000
10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          Faulty UTCL2 client ID: VMC (0x0)
10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          MORE_FAULTS: 0x0
10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          WALKER_ERROR: 0x0
10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          PERMISSION_FAULTS: 0x0
10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          MAPPING_ERROR: 0x0
10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          RW: 0x0
10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:8 vmid:1 pasid:32777)
10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:  in process RDD Process pid 2857 thread firefox:cs0 pid 2909)
10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000800103be8000 from client 18
10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00000000
10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          Faulty UTCL2 client ID: VMC (0x0)
10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          MORE_FAULTS: 0x0
10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          WALKER_ERROR: 0x0
10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          PERMISSION_FAULTS: 0x0
10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          MAPPING_ERROR: 0x0
10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          RW: 0x0
10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:8 vmid:1 pasid:32777)
10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:  in process RDD Process pid 2857 thread firefox:cs0 pid 2909)
10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000800103b05000 from client 18
10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00000000
10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          Faulty UTCL2 client ID: VMC (0x0)
10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          MORE_FAULTS: 0x0
10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          WALKER_ERROR: 0x0
10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          PERMISSION_FAULTS: 0x0
10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          MAPPING_ERROR: 0x0
10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          RW: 0x0
10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:8 vmid:1 pasid:32777)
10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:  in process RDD Process pid 2857 thread firefox:cs0 pid 2909)
10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000800103be6000 from client 18
10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00000000
10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          Faulty UTCL2 client ID: VMC (0x0)
10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          MORE_FAULTS: 0x0
10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          WALKER_ERROR: 0x0
10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          PERMISSION_FAULTS: 0x0
10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          MAPPING_ERROR: 0x0
10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          RW: 0x0
10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:8 vmid:1 pasid:32777)
10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:  in process RDD Process pid 2857 thread firefox:cs0 pid 2909)
10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000800103b01000 from client 18
10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00000000
10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          Faulty UTCL2 client ID: VMC (0x0)
10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          MORE_FAULTS: 0x0
10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          WALKER_ERROR: 0x0
10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          PERMISSION_FAULTS: 0x0
10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          MAPPING_ERROR: 0x0
10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          RW: 0x0
10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:8 vmid:1 pasid:32777)
10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:  in process RDD Process pid 2857 thread firefox:cs0 pid 2909)
10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000800103b00000 from client 18
10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00000000
10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          Faulty UTCL2 client ID: VMC (0x0)
10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          MORE_FAULTS: 0x0
10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          WALKER_ERROR: 0x0
10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          PERMISSION_FAULTS: 0x0
10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          MAPPING_ERROR: 0x0
10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          RW: 0x0
10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:8 vmid:1 pasid:32777)
10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:  in process RDD Process pid 2857 thread firefox:cs0 pid 2909)
10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000800103be6000 from client 18
10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00000000
10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          Faulty UTCL2 client ID: VMC (0x0)
10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          MORE_FAULTS: 0x0
10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          WALKER_ERROR: 0x0
10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          PERMISSION_FAULTS: 0x0
10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          MAPPING_ERROR: 0x0
10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          RW: 0x0
10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:8 vmid:1 pasid:32777)
10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:  in process RDD Process pid 2857 thread firefox:cs0 pid 2909)
10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000800103b01000 from client 18
10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00000000
10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          Faulty UTCL2 client ID: VMC (0x0)
10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          MORE_FAULTS: 0x0
10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          WALKER_ERROR: 0x0
10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          PERMISSION_FAULTS: 0x0
10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          MAPPING_ERROR: 0x0
10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          RW: 0x0
10:13:16 kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring vcn_unified_0 timeout, signaled seq=5197, emitted seq=5200
10:13:16 kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process RDD Process pid 2857 thread firefox:cs0 pid 2909
10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu: GPU reset begin!
10:13:16 kernel: [drm] Register(0) [regUVD_POWER_STATUS] failed to reach value 0x00000001 != 0x00000002n
10:13:17 kernel: [drm] Register(0) [regUVD_RB_RPTR] failed to reach value 0x00000340 != 0x000002c0n
10:13:17 kernel: [drm] Register(0) [regUVD_POWER_STATUS] failed to reach value 0x00000001 != 0x00000002n
10:13:17 kernel: amdgpu 0000:03:00.0: amdgpu: MODE2 reset
10:13:17 kernel: amdgpu 0000:03:00.0: amdgpu: GPU reset succeeded, trying to resume
10:13:17 kernel: [drm] PCIE GART of 512M enabled (table at 0x000000801FD00000).
10:13:17 kernel: [drm] VRAM is lost due to GPU reset!
10:13:17 kernel: amdgpu 0000:03:00.0: amdgpu: SMU is resuming...
10:13:17 kernel: amdgpu 0000:03:00.0: amdgpu: SMU is resumed successfully!
10:13:17 kernel: [drm] DMUB hardware initialized: version=0x08003A00
10:13:17 kernel: [drm] kiq ring mec 3 pipe 1 q 0
10:13:17 kernel: amdgpu 0000:03:00.0: [drm:jpeg_v4_0_hw_init [amdgpu]] JPEG decode initialized successfully.
10:13:17 kernel: amdgpu 0000:03:00.0: amdgpu: ring gfx_0.0.0 uses VM inv eng 0 on hub 0
10:13:17 kernel: amdgpu 0000:03:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 1 on hub 0
10:13:17 kernel: amdgpu 0000:03:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 4 on hub 0
10:13:17 kernel: amdgpu 0000:03:00.0: amdgpu: ring comp_1.2.0 uses VM inv eng 6 on hub 0
10:13:17 kernel: amdgpu 0000:03:00.0: amdgpu: ring comp_1.3.0 uses VM inv eng 7 on hub 0
10:13:17 kernel: amdgpu 0000:03:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 8 on hub 0
10:13:17 kernel: amdgpu 0000:03:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 9 on hub 0
10:13:17 kernel: amdgpu 0000:03:00.0: amdgpu: ring comp_1.2.1 uses VM inv eng 10 on hub 0
10:13:17 kernel: amdgpu 0000:03:00.0: amdgpu: ring comp_1.3.1 uses VM inv eng 11 on hub 0
10:13:17 kernel: amdgpu 0000:03:00.0: amdgpu: ring sdma0 uses VM inv eng 12 on hub 0
10:13:17 kernel: amdgpu 0000:03:00.0: amdgpu: ring vcn_unified_0 uses VM inv eng 0 on hub 8
10:13:17 kernel: amdgpu 0000:03:00.0: amdgpu: ring jpeg_dec uses VM inv eng 1 on hub 8
10:13:17 kernel: amdgpu 0000:03:00.0: amdgpu: ring mes_kiq_3.1.0 uses VM inv eng 13 on hub 0
10:13:17 kernel: amdgpu 0000:03:00.0: amdgpu: recover vram bo from shadow start
10:13:17 kernel: amdgpu 0000:03:00.0: amdgpu: recover vram bo from shadow done
10:13:17 kernel: amdgpu 0000:03:00.0: amdgpu: GPU reset(1) succeeded!
10:13:17 kernel: [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
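
For readers comparing the raw MMVM_L2_PROTECTION_FAULT_STATUS words with the decoded lines under them, here is a
minimal Python sketch of the bit-field split. The field offsets and widths are an assumption, not taken from the
register headers; they were chosen because they reproduce the values the kernel printed above for 0x00103A11
(MORE_FAULTS 0x1, PERMISSION_FAULTS 0x1, client ID 0x1d, RW 0x0). The function name is hypothetical.

```python
# Assumed layout: name -> (lowest bit, width in bits). This is inferred from
# the decoded values in the log above, not copied from the amdgpu headers.
FIELDS = {
    "MORE_FAULTS": (0, 1),
    "WALKER_ERROR": (1, 3),
    "PERMISSION_FAULTS": (4, 4),
    "MAPPING_ERROR": (8, 1),
    "CID": (9, 9),
    "RW": (18, 1),
}


def decode_fault_status(status: int) -> dict:
    """Split a raw fault status word into named fields (assumed layout)."""
    return {
        name: (status >> shift) & ((1 << width) - 1)
        for name, (shift, width) in FIELDS.items()
    }


if __name__ == "__main__":
    # The two raw values that occur in the log excerpt.
    for raw in (0x00103A11, 0x00000000):
        fields = decode_fault_status(raw)
        print(f"{raw:#010x}", {k: hex(v) for k, v in fields.items()})
```

Under this assumed layout, only the entries with status 0x00103A11 carry fault details; the entries with status
0x00000000 decode to all-zero fields, matching the "VMC (0x0)" lines in the log.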

Firefox, the program playing the video, then crashes. This can happen at any moment during playback. The
problem is not observed on Linux 6.9; it appeared only after upgrading to 6.10-rc1.
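
The page faults in the log excerpt repeat over a small set of addresses (five distinct pages in the lines quoted
above). A minimal Python sketch for tallying them from a saved log follows. The function name and the three-line
sample are illustrative only; the regular expression assumes the "in page starting at address ... from client ..."
wording shown in the log.

```python
import re
from collections import Counter

# Matches the address line that amdgpu prints for each page fault.
ADDR_RE = re.compile(
    r"in page starting at address (0x[0-9a-fA-F]+) from client (\d+)"
)


def count_fault_pages(log_text: str) -> Counter:
    """Count how often each faulting page address appears in the log text."""
    return Counter(m.group(1) for m in ADDR_RE.finditer(log_text))


# Three lines copied from the log above, used as a small self-test input.
SAMPLE = """\
10:13:05 kernel: amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000800103b05000 from client 18
10:13:05 kernel: amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000800103b00000 from client 18
10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000800103b05000 from client 18
"""

if __name__ == "__main__":
    for addr, hits in count_fault_pages(SAMPLE).most_common():
        print(addr, hits)
```

To run it on a full log, read the saved journal output into a string and pass that to count_fault_pages() in place
of SAMPLE.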

I'm new to amdgpu. I've looked through the source code but couldn't find the call chain leading to the error-reporting
code. I've also gone through the drm-next commit logs, but as far as I can tell 6.10 introduces too many changes for
me to bisect them all. I'm happy to provide a kdump or a process dump on request. Please also let me know how I can
provide more information.

My system information: Ryzen 7 7840HS, 512MB dedicated VRAM configured, Mesa 24.0.8, kernel 6.10-rc1.

Hoping to hear from you soon.

-- 
Sincerely yours,
WANG Yunchen
Senior, UM-SJTU Joint Institute

