[bug report] drm/amdgpu: amdgpu crash on playing videos, linux 6.10-rc

Christian König christian.koenig at amd.com
Wed May 29 07:30:17 UTC 2024


Hi,

if the issue is easy to reproduce, I suggest bisecting the changes 
between 6.9 and 6.10-rc1.

On the other hand, it's not unlikely that this is a known bug in -rc1 
which will be fixed by -rc2.

Anyway, I've added Leo to the mail thread since he is the one responsible for 
the video decoding engines.

Regards,
Christian.

On 29.05.24 at 06:05, Wang Yunchen wrote:
> Hello,
>
> After upgrading to Linux 6.10-rc1 (Mesa left untouched) I found a strange bug that can cause the GPU to
> crash and reset while playing videos online with VA-API. The screen first starts to jitter, then flickers once or
> twice, and the desktop session cannot be brought back. After a reboot I found the following messages in the system logs:
>
> 10:13:05 kernel: gmc_v11_0_process_interrupt: 52 callbacks suppressed
> 10:13:05 kernel: amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:8 vmid:1 pasid:32777)
> 10:13:05 kernel: amdgpu 0000:03:00.0: amdgpu:  in process RDD Process pid 2857 thread firefox:cs0 pid 2909)
> 10:13:05 kernel: amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000800103b05000 from client 18
> 10:13:05 kernel: amdgpu 0000:03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00103A11
> 10:13:05 kernel: amdgpu 0000:03:00.0: amdgpu:          Faulty UTCL2 client ID: unknown (0x1d)
> 10:13:05 kernel: amdgpu 0000:03:00.0: amdgpu:          MORE_FAULTS: 0x1
> 10:13:05 kernel: amdgpu 0000:03:00.0: amdgpu:          WALKER_ERROR: 0x0
> 10:13:05 kernel: amdgpu 0000:03:00.0: amdgpu:          PERMISSION_FAULTS: 0x1
> 10:13:05 kernel: amdgpu 0000:03:00.0: amdgpu:          MAPPING_ERROR: 0x0
> 10:13:05 kernel: amdgpu 0000:03:00.0: amdgpu:          RW: 0x0
> 10:13:05 kernel: amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:8 vmid:1 pasid:32777)
> 10:13:05 kernel: amdgpu 0000:03:00.0: amdgpu:  in process RDD Process pid 2857 thread firefox:cs0 pid 2909)
> 10:13:05 kernel: amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000800103b00000 from client 18
> 10:13:05 kernel: amdgpu 0000:03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00000000
> 10:13:05 kernel: amdgpu 0000:03:00.0: amdgpu:          Faulty UTCL2 client ID: VMC (0x0)
> 10:13:05 kernel: amdgpu 0000:03:00.0: amdgpu:          MORE_FAULTS: 0x0
> 10:13:05 kernel: amdgpu 0000:03:00.0: amdgpu:          WALKER_ERROR: 0x0
> 10:13:05 kernel: amdgpu 0000:03:00.0: amdgpu:          PERMISSION_FAULTS: 0x0
> 10:13:05 kernel: amdgpu 0000:03:00.0: amdgpu:          MAPPING_ERROR: 0x0
> 10:13:05 kernel: amdgpu 0000:03:00.0: amdgpu:          RW: 0x0
> 10:13:05 kernel: amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:8 vmid:1 pasid:32777)
> 10:13:05 kernel: amdgpu 0000:03:00.0: amdgpu:  in process RDD Process pid 2857 thread firefox:cs0 pid 2909)
> 10:13:05 kernel: amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000800103be8000 from client 18
> 10:13:05 kernel: amdgpu 0000:03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00000000
> 10:13:05 kernel: amdgpu 0000:03:00.0: amdgpu:          Faulty UTCL2 client ID: VMC (0x0)
> 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          MORE_FAULTS: 0x0
> 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          WALKER_ERROR: 0x0
> 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          PERMISSION_FAULTS: 0x0
> 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          MAPPING_ERROR: 0x0
> 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          RW: 0x0
> 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:8 vmid:1 pasid:32777)
> 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:  in process RDD Process pid 2857 thread firefox:cs0 pid 2909)
> 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000800103be6000 from client 18
> 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00000000
> 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          Faulty UTCL2 client ID: VMC (0x0)
> 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          MORE_FAULTS: 0x0
> 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          WALKER_ERROR: 0x0
> 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          PERMISSION_FAULTS: 0x0
> 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          MAPPING_ERROR: 0x0
> 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          RW: 0x0
> 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:8 vmid:1 pasid:32777)
> 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:  in process RDD Process pid 2857 thread firefox:cs0 pid 2909)
> 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000800103be6000 from client 18
> 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00103A11
> 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          Faulty UTCL2 client ID: unknown (0x1d)
> 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          MORE_FAULTS: 0x1
> 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          WALKER_ERROR: 0x0
> 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          PERMISSION_FAULTS: 0x1
> 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          MAPPING_ERROR: 0x0
> 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          RW: 0x0
> 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:8 vmid:1 pasid:32777)
> 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:  in process RDD Process pid 2857 thread firefox:cs0 pid 2909)
> 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000800103be8000 from client 18
> 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00103A11
> 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          Faulty UTCL2 client ID: unknown (0x1d)
> 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          MORE_FAULTS: 0x1
> 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          WALKER_ERROR: 0x0
> 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          PERMISSION_FAULTS: 0x1
> 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          MAPPING_ERROR: 0x0
> 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          RW: 0x0
> 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:8 vmid:1 pasid:32777)
> 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:  in process RDD Process pid 2857 thread firefox:cs0 pid 2909)
> 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000800103b05000 from client 18
> 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00103A11
> 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          Faulty UTCL2 client ID: unknown (0x1d)
> 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          MORE_FAULTS: 0x1
> 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          WALKER_ERROR: 0x0
> 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          PERMISSION_FAULTS: 0x1
> 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          MAPPING_ERROR: 0x0
> 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          RW: 0x0
> 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:8 vmid:1 pasid:32777)
> 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:  in process RDD Process pid 2857 thread firefox:cs0 pid 2909)
> 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000800103b00000 from client 18
> 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00000000
> 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          Faulty UTCL2 client ID: VMC (0x0)
> 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          MORE_FAULTS: 0x0
> 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          WALKER_ERROR: 0x0
> 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          PERMISSION_FAULTS: 0x0
> 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          MAPPING_ERROR: 0x0
> 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          RW: 0x0
> 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:8 vmid:1 pasid:32777)
> 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:  in process RDD Process pid 2857 thread firefox:cs0 pid 2909)
> 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000800103be6000 from client 18
> 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00000000
> 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          Faulty UTCL2 client ID: VMC (0x0)
> 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          MORE_FAULTS: 0x0
> 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          WALKER_ERROR: 0x0
> 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          PERMISSION_FAULTS: 0x0
> 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          MAPPING_ERROR: 0x0
> 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          RW: 0x0
> 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:8 vmid:1 pasid:32777)
> 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:  in process RDD Process pid 2857 thread firefox:cs0 pid 2909)
> 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000800103b01000 from client 18
> 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00000000
> 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          Faulty UTCL2 client ID: VMC (0x0)
> 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          MORE_FAULTS: 0x0
> 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          WALKER_ERROR: 0x0
> 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          PERMISSION_FAULTS: 0x0
> 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          MAPPING_ERROR: 0x0
> 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          RW: 0x0
> 10:13:10 kernel: gmc_v11_0_process_interrupt: 222971 callbacks suppressed
> 10:13:10 kernel: amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:8 vmid:1 pasid:32777)
> 10:13:10 kernel: amdgpu 0000:03:00.0: amdgpu:  in process RDD Process pid 2857 thread firefox:cs0 pid 2909)
> 10:13:10 kernel: amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000800103b05000 from client 18
> 10:13:10 kernel: amdgpu 0000:03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00103A11
> 10:13:10 kernel: amdgpu 0000:03:00.0: amdgpu:          Faulty UTCL2 client ID: unknown (0x1d)
> 10:13:10 kernel: amdgpu 0000:03:00.0: amdgpu:          MORE_FAULTS: 0x1
> 10:13:10 kernel: amdgpu 0000:03:00.0: amdgpu:          WALKER_ERROR: 0x0
> 10:13:10 kernel: amdgpu 0000:03:00.0: amdgpu:          PERMISSION_FAULTS: 0x1
> 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          MAPPING_ERROR: 0x0
> 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          RW: 0x0
> 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:8 vmid:1 pasid:32777)
> 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:  in process RDD Process pid 2857 thread firefox:cs0 pid 2909)
> 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000800103b00000 from client 18
> 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00103A11
> 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          Faulty UTCL2 client ID: unknown (0x1d)
> 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          MORE_FAULTS: 0x1
> 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          WALKER_ERROR: 0x0
> 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          PERMISSION_FAULTS: 0x1
> 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          MAPPING_ERROR: 0x0
> 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          RW: 0x0
> 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:8 vmid:1 pasid:32777)
> 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:  in process RDD Process pid 2857 thread firefox:cs0 pid 2909)
> 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000800103be6000 from client 18
> 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00000000
> 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          Faulty UTCL2 client ID: VMC (0x0)
> 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          MORE_FAULTS: 0x0
> 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          WALKER_ERROR: 0x0
> 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          PERMISSION_FAULTS: 0x0
> 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          MAPPING_ERROR: 0x0
> 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          RW: 0x0
> 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:8 vmid:1 pasid:32777)
> 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:  in process RDD Process pid 2857 thread firefox:cs0 pid 2909)
> 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000800103be8000 from client 18
> 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00000000
> 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          Faulty UTCL2 client ID: VMC (0x0)
> 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          MORE_FAULTS: 0x0
> 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          WALKER_ERROR: 0x0
> 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          PERMISSION_FAULTS: 0x0
> 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          MAPPING_ERROR: 0x0
> 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          RW: 0x0
> 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:8 vmid:1 pasid:32777)
> 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:  in process RDD Process pid 2857 thread firefox:cs0 pid 2909)
> 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000800103b05000 from client 18
> 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00000000
> 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          Faulty UTCL2 client ID: VMC (0x0)
> 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          MORE_FAULTS: 0x0
> 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          WALKER_ERROR: 0x0
> 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          PERMISSION_FAULTS: 0x0
> 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          MAPPING_ERROR: 0x0
> 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          RW: 0x0
> 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:8 vmid:1 pasid:32777)
> 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:  in process RDD Process pid 2857 thread firefox:cs0 pid 2909)
> 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000800103be6000 from client 18
> 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00000000
> 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          Faulty UTCL2 client ID: VMC (0x0)
> 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          MORE_FAULTS: 0x0
> 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          WALKER_ERROR: 0x0
> 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          PERMISSION_FAULTS: 0x0
> 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          MAPPING_ERROR: 0x0
> 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          RW: 0x0
> 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:8 vmid:1 pasid:32777)
> 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:  in process RDD Process pid 2857 thread firefox:cs0 pid 2909)
> 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000800103b00000 from client 18
> 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00000000
> 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          Faulty UTCL2 client ID: VMC (0x0)
> 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          MORE_FAULTS: 0x0
> 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          WALKER_ERROR: 0x0
> 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          PERMISSION_FAULTS: 0x0
> 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          MAPPING_ERROR: 0x0
> 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          RW: 0x0
> 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:8 vmid:1 pasid:32777)
> 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:  in process RDD Process pid 2857 thread firefox:cs0 pid 2909)
> 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000800103b01000 from client 18
> 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00000000
> 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          Faulty UTCL2 client ID: VMC (0x0)
> 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          MORE_FAULTS: 0x0
> 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          WALKER_ERROR: 0x0
> 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          PERMISSION_FAULTS: 0x0
> 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          MAPPING_ERROR: 0x0
> 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          RW: 0x0
> 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:8 vmid:1 pasid:32777)
> 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:  in process RDD Process pid 2857 thread firefox:cs0 pid 2909)
> 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000800103be6000 from client 18
> 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00000000
> 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          Faulty UTCL2 client ID: VMC (0x0)
> 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          MORE_FAULTS: 0x0
> 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          WALKER_ERROR: 0x0
> 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          PERMISSION_FAULTS: 0x0
> 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          MAPPING_ERROR: 0x0
> 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          RW: 0x0
> 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:8 vmid:1 pasid:32777)
> 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:  in process RDD Process pid 2857 thread firefox:cs0 pid 2909)
> 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000800103b01000 from client 18
> 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00000000
> 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          Faulty UTCL2 client ID: VMC (0x0)
> 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          MORE_FAULTS: 0x0
> 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          WALKER_ERROR: 0x0
> 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          PERMISSION_FAULTS: 0x0
> 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          MAPPING_ERROR: 0x0
> 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          RW: 0x0
> 10:13:15 kernel: gmc_v11_0_process_interrupt: 236783 callbacks suppressed
> 10:13:15 kernel: amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:8 vmid:1 pasid:32777)
> 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:  in process RDD Process pid 2857 thread firefox:cs0 pid 2909)
> 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000800103b05000 from client 18
> 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00103A11
> 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          Faulty UTCL2 client ID: unknown (0x1d)
> 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          MORE_FAULTS: 0x1
> 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          WALKER_ERROR: 0x0
> 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          PERMISSION_FAULTS: 0x1
> 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          MAPPING_ERROR: 0x0
> 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          RW: 0x0
> 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:8 vmid:1 pasid:32777)
> 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:  in process RDD Process pid 2857 thread firefox:cs0 pid 2909)
> 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000800103b00000 from client 18
> 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00000000
> 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          Faulty UTCL2 client ID: VMC (0x0)
> 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          MORE_FAULTS: 0x0
> 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          WALKER_ERROR: 0x0
> 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          PERMISSION_FAULTS: 0x0
> 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          MAPPING_ERROR: 0x0
> 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          RW: 0x0
> 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:8 vmid:1 pasid:32777)
> 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:  in process RDD Process pid 2857 thread firefox:cs0 pid 2909)
> 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000800103be6000 from client 18
> 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00000000
> 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          Faulty UTCL2 client ID: VMC (0x0)
> 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          MORE_FAULTS: 0x0
> 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          WALKER_ERROR: 0x0
> 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          PERMISSION_FAULTS: 0x0
> 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          MAPPING_ERROR: 0x0
> 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          RW: 0x0
> 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:8 vmid:1 pasid:32777)
> 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:  in process RDD Process pid 2857 thread firefox:cs0 pid 2909)
> 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000800103be8000 from client 18
> 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00000000
> 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          Faulty UTCL2 client ID: VMC (0x0)
> 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          MORE_FAULTS: 0x0
> 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          WALKER_ERROR: 0x0
> 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          PERMISSION_FAULTS: 0x0
> 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          MAPPING_ERROR: 0x0
> 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          RW: 0x0
> 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:8 vmid:1 pasid:32777)
> 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:  in process RDD Process pid 2857 thread firefox:cs0 pid 2909)
> 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000800103b05000 from client 18
> 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00000000
> 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          Faulty UTCL2 client ID: VMC (0x0)
> 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          MORE_FAULTS: 0x0
> 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          WALKER_ERROR: 0x0
> 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          PERMISSION_FAULTS: 0x0
> 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          MAPPING_ERROR: 0x0
> 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          RW: 0x0
> 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:8 vmid:1 pasid:32777)
> 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:  in process RDD Process pid 2857 thread firefox:cs0 pid 2909)
> 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000800103be6000 from client 18
> 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00000000
> 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          Faulty UTCL2 client ID: VMC (0x0)
> 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          MORE_FAULTS: 0x0
> 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          WALKER_ERROR: 0x0
> 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          PERMISSION_FAULTS: 0x0
> 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          MAPPING_ERROR: 0x0
> 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          RW: 0x0
> 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:8 vmid:1 pasid:32777)
> 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:  in process RDD Process pid 2857 thread firefox:cs0 pid 2909)
> 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000800103b01000 from client 18
> 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00000000
> 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          Faulty UTCL2 client ID: VMC (0x0)
> 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          MORE_FAULTS: 0x0
> 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          WALKER_ERROR: 0x0
> 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          PERMISSION_FAULTS: 0x0
> 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          MAPPING_ERROR: 0x0
> 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          RW: 0x0
> 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:8 vmid:1 pasid:32777)
> 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:  in process RDD Process pid 2857 thread firefox:cs0 pid 2909)
> 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000800103b00000 from client 18
> 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00000000
> 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          Faulty UTCL2 client ID: VMC (0x0)
> 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          MORE_FAULTS: 0x0
> 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          WALKER_ERROR: 0x0
> 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          PERMISSION_FAULTS: 0x0
> 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          MAPPING_ERROR: 0x0
> 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          RW: 0x0
> 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:8 vmid:1 pasid:32777)
> 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:  in process RDD Process pid 2857 thread firefox:cs0 pid 2909)
> 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000800103be6000 from client 18
> 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00000000
> 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          Faulty UTCL2 client ID: VMC (0x0)
> 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          MORE_FAULTS: 0x0
> 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          WALKER_ERROR: 0x0
> 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          PERMISSION_FAULTS: 0x0
> 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          MAPPING_ERROR: 0x0
> 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          RW: 0x0
> 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:8 vmid:1 pasid:32777)
> 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:  in process RDD Process pid 2857 thread firefox:cs0 pid 2909)
> 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000800103b01000 from client 18
> 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00000000
> 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          Faulty UTCL2 client ID: VMC (0x0)
> 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          MORE_FAULTS: 0x0
> 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          WALKER_ERROR: 0x0
> 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          PERMISSION_FAULTS: 0x0
> 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          MAPPING_ERROR: 0x0
> 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          RW: 0x0
> 10:13:16 kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring vcn_unified_0 timeout, signaled seq=5197, emitted seq=5200
> 10:13:16 kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process RDD Process pid 2857 thread firefox:cs0 pid 2909
> 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu: GPU reset begin!
> 10:13:16 kernel: [drm] Register(0) [regUVD_POWER_STATUS] failed to reach value 0x00000001 != 0x00000002n
> 10:13:17 kernel: [drm] Register(0) [regUVD_RB_RPTR] failed to reach value 0x00000340 != 0x000002c0n
> 10:13:17 kernel: [drm] Register(0) [regUVD_POWER_STATUS] failed to reach value 0x00000001 != 0x00000002n
> 10:13:17 kernel: amdgpu 0000:03:00.0: amdgpu: MODE2 reset
> 10:13:17 kernel: amdgpu 0000:03:00.0: amdgpu: GPU reset succeeded, trying to resume
> 10:13:17 kernel: [drm] PCIE GART of 512M enabled (table at 0x000000801FD00000).
> 10:13:17 kernel: [drm] VRAM is lost due to GPU reset!
> 10:13:17 kernel: amdgpu 0000:03:00.0: amdgpu: SMU is resuming...
> 10:13:17 kernel: amdgpu 0000:03:00.0: amdgpu: SMU is resumed successfully!
> 10:13:17 kernel: [drm] DMUB hardware initialized: version=0x08003A00
> 10:13:17 kernel: [drm] kiq ring mec 3 pipe 1 q 0
> 10:13:17 kernel: amdgpu 0000:03:00.0: [drm:jpeg_v4_0_hw_init [amdgpu]] JPEG decode initialized successfully.
> 10:13:17 kernel: amdgpu 0000:03:00.0: amdgpu: ring gfx_0.0.0 uses VM inv eng 0 on hub 0
> 10:13:17 kernel: amdgpu 0000:03:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 1 on hub 0
> 10:13:17 kernel: amdgpu 0000:03:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 4 on hub 0
> 10:13:17 kernel: amdgpu 0000:03:00.0: amdgpu: ring comp_1.2.0 uses VM inv eng 6 on hub 0
> 10:13:17 kernel: amdgpu 0000:03:00.0: amdgpu: ring comp_1.3.0 uses VM inv eng 7 on hub 0
> 10:13:17 kernel: amdgpu 0000:03:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 8 on hub 0
> 10:13:17 kernel: amdgpu 0000:03:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 9 on hub 0
> 10:13:17 kernel: amdgpu 0000:03:00.0: amdgpu: ring comp_1.2.1 uses VM inv eng 10 on hub 0
> 10:13:17 kernel: amdgpu 0000:03:00.0: amdgpu: ring comp_1.3.1 uses VM inv eng 11 on hub 0
> 10:13:17 kernel: amdgpu 0000:03:00.0: amdgpu: ring sdma0 uses VM inv eng 12 on hub 0
> 10:13:17 kernel: amdgpu 0000:03:00.0: amdgpu: ring vcn_unified_0 uses VM inv eng 0 on hub 8
> 10:13:17 kernel: amdgpu 0000:03:00.0: amdgpu: ring jpeg_dec uses VM inv eng 1 on hub 8
> 10:13:17 kernel: amdgpu 0000:03:00.0: amdgpu: ring mes_kiq_3.1.0 uses VM inv eng 13 on hub 0
> 10:13:17 kernel: amdgpu 0000:03:00.0: amdgpu: recover vram bo from shadow start
> 10:13:17 kernel: amdgpu 0000:03:00.0: amdgpu: recover vram bo from shadow done
> 10:13:17 kernel: amdgpu 0000:03:00.0: amdgpu: GPU reset(1) succeeded!
> 10:13:17 kernel: [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
>
> The program playing the video (Firefox) then crashes. This can happen at any moment while playing videos. The
> problem is not observed in Linux 6.9; it appeared only after upgrading to 6.10-rc1.
>
> I'm new to amdgpu. I've looked into the source code but couldn't find a call chain for the error reporting code. I've
> also gone through the drm-next commit logs, and from my understanding there are many changes in 6.10 and I
> couldn't bisect them all. However, I'm happy to provide a kdump or a process dump on request. Please
> also let me know how I can provide more information.
>
> My system information: Ryzen 7840 HS, 512MB dedicated VRAM configured, Mesa 24.0.8, kernel 6.10-rc1.
>
> Hoping to hear from you soon.
>
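
For anyone reading the quoted log, the decoded lines under each
MMVM_L2_PROTECTION_FAULT_STATUS value can be reproduced from the raw
register value with a few shifts and masks. The sketch below is only an
illustration: the bit positions are an assumption inferred from the
decoded fields printed in the log above, not copied from the amdgpu
register headers, and the names FIELDS and decode_fault_status are
hypothetical helpers.

```python
# Assumed bit layout (shift, width) for the fields printed in the log.
# Inferred from the logged decode of 0x00103A11; not taken from the
# driver's register headers.
FIELDS = {
    "MORE_FAULTS": (0, 1),
    "WALKER_ERROR": (1, 3),
    "PERMISSION_FAULTS": (4, 4),
    "MAPPING_ERROR": (8, 1),
    "CID": (9, 9),  # printed as "Faulty UTCL2 client ID" in the log
    "RW": (18, 1),
}


def decode_fault_status(status):
    """Split a raw fault status value into the fields named above."""
    return {
        name: (status >> shift) & ((1 << width) - 1)
        for name, (shift, width) in FIELDS.items()
    }


if __name__ == "__main__":
    # The two distinct status values that appear in the quoted log.
    for raw in (0x00103A11, 0x00000000):
        print(f"status {raw:#010x}")
        for name, value in decode_fault_status(raw).items():
            print(f"    {name}: {value:#x}")
```

Under this assumed layout, 0x00103A11 gives client ID 0x1d with
MORE_FAULTS and PERMISSION_FAULTS set, which agrees with the decoded
lines in the log. The entries with status 0x00000000 decode to all
zeros.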


