[bug report] drm/amdgpu: amdgpu crash on playing videos, linux 6.10-rc

Alex Deucher alexdeucher at gmail.com
Wed May 29 13:48:19 UTC 2024


Did you also update Mesa?  There could be a UMD change that causes the
page faults.

Alex

On Wed, May 29, 2024 at 3:37 AM Christian König
<christian.koenig at amd.com> wrote:
>
> Hi,
>
> if the issue is easy to reproduce, I suggest bisecting the changes
> between 6.9 and 6.10-rc1.
>
> On the other hand it's not unlikely that we have a known bug in -rc1
> which will be fixed by -rc2.
>
> Anyway, I've added Leo to the mail thread since he is the one responsible for
> the video decoding engines.
>
> Regards,
> Christian.
>
> On 29.05.24 at 06:05, Wang Yunchen wrote:
> > Hello,
> >
> > After upgrading to Linux 6.10-rc1 (Mesa was left untouched), I hit a strange bug that causes the GPU to
> > crash and reset while playing videos online with VA-API. The screen first starts to jitter, then flickers once or
> > twice, and the desktop session cannot be brought back. After a reboot I found the following messages in the system logs:
> >
> > 10:13:05 kernel: gmc_v11_0_process_interrupt: 52 callbacks suppressed
> > 10:13:05 kernel: amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:8 vmid:1 pasid:32777)
> > 10:13:05 kernel: amdgpu 0000:03:00.0: amdgpu:  in process RDD Process pid 2857 thread firefox:cs0 pid 2909)
> > 10:13:05 kernel: amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000800103b05000 from client 18
> > 10:13:05 kernel: amdgpu 0000:03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00103A11
> > 10:13:05 kernel: amdgpu 0000:03:00.0: amdgpu:          Faulty UTCL2 client ID: unknown (0x1d)
> > 10:13:05 kernel: amdgpu 0000:03:00.0: amdgpu:          MORE_FAULTS: 0x1
> > 10:13:05 kernel: amdgpu 0000:03:00.0: amdgpu:          WALKER_ERROR: 0x0
> > 10:13:05 kernel: amdgpu 0000:03:00.0: amdgpu:          PERMISSION_FAULTS: 0x1
> > 10:13:05 kernel: amdgpu 0000:03:00.0: amdgpu:          MAPPING_ERROR: 0x0
> > 10:13:05 kernel: amdgpu 0000:03:00.0: amdgpu:          RW: 0x0
> > 10:13:05 kernel: amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:8 vmid:1 pasid:32777)
> > 10:13:05 kernel: amdgpu 0000:03:00.0: amdgpu:  in process RDD Process pid 2857 thread firefox:cs0 pid 2909)
> > 10:13:05 kernel: amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000800103b00000 from client 18
> > 10:13:05 kernel: amdgpu 0000:03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00000000
> > 10:13:05 kernel: amdgpu 0000:03:00.0: amdgpu:          Faulty UTCL2 client ID: VMC (0x0)
> > 10:13:05 kernel: amdgpu 0000:03:00.0: amdgpu:          MORE_FAULTS: 0x0
> > 10:13:05 kernel: amdgpu 0000:03:00.0: amdgpu:          WALKER_ERROR: 0x0
> > 10:13:05 kernel: amdgpu 0000:03:00.0: amdgpu:          PERMISSION_FAULTS: 0x0
> > 10:13:05 kernel: amdgpu 0000:03:00.0: amdgpu:          MAPPING_ERROR: 0x0
> > 10:13:05 kernel: amdgpu 0000:03:00.0: amdgpu:          RW: 0x0
> > 10:13:05 kernel: amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:8 vmid:1 pasid:32777)
> > 10:13:05 kernel: amdgpu 0000:03:00.0: amdgpu:  in process RDD Process pid 2857 thread firefox:cs0 pid 2909)
> > 10:13:05 kernel: amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000800103be8000 from client 18
> > 10:13:05 kernel: amdgpu 0000:03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00000000
> > 10:13:05 kernel: amdgpu 0000:03:00.0: amdgpu:          Faulty UTCL2 client ID: VMC (0x0)
> > 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          MORE_FAULTS: 0x0
> > 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          WALKER_ERROR: 0x0
> > 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          PERMISSION_FAULTS: 0x0
> > 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          MAPPING_ERROR: 0x0
> > 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          RW: 0x0
> > 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:8 vmid:1 pasid:32777)
> > 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:  in process RDD Process pid 2857 thread firefox:cs0 pid 2909)
> > 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000800103be6000 from client 18
> > 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00000000
> > 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          Faulty UTCL2 client ID: VMC (0x0)
> > 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          MORE_FAULTS: 0x0
> > 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          WALKER_ERROR: 0x0
> > 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          PERMISSION_FAULTS: 0x0
> > 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          MAPPING_ERROR: 0x0
> > 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          RW: 0x0
> > 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:8 vmid:1 pasid:32777)
> > 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:  in process RDD Process pid 2857 thread firefox:cs0 pid 2909)
> > 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000800103be6000 from client 18
> > 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00103A11
> > 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          Faulty UTCL2 client ID: unknown (0x1d)
> > 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          MORE_FAULTS: 0x1
> > 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          WALKER_ERROR: 0x0
> > 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          PERMISSION_FAULTS: 0x1
> > 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          MAPPING_ERROR: 0x0
> > 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          RW: 0x0
> > 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:8 vmid:1 pasid:32777)
> > 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:  in process RDD Process pid 2857 thread firefox:cs0 pid 2909)
> > 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000800103be8000 from client 18
> > 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00103A11
> > 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          Faulty UTCL2 client ID: unknown (0x1d)
> > 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          MORE_FAULTS: 0x1
> > 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          WALKER_ERROR: 0x0
> > 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          PERMISSION_FAULTS: 0x1
> > 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          MAPPING_ERROR: 0x0
> > 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          RW: 0x0
> > 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:8 vmid:1 pasid:32777)
> > 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:  in process RDD Process pid 2857 thread firefox:cs0 pid 2909)
> > 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000800103b05000 from client 18
> > 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00103A11
> > 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          Faulty UTCL2 client ID: unknown (0x1d)
> > 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          MORE_FAULTS: 0x1
> > 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          WALKER_ERROR: 0x0
> > 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          PERMISSION_FAULTS: 0x1
> > 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          MAPPING_ERROR: 0x0
> > 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          RW: 0x0
> > 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:8 vmid:1 pasid:32777)
> > 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:  in process RDD Process pid 2857 thread firefox:cs0 pid 2909)
> > 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000800103b00000 from client 18
> > 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00000000
> > 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          Faulty UTCL2 client ID: VMC (0x0)
> > 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          MORE_FAULTS: 0x0
> > 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          WALKER_ERROR: 0x0
> > 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          PERMISSION_FAULTS: 0x0
> > 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          MAPPING_ERROR: 0x0
> > 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          RW: 0x0
> > 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:8 vmid:1 pasid:32777)
> > 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:  in process RDD Process pid 2857 thread firefox:cs0 pid 2909)
> > 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000800103be6000 from client 18
> > 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00000000
> > 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          Faulty UTCL2 client ID: VMC (0x0)
> > 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          MORE_FAULTS: 0x0
> > 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          WALKER_ERROR: 0x0
> > 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          PERMISSION_FAULTS: 0x0
> > 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          MAPPING_ERROR: 0x0
> > 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          RW: 0x0
> > 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:8 vmid:1 pasid:32777)
> > 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:  in process RDD Process pid 2857 thread firefox:cs0 pid 2909)
> > 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000800103b01000 from client 18
> > 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00000000
> > 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          Faulty UTCL2 client ID: VMC (0x0)
> > 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          MORE_FAULTS: 0x0
> > 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          WALKER_ERROR: 0x0
> > 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          PERMISSION_FAULTS: 0x0
> > 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          MAPPING_ERROR: 0x0
> > 10:13:06 kernel: amdgpu 0000:03:00.0: amdgpu:          RW: 0x0
> > 10:13:10 kernel: gmc_v11_0_process_interrupt: 222971 callbacks suppressed
> > 10:13:10 kernel: amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:8 vmid:1 pasid:32777)
> > 10:13:10 kernel: amdgpu 0000:03:00.0: amdgpu:  in process RDD Process pid 2857 thread firefox:cs0 pid 2909)
> > 10:13:10 kernel: amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000800103b05000 from client 18
> > 10:13:10 kernel: amdgpu 0000:03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00103A11
> > 10:13:10 kernel: amdgpu 0000:03:00.0: amdgpu:          Faulty UTCL2 client ID: unknown (0x1d)
> > 10:13:10 kernel: amdgpu 0000:03:00.0: amdgpu:          MORE_FAULTS: 0x1
> > 10:13:10 kernel: amdgpu 0000:03:00.0: amdgpu:          WALKER_ERROR: 0x0
> > 10:13:10 kernel: amdgpu 0000:03:00.0: amdgpu:          PERMISSION_FAULTS: 0x1
> > 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          MAPPING_ERROR: 0x0
> > 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          RW: 0x0
> > 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:8 vmid:1 pasid:32777)
> > 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:  in process RDD Process pid 2857 thread firefox:cs0 pid 2909)
> > 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000800103b00000 from client 18
> > 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00103A11
> > 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          Faulty UTCL2 client ID: unknown (0x1d)
> > 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          MORE_FAULTS: 0x1
> > 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          WALKER_ERROR: 0x0
> > 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          PERMISSION_FAULTS: 0x1
> > 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          MAPPING_ERROR: 0x0
> > 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          RW: 0x0
> > 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:8 vmid:1 pasid:32777)
> > 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:  in process RDD Process pid 2857 thread firefox:cs0 pid 2909)
> > 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000800103be6000 from client 18
> > 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00000000
> > 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          Faulty UTCL2 client ID: VMC (0x0)
> > 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          MORE_FAULTS: 0x0
> > 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          WALKER_ERROR: 0x0
> > 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          PERMISSION_FAULTS: 0x0
> > 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          MAPPING_ERROR: 0x0
> > 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          RW: 0x0
> > 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:8 vmid:1 pasid:32777)
> > 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:  in process RDD Process pid 2857 thread firefox:cs0 pid 2909)
> > 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000800103be8000 from client 18
> > 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00000000
> > 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          Faulty UTCL2 client ID: VMC (0x0)
> > 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          MORE_FAULTS: 0x0
> > 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          WALKER_ERROR: 0x0
> > 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          PERMISSION_FAULTS: 0x0
> > 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          MAPPING_ERROR: 0x0
> > 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          RW: 0x0
> > 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:8 vmid:1 pasid:32777)
> > 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:  in process RDD Process pid 2857 thread firefox:cs0 pid 2909)
> > 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000800103b05000 from client 18
> > 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00000000
> > 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          Faulty UTCL2 client ID: VMC (0x0)
> > 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          MORE_FAULTS: 0x0
> > 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          WALKER_ERROR: 0x0
> > 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          PERMISSION_FAULTS: 0x0
> > 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          MAPPING_ERROR: 0x0
> > 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          RW: 0x0
> > 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:8 vmid:1 pasid:32777)
> > 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:  in process RDD Process pid 2857 thread firefox:cs0 pid 2909)
> > 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000800103be6000 from client 18
> > 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00000000
> > 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          Faulty UTCL2 client ID: VMC (0x0)
> > 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          MORE_FAULTS: 0x0
> > 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          WALKER_ERROR: 0x0
> > 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          PERMISSION_FAULTS: 0x0
> > 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          MAPPING_ERROR: 0x0
> > 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          RW: 0x0
> > 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:8 vmid:1 pasid:32777)
> > 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:  in process RDD Process pid 2857 thread firefox:cs0 pid 2909)
> > 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000800103b00000 from client 18
> > 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00000000
> > 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          Faulty UTCL2 client ID: VMC (0x0)
> > 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          MORE_FAULTS: 0x0
> > 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          WALKER_ERROR: 0x0
> > 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          PERMISSION_FAULTS: 0x0
> > 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          MAPPING_ERROR: 0x0
> > 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          RW: 0x0
> > 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:8 vmid:1 pasid:32777)
> > 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:  in process RDD Process pid 2857 thread firefox:cs0 pid 2909)
> > 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000800103b01000 from client 18
> > 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00000000
> > 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          Faulty UTCL2 client ID: VMC (0x0)
> > 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          MORE_FAULTS: 0x0
> > 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          WALKER_ERROR: 0x0
> > 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          PERMISSION_FAULTS: 0x0
> > 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          MAPPING_ERROR: 0x0
> > 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          RW: 0x0
> > 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:8 vmid:1 pasid:32777)
> > 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:  in process RDD Process pid 2857 thread firefox:cs0 pid 2909)
> > 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000800103be6000 from client 18
> > 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00000000
> > 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          Faulty UTCL2 client ID: VMC (0x0)
> > 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          MORE_FAULTS: 0x0
> > 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          WALKER_ERROR: 0x0
> > 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          PERMISSION_FAULTS: 0x0
> > 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          MAPPING_ERROR: 0x0
> > 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          RW: 0x0
> > 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:8 vmid:1 pasid:32777)
> > 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:  in process RDD Process pid 2857 thread firefox:cs0 pid 2909)
> > 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000800103b01000 from client 18
> > 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00000000
> > 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          Faulty UTCL2 client ID: VMC (0x0)
> > 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          MORE_FAULTS: 0x0
> > 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          WALKER_ERROR: 0x0
> > 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          PERMISSION_FAULTS: 0x0
> > 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          MAPPING_ERROR: 0x0
> > 10:13:11 kernel: amdgpu 0000:03:00.0: amdgpu:          RW: 0x0
> > 10:13:15 kernel: gmc_v11_0_process_interrupt: 236783 callbacks suppressed
> > 10:13:15 kernel: amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:8 vmid:1 pasid:32777)
> > 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:  in process RDD Process pid 2857 thread firefox:cs0 pid 2909)
> > 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000800103b05000 from client 18
> > 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00103A11
> > 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          Faulty UTCL2 client ID: unknown (0x1d)
> > 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          MORE_FAULTS: 0x1
> > 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          WALKER_ERROR: 0x0
> > 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          PERMISSION_FAULTS: 0x1
> > 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          MAPPING_ERROR: 0x0
> > 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          RW: 0x0
> > 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:8 vmid:1 pasid:32777)
> > 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:  in process RDD Process pid 2857 thread firefox:cs0 pid 2909)
> > 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000800103b00000 from client 18
> > 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00000000
> > 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          Faulty UTCL2 client ID: VMC (0x0)
> > 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          MORE_FAULTS: 0x0
> > 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          WALKER_ERROR: 0x0
> > 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          PERMISSION_FAULTS: 0x0
> > 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          MAPPING_ERROR: 0x0
> > 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          RW: 0x0
> > 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:8 vmid:1 pasid:32777)
> > 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:  in process RDD Process pid 2857 thread firefox:cs0 pid 2909)
> > 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000800103be6000 from client 18
> > 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00000000
> > 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          Faulty UTCL2 client ID: VMC (0x0)
> > 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          MORE_FAULTS: 0x0
> > 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          WALKER_ERROR: 0x0
> > 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          PERMISSION_FAULTS: 0x0
> > 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          MAPPING_ERROR: 0x0
> > 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          RW: 0x0
> > 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:8 vmid:1 pasid:32777)
> > 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:  in process RDD Process pid 2857 thread firefox:cs0 pid 2909)
> > 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000800103be8000 from client 18
> > 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00000000
> > 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          Faulty UTCL2 client ID: VMC (0x0)
> > 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          MORE_FAULTS: 0x0
> > 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          WALKER_ERROR: 0x0
> > 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          PERMISSION_FAULTS: 0x0
> > 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          MAPPING_ERROR: 0x0
> > 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          RW: 0x0
> > 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:8 vmid:1 pasid:32777)
> > 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:  in process RDD Process pid 2857 thread firefox:cs0 pid 2909)
> > 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000800103b05000 from client 18
> > 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00000000
> > 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          Faulty UTCL2 client ID: VMC (0x0)
> > 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          MORE_FAULTS: 0x0
> > 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          WALKER_ERROR: 0x0
> > 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          PERMISSION_FAULTS: 0x0
> > 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          MAPPING_ERROR: 0x0
> > 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          RW: 0x0
> > 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:8 vmid:1 pasid:32777)
> > 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:  in process RDD Process pid 2857 thread firefox:cs0 pid 2909)
> > 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000800103be6000 from client 18
> > 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00000000
> > 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          Faulty UTCL2 client ID: VMC (0x0)
> > 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          MORE_FAULTS: 0x0
> > 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          WALKER_ERROR: 0x0
> > 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          PERMISSION_FAULTS: 0x0
> > 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          MAPPING_ERROR: 0x0
> > 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          RW: 0x0
> > 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:8 vmid:1 pasid:32777)
> > 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:  in process RDD Process pid 2857 thread firefox:cs0 pid 2909)
> > 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000800103b01000 from client 18
> > 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00000000
> > 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          Faulty UTCL2 client ID: VMC (0x0)
> > 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          MORE_FAULTS: 0x0
> > 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          WALKER_ERROR: 0x0
> > 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          PERMISSION_FAULTS: 0x0
> > 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          MAPPING_ERROR: 0x0
> > 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          RW: 0x0
> > 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:8 vmid:1 pasid:32777)
> > 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:  in process RDD Process pid 2857 thread firefox:cs0 pid 2909)
> > 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000800103b00000 from client 18
> > 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00000000
> > 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          Faulty UTCL2 client ID: VMC (0x0)
> > 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          MORE_FAULTS: 0x0
> > 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          WALKER_ERROR: 0x0
> > 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          PERMISSION_FAULTS: 0x0
> > 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          MAPPING_ERROR: 0x0
> > 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          RW: 0x0
> > 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:8 vmid:1 pasid:32777)
> > 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:  in process RDD Process pid 2857 thread firefox:cs0 pid 2909)
> > 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000800103be6000 from client 18
> > 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00000000
> > 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          Faulty UTCL2 client ID: VMC (0x0)
> > 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          MORE_FAULTS: 0x0
> > 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          WALKER_ERROR: 0x0
> > 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          PERMISSION_FAULTS: 0x0
> > 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          MAPPING_ERROR: 0x0
> > 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          RW: 0x0
> > 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:8 vmid:1 pasid:32777)
> > 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:  in process RDD Process pid 2857 thread firefox:cs0 pid 2909)
> > 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000800103b01000 from client 18
> > 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00000000
> > 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          Faulty UTCL2 client ID: VMC (0x0)
> > 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          MORE_FAULTS: 0x0
> > 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          WALKER_ERROR: 0x0
> > 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          PERMISSION_FAULTS: 0x0
> > 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          MAPPING_ERROR: 0x0
> > 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu:          RW: 0x0
> > 10:13:16 kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring vcn_unified_0 timeout, signaled seq=5197, emitted seq=5200
> > 10:13:16 kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process RDD Process pid 2857 thread firefox:cs0 pid 2909
> > 10:13:16 kernel: amdgpu 0000:03:00.0: amdgpu: GPU reset begin!
> > 10:13:16 kernel: [drm] Register(0) [regUVD_POWER_STATUS] failed to reach value 0x00000001 != 0x00000002n
> > 10:13:17 kernel: [drm] Register(0) [regUVD_RB_RPTR] failed to reach value 0x00000340 != 0x000002c0n
> > 10:13:17 kernel: [drm] Register(0) [regUVD_POWER_STATUS] failed to reach value 0x00000001 != 0x00000002n
> > 10:13:17 kernel: amdgpu 0000:03:00.0: amdgpu: MODE2 reset
> > 10:13:17 kernel: amdgpu 0000:03:00.0: amdgpu: GPU reset succeeded, trying to resume
> > 10:13:17 kernel: [drm] PCIE GART of 512M enabled (table at 0x000000801FD00000).
> > 10:13:17 kernel: [drm] VRAM is lost due to GPU reset!
> > 10:13:17 kernel: amdgpu 0000:03:00.0: amdgpu: SMU is resuming...
> > 10:13:17 kernel: amdgpu 0000:03:00.0: amdgpu: SMU is resumed successfully!
> > 10:13:17 kernel: [drm] DMUB hardware initialized: version=0x08003A00
> > 10:13:17 kernel: [drm] kiq ring mec 3 pipe 1 q 0
> > 10:13:17 kernel: amdgpu 0000:03:00.0: [drm:jpeg_v4_0_hw_init [amdgpu]] JPEG decode initialized successfully.
> > 10:13:17 kernel: amdgpu 0000:03:00.0: amdgpu: ring gfx_0.0.0 uses VM inv eng 0 on hub 0
> > 10:13:17 kernel: amdgpu 0000:03:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 1 on hub 0
> > 10:13:17 kernel: amdgpu 0000:03:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 4 on hub 0
> > 10:13:17 kernel: amdgpu 0000:03:00.0: amdgpu: ring comp_1.2.0 uses VM inv eng 6 on hub 0
> > 10:13:17 kernel: amdgpu 0000:03:00.0: amdgpu: ring comp_1.3.0 uses VM inv eng 7 on hub 0
> > 10:13:17 kernel: amdgpu 0000:03:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 8 on hub 0
> > 10:13:17 kernel: amdgpu 0000:03:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 9 on hub 0
> > 10:13:17 kernel: amdgpu 0000:03:00.0: amdgpu: ring comp_1.2.1 uses VM inv eng 10 on hub 0
> > 10:13:17 kernel: amdgpu 0000:03:00.0: amdgpu: ring comp_1.3.1 uses VM inv eng 11 on hub 0
> > 10:13:17 kernel: amdgpu 0000:03:00.0: amdgpu: ring sdma0 uses VM inv eng 12 on hub 0
> > 10:13:17 kernel: amdgpu 0000:03:00.0: amdgpu: ring vcn_unified_0 uses VM inv eng 0 on hub 8
> > 10:13:17 kernel: amdgpu 0000:03:00.0: amdgpu: ring jpeg_dec uses VM inv eng 1 on hub 8
> > 10:13:17 kernel: amdgpu 0000:03:00.0: amdgpu: ring mes_kiq_3.1.0 uses VM inv eng 13 on hub 0
> > 10:13:17 kernel: amdgpu 0000:03:00.0: amdgpu: recover vram bo from shadow start
> > 10:13:17 kernel: amdgpu 0000:03:00.0: amdgpu: recover vram bo from shadow done
> > 10:13:17 kernel: amdgpu 0000:03:00.0: amdgpu: GPU reset(1) succeeded!
> > 10:13:17 kernel: [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
> >
> > The program playing the video (Firefox) then crashes. This can happen at any moment during playback. The
> > problem does not occur on Linux 6.9; it appeared only after upgrading to 6.10-rc1.
> >
> > I'm new to amdgpu. I've looked through the source code but couldn't find the call chain for the error-reporting code. I've
> > also gone through the drm-next commit logs, and from my understanding the changes introduced in 6.10 are too many for me
> > to bisect them all. However, I'm happy to provide a kdump or a process dump on request. Please
> > also let me know how I can provide more information.
> >
> > My system information: Ryzen 7840 HS, 512MB dedicated VRAM configured, Mesa 24.0.8, kernel 6.10-rc1.
> >
> > Hoping to hear from you soon.
> >
>
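
For anyone reading the quoted log: the decoded fields printed under each MMVM_L2_PROTECTION_FAULT_STATUS line can be reproduced from the raw register value. The sketch below is only an illustration. The bit layout in FIELDS is an assumption, inferred so that decoding 0x00103A11 yields the values shown in the log; it is not copied from the kernel headers. The helper name decode_fault_status is likewise hypothetical.

```python
# Assumed bit layout: (shift, width) per field. These positions are
# inferred from the log in this thread, not taken from amdgpu headers.
FIELDS = {
    "MORE_FAULTS": (0, 1),
    "WALKER_ERROR": (1, 3),
    "PERMISSION_FAULTS": (4, 4),
    "MAPPING_ERROR": (8, 1),
    "CID": (9, 9),  # printed in the log as "Faulty UTCL2 client ID"
    "RW": (18, 1),
}


def decode_fault_status(status):
    """Split a raw 32-bit fault status value into named bit fields."""
    return {
        name: (status >> shift) & ((1 << width) - 1)
        for name, (shift, width) in FIELDS.items()
    }


if __name__ == "__main__":
    # The two distinct status values that appear in the log.
    for raw in (0x00103A11, 0x00000000):
        fields = decode_fault_status(raw)
        summary = ", ".join(f"{k}={v:#x}" for k, v in fields.items())
        print(f"{raw:#010x}: {summary}")
```

Under that assumed layout, 0x00103A11 decodes to a permission fault (PERMISSION_FAULTS 0x1) on a read (RW 0x0) from client ID 0x1d with more faults pending, while the 0x00000000 entries carry no fault details at all.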


More information about the amd-gfx mailing list