[Bug 205177] New: [amdgpu] driver crash - Vega10

bugzilla-daemon at bugzilla.kernel.org
Sun Oct 13 01:03:14 UTC 2019


https://bugzilla.kernel.org/show_bug.cgi?id=205177

            Bug ID: 205177
           Summary: [amdgpu] driver crash - Vega10
           Product: Drivers
           Version: 2.5
    Kernel Version: 5.3.5
          Hardware: x86-64
                OS: Linux
              Tree: Mainline
            Status: NEW
          Severity: normal
          Priority: P1
         Component: Video(DRI - non Intel)
          Assignee: drivers_video-dri at kernel-bugs.osdl.org
          Reporter: rob at sandersmail.eu
        Regression: No

The crash occurs at random; I haven't found a reliable way to reproduce it.
It usually happens after playing a game (Steam + Proton) for at least 30 minutes.


Oct 13 00:49:00 trudex kernel: ------------[ cut here ]------------
Oct 13 00:49:00 trudex kernel: list_del corruption. next->prev should be
ffff9c1f4924d918, but was 5c865d865f866586
Oct 13 00:49:00 trudex kernel: WARNING: CPU: 1 PID: 8544 at lib/list_debug.c:54
__list_del_entry_valid+0xa4/0xb0
Oct 13 00:49:00 trudex kernel: Modules linked in: rfcomm bnep mei_hdcp mxm_wmi
amdgpu snd_hda_codec_hdmi btusb btrtl btbcm btintel bluetooth snd_hda_intel
e1000e snd_hda_codec i2c_i801 snd_oxygen ecdh_generic snd_hda_core rfkill
snd_oxygen_lib ecc snd_mpu401_uart snd_rawmidi snd_hwdep amd_iommu_v2
snd_seq_device snd_pcm gpu_sched snd_timer mei_me snd soundcore mei
intel_pch_thermal wmi intel_pmc_core hid_logitech_hidpp hid_logitech_dj
Oct 13 00:49:00 trudex kernel: CPU: 1 PID: 8544 Comm: X:cs0 Not tainted
5.3.5-847.native #1
Oct 13 00:49:00 trudex kernel: Hardware name: To Be Filled By O.E.M. To Be
Filled By O.E.M./Z170M Pro4S, BIOS P7.40 01/23/2018
Oct 13 00:49:00 trudex kernel: RIP: 0010:__list_del_entry_valid+0xa4/0xb0
Oct 13 00:49:00 trudex kernel: Code: 0f 0b 31 c0 eb bf 48 89 f2 48 89 fe 48 c7
c7 a0 76 81 b6 e8 2d 84 ae ff 0f 0b 31 c0 eb a7 48 c7 c7 e0 76 81 b6 e8 1b 84
ae ff <0f> 0b 31 c0 eb 95 90 90 90 90 90 90 55 44 8b 0e 45 31 d2 48 89 e5
Oct 13 00:49:00 trudex kernel: RSP: 0018:ffff9fc7c1833ab0 EFLAGS: 00010282
Oct 13 00:49:00 trudex kernel: RAX: 0000000000000000 RBX: ffff9c1f4924d918 RCX:
0000000000000000
Oct 13 00:49:00 trudex kernel: RDX: 0000000000000000 RSI: 0000000000000000 RDI:
0000000000000000
Oct 13 00:49:00 trudex kernel: RBP: ffff9fc7c1833ab0 R08: 0000000000000000 R09:
0000000000000000
Oct 13 00:49:00 trudex kernel: R10: 0000000000000000 R11: 0000000000000000 R12:
ffff9c1f4924d850
Oct 13 00:49:00 trudex kernel: R13: ffff9c1c54504058 R14: ffff9c1d9fa11800 R15:
ffff9c1dc80a9f10
Oct 13 00:49:00 trudex kernel: FS:  00007f0ae6ec4700(0000)
GS:ffff9c2196240000(0000) knlGS:0000000000000000
Oct 13 00:49:00 trudex kernel: CS:  0010 DS: 0000 ES: 0000 CR0:
0000000080050033
Oct 13 00:49:00 trudex kernel: CR2: 00007f8c91a00000 CR3: 00000003a2076001 CR4:
00000000003606e0
Oct 13 00:49:00 trudex kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2:
0000000000000000
Oct 13 00:49:00 trudex kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
0000000000000400
Oct 13 00:49:00 trudex kernel: Call Trace:
Oct 13 00:49:00 trudex kernel:  ttm_bo_del_from_lru+0x30/0x120
Oct 13 00:49:00 trudex kernel:  ttm_bo_move_to_lru_tail+0x12/0xc0
Oct 13 00:49:00 trudex kernel:  amdgpu_vm_move_to_lru_tail+0x82/0xc0 [amdgpu]
Oct 13 00:49:00 trudex kernel:  amdgpu_cs_ioctl+0x98f/0xa60 [amdgpu]
Oct 13 00:49:00 trudex kernel:  ? __switch_to_asm+0x40/0x70
Oct 13 00:49:00 trudex kernel:  ? amdgpu_cs_vm_handling+0x3f0/0x3f0 [amdgpu]
Oct 13 00:49:00 trudex kernel:  drm_ioctl_kernel+0x94/0xd0
Oct 13 00:49:00 trudex kernel:  drm_ioctl+0x249/0x430
Oct 13 00:49:00 trudex kernel:  ? amdgpu_cs_vm_handling+0x3f0/0x3f0 [amdgpu]
Oct 13 00:49:00 trudex kernel:  ? futex_wake+0x77/0x150
Oct 13 00:49:00 trudex kernel:  amdgpu_drm_ioctl+0x49/0x80 [amdgpu]
Oct 13 00:49:00 trudex kernel:  do_vfs_ioctl+0x431/0x630
Oct 13 00:49:00 trudex kernel:  ? __se_sys_futex+0x12c/0x160
Oct 13 00:49:00 trudex kernel:  ksys_ioctl+0x6a/0x90
Oct 13 00:49:00 trudex kernel:  __x64_sys_ioctl+0x15/0x20
Oct 13 00:49:00 trudex kernel:  do_syscall_64+0x59/0x1f0
Oct 13 00:49:00 trudex kernel:  ? prepare_exit_to_usermode+0xa4/0xd0
Oct 13 00:49:00 trudex kernel:  entry_SYSCALL_64_after_hwframe+0x44/0xa9
Oct 13 00:49:00 trudex kernel: RIP: 0033:0x7f0aee7d847b
Oct 13 00:49:00 trudex kernel: Code: 0f 1e fa 48 8b 05 05 3a 0d 00 64 c7 00 26
00 00 00 48 c7 c0 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00
0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d d5 39 0d 00 f7 d8 64 89 01 48
Oct 13 00:49:00 trudex kernel: RSP: 002b:00007f0ae6ec3a08 EFLAGS: 00003246
ORIG_RAX: 0000000000000010
Oct 13 00:49:00 trudex kernel: RAX: ffffffffffffffda RBX: 00000000c0186444 RCX:
00007f0aee7d847b
Oct 13 00:49:00 trudex kernel: RDX: 00007f0ae6ec3a60 RSI: 00000000c0186444 RDI:
000000000000000f
Oct 13 00:49:00 trudex kernel: RBP: 00007f0ae6ec3a30 R08: 00007f0ae6ec3b50 R09:
00007f0ae6ec3a40
Oct 13 00:49:00 trudex kernel: R10: 0000000000000003 R11: 0000000000003246 R12:
00007f0ae6ec3a60
Oct 13 00:49:00 trudex kernel: R13: 000000000000000f R14: 000055a4169a0d98 R15:
000055a416339fe0
Oct 13 00:49:00 trudex kernel: ---[ end trace 443143ae362ace06 ]---
Oct 13 00:49:00 trudex kernel: ------------[ cut here ]------------
Oct 13 00:49:00 trudex kernel: list_del corruption. next->prev should be
ffff9c1f4924d8f8, but was bd86bf86c186c486
Oct 13 00:49:00 trudex kernel: WARNING: CPU: 1 PID: 8544 at lib/list_debug.c:54
__list_del_entry_valid+0xa4/0xb0
Oct 13 00:49:00 trudex kernel: Modules linked in: rfcomm bnep mei_hdcp mxm_wmi
amdgpu snd_hda_codec_hdmi btusb btrtl btbcm btintel bluetooth snd_hda_intel
e1000e snd_hda_codec i2c_i801 snd_oxygen ecdh_generic snd_hda_core rfkill
snd_oxygen_lib ecc snd_mpu401_uart snd_rawmidi snd_hwdep amd_iommu_v2
snd_seq_device snd_pcm gpu_sched snd_timer mei_me snd soundcore mei
intel_pch_thermal wmi intel_pmc_core hid_logitech_hidpp hid_logitech_dj
Oct 13 00:49:00 trudex kernel: CPU: 1 PID: 8544 Comm: X:cs0 Tainted: G        W
        5.3.5-847.native #1
Oct 13 00:49:00 trudex kernel: Hardware name: To Be Filled By O.E.M. To Be
Filled By O.E.M./Z170M Pro4S, BIOS P7.40 01/23/2018
Oct 13 00:49:00 trudex kernel: RIP: 0010:__list_del_entry_valid+0xa4/0xb0
Oct 13 00:49:00 trudex kernel: Code: 0f 0b 31 c0 eb bf 48 89 f2 48 89 fe 48 c7
c7 a0 76 81 b6 e8 2d 84 ae ff 0f 0b 31 c0 eb a7 48 c7 c7 e0 76 81 b6 e8 1b 84
ae ff <0f> 0b 31 c0 eb 95 90 90 90 90 90 90 55 44 8b 0e 45 31 d2 48 89 e5
Oct 13 00:49:00 trudex kernel: RSP: 0018:ffff9fc7c1833ab0 EFLAGS: 00010282
Oct 13 00:49:00 trudex kernel: RAX: 0000000000000000 RBX: ffff9c1f4924d8f8 RCX:
0000000000000000
Oct 13 00:49:00 trudex kernel: RDX: 0000000000000000 RSI: 0000000000000000 RDI:
0000000000000000
Oct 13 00:49:00 trudex kernel: RBP: ffff9fc7c1833ab0 R08: 0000000000000000 R09:
0000000000000000
Oct 13 00:49:00 trudex kernel: R10: 0000000000000000 R11: 0000000000000000 R12:
ffff9c1f4924d850
Oct 13 00:49:00 trudex kernel: R13: ffff9c1c54504058 R14: ffff9c1f4924d87c R15:
ffff9c1dc80a9f10
Oct 13 00:49:00 trudex kernel: FS:  00007f0ae6ec4700(0000)
GS:ffff9c2196240000(0000) knlGS:0000000000000000
Oct 13 00:49:00 trudex kernel: CS:  0010 DS: 0000 ES: 0000 CR0:
0000000080050033
Oct 13 00:49:00 trudex kernel: CR2: 00007f8c91a00000 CR3: 00000003a2076001 CR4:
00000000003606e0
Oct 13 00:49:00 trudex kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2:
0000000000000000
Oct 13 00:49:00 trudex kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
0000000000000400
Oct 13 00:49:00 trudex kernel: Call Trace:
Oct 13 00:49:00 trudex kernel:  ttm_bo_del_from_lru+0x89/0x120
Oct 13 00:49:00 trudex kernel:  ttm_bo_move_to_lru_tail+0x12/0xc0
Oct 13 00:49:00 trudex kernel:  amdgpu_vm_move_to_lru_tail+0x82/0xc0 [amdgpu]
Oct 13 00:49:00 trudex kernel:  amdgpu_cs_ioctl+0x98f/0xa60 [amdgpu]
Oct 13 00:49:00 trudex kernel:  ? __switch_to_asm+0x40/0x70
Oct 13 00:49:00 trudex kernel:  ? amdgpu_cs_vm_handling+0x3f0/0x3f0 [amdgpu]
Oct 13 00:49:00 trudex kernel:  drm_ioctl_kernel+0x94/0xd0
Oct 13 00:49:00 trudex kernel:  drm_ioctl+0x249/0x430
Oct 13 00:49:00 trudex kernel:  ? amdgpu_cs_vm_handling+0x3f0/0x3f0 [amdgpu]
Oct 13 00:49:00 trudex kernel:  ? futex_wake+0x77/0x150
Oct 13 00:49:00 trudex kernel:  amdgpu_drm_ioctl+0x49/0x80 [amdgpu]
Oct 13 00:49:00 trudex kernel:  do_vfs_ioctl+0x431/0x630
Oct 13 00:49:00 trudex kernel:  ? __se_sys_futex+0x12c/0x160
Oct 13 00:49:00 trudex kernel:  ksys_ioctl+0x6a/0x90
Oct 13 00:49:00 trudex kernel:  __x64_sys_ioctl+0x15/0x20
Oct 13 00:49:00 trudex kernel:  do_syscall_64+0x59/0x1f0
Oct 13 00:49:00 trudex kernel:  ? prepare_exit_to_usermode+0xa4/0xd0
Oct 13 00:49:00 trudex kernel:  entry_SYSCALL_64_after_hwframe+0x44/0xa9
Oct 13 00:49:00 trudex kernel: RIP: 0033:0x7f0aee7d847b
Oct 13 00:49:00 trudex kernel: Code: 0f 1e fa 48 8b 05 05 3a 0d 00 64 c7 00 26
00 00 00 48 c7 c0 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00
0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d d5 39 0d 00 f7 d8 64 89 01 48
Oct 13 00:49:00 trudex kernel: RSP: 002b:00007f0ae6ec3a08 EFLAGS: 00003246
ORIG_RAX: 0000000000000010
Oct 13 00:49:00 trudex kernel: RAX: ffffffffffffffda RBX: 00000000c0186444 RCX:
00007f0aee7d847b
Oct 13 00:49:00 trudex kernel: RDX: 00007f0ae6ec3a60 RSI: 00000000c0186444 RDI:
000000000000000f
Oct 13 00:49:00 trudex kernel: RBP: 00007f0ae6ec3a30 R08: 00007f0ae6ec3b50 R09:
00007f0ae6ec3a40
Oct 13 00:49:00 trudex kernel: R10: 0000000000000003 R11: 0000000000003246 R12:
00007f0ae6ec3a60
Oct 13 00:49:00 trudex kernel: R13: 000000000000000f R14: 000055a4169a0d98 R15:
000055a416339fe0
Oct 13 00:49:00 trudex kernel: ---[ end trace 443143ae362ace07 ]---
Oct 13 00:49:00 trudex kernel: general protection fault: 0000 [#1] SMP PTI
Oct 13 00:49:00 trudex kernel: CPU: 1 PID: 8544 Comm: X:cs0 Tainted: G        W
        5.3.5-847.native #1
Oct 13 00:49:00 trudex kernel: Hardware name: To Be Filled By O.E.M. To Be
Filled By O.E.M./Z170M Pro4S, BIOS P7.40 01/23/2018
Oct 13 00:49:00 trudex kernel: RIP: 0010:__list_del_entry_valid+0x29/0xb0
Oct 13 00:49:00 trudex kernel: Code: 97 48 b8 00 01 00 00 00 00 ad de 55 48 8b
17 4c 8b 47 08 48 89 e5 48 39 c2 74 30 48 b8 22 01 00 00 00 00 ad de 49 39 c0
74 3f <49> 8b 30 48 39 fe 75 4f 48 8b 52 08 48 39 f2 75 5e b8 01 00 00 00
Oct 13 00:49:00 trudex kernel: RSP: 0018:ffff9fc7c1833ab0 EFLAGS: 00010203
Oct 13 00:49:00 trudex kernel: RAX: dead000000000122 RBX: ffff9c1c1351c918 RCX:
0000000000000000
Oct 13 00:49:00 trudex kernel: RDX: 7186738679867a86 RSI: ffff9c1d9fa11dd8 RDI:
ffff9c1c1351c918
Oct 13 00:49:00 trudex kernel: RBP: ffff9fc7c1833ab0 R08: 5c865d865f866586 R09:
0000000000000000
Oct 13 00:49:00 trudex kernel: R10: 0000000000000000 R11: 0000000000000000 R12:
ffff9c1c1351c850
Oct 13 00:49:00 trudex kernel: R13: 5588578858885b88 R14: ffff9c1d9fa11800 R15:
ffff9c1dc80a9a78
Oct 13 00:49:00 trudex kernel: FS:  00007f0ae6ec4700(0000)
GS:ffff9c2196240000(0000) knlGS:0000000000000000
Oct 13 00:49:00 trudex kernel: CS:  0010 DS: 0000 ES: 0000 CR0:
0000000080050033
Oct 13 00:49:00 trudex kernel: CR2: 00007f8c91a00000 CR3: 00000003a2076001 CR4:
00000000003606e0
Oct 13 00:49:00 trudex kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2:
0000000000000000
Oct 13 00:49:00 trudex kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
0000000000000400
Oct 13 00:49:00 trudex kernel: Call Trace:
Oct 13 00:49:00 trudex kernel:  ttm_bo_del_from_lru+0x30/0x120
Oct 13 00:49:00 trudex kernel:  ttm_bo_move_to_lru_tail+0x12/0xc0
Oct 13 00:49:00 trudex kernel:  amdgpu_vm_move_to_lru_tail+0x82/0xc0 [amdgpu]
Oct 13 00:49:00 trudex kernel:  amdgpu_cs_ioctl+0x98f/0xa60 [amdgpu]
Oct 13 00:49:00 trudex kernel:  ? __switch_to_asm+0x40/0x70
Oct 13 00:49:00 trudex kernel:  ? amdgpu_cs_vm_handling+0x3f0/0x3f0 [amdgpu]
Oct 13 00:49:00 trudex kernel:  drm_ioctl_kernel+0x94/0xd0
Oct 13 00:49:00 trudex kernel:  drm_ioctl+0x249/0x430
Oct 13 00:49:00 trudex kernel:  ? amdgpu_cs_vm_handling+0x3f0/0x3f0 [amdgpu]
Oct 13 00:49:00 trudex kernel:  ? futex_wake+0x77/0x150
Oct 13 00:49:00 trudex kernel:  amdgpu_drm_ioctl+0x49/0x80 [amdgpu]
Oct 13 00:49:00 trudex kernel:  do_vfs_ioctl+0x431/0x630
Oct 13 00:49:00 trudex kernel:  ? __se_sys_futex+0x12c/0x160
Oct 13 00:49:00 trudex kernel:  ksys_ioctl+0x6a/0x90
Oct 13 00:49:00 trudex kernel:  __x64_sys_ioctl+0x15/0x20
Oct 13 00:49:00 trudex kernel:  do_syscall_64+0x59/0x1f0
Oct 13 00:49:00 trudex kernel:  ? prepare_exit_to_usermode+0xa4/0xd0
Oct 13 00:49:00 trudex kernel:  entry_SYSCALL_64_after_hwframe+0x44/0xa9
Oct 13 00:49:00 trudex kernel: RIP: 0033:0x7f0aee7d847b
Oct 13 00:49:00 trudex kernel: Code: 0f 1e fa 48 8b 05 05 3a 0d 00 64 c7 00 26
00 00 00 48 c7 c0 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00
0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d d5 39 0d 00 f7 d8 64 89 01 48
Oct 13 00:49:00 trudex kernel: RSP: 002b:00007f0ae6ec3a08 EFLAGS: 00003246
ORIG_RAX: 0000000000000010
Oct 13 00:49:00 trudex kernel: RAX: ffffffffffffffda RBX: 00000000c0186444 RCX:
00007f0aee7d847b
Oct 13 00:49:00 trudex kernel: RDX: 00007f0ae6ec3a60 RSI: 00000000c0186444 RDI:
000000000000000f
Oct 13 00:49:00 trudex kernel: RBP: 00007f0ae6ec3a30 R08: 00007f0ae6ec3b50 R09:
00007f0ae6ec3a40
Oct 13 00:49:00 trudex kernel: R10: 0000000000000003 R11: 0000000000003246 R12:
00007f0ae6ec3a60
Oct 13 00:49:00 trudex kernel: R13: 000000000000000f R14: 000055a4169a0d98 R15:
000055a416339fe0
Oct 13 00:49:00 trudex kernel: Modules linked in: rfcomm bnep mei_hdcp mxm_wmi
amdgpu snd_hda_codec_hdmi btusb btrtl btbcm btintel bluetooth snd_hda_intel
e1000e snd_hda_codec i2c_i801 snd_oxygen ecdh_generic snd_hda_core rfkill
snd_oxygen_lib ecc snd_mpu401_uart snd_rawmidi snd_hwdep amd_iommu_v2
snd_seq_device snd_pcm gpu_sched snd_timer mei_me snd soundcore mei
intel_pch_thermal wmi intel_pmc_core hid_logitech_hidpp hid_logitech_dj
Oct 13 00:49:00 trudex kernel: ---[ end trace 443143ae362ace08 ]---
Oct 13 00:49:00 trudex systemd[1]: Started Telemetrics Daemon.
Oct 13 00:49:00 trudex systemd[1]: Started Telemetrics Post Daemon.
Oct 13 00:49:01 trudex kernel: RIP: 0010:__list_del_entry_valid+0x29/0xb0
Oct 13 00:49:01 trudex kernel: Code: 97 48 b8 00 01 00 00 00 00 ad de 55 48 8b
17 4c 8b 47 08 48 89 e5 48 39 c2 74 30 48 b8 22 01 00 00 00 00 ad de 49 39 c0
74 3f <49> 8b 30 48 39 fe 75 4f 48 8b 52 08 48 39 f2 75 5e b8 01 00 00 00
Oct 13 00:49:01 trudex kernel: RSP: 0018:ffff9fc7c1833ab0 EFLAGS: 00010203
Oct 13 00:49:01 trudex kernel: RAX: dead000000000122 RBX: ffff9c1c1351c918 RCX:
0000000000000000
Oct 13 00:49:01 trudex kernel: RDX: 7186738679867a86 RSI: ffff9c1d9fa11dd8 RDI:
ffff9c1c1351c918
Oct 13 00:49:01 trudex kernel: RBP: ffff9fc7c1833ab0 R08: 5c865d865f866586 R09:
0000000000000000
Oct 13 00:49:01 trudex kernel: R10: 0000000000000000 R11: 0000000000000000 R12:
ffff9c1c1351c850
Oct 13 00:49:01 trudex kernel: R13: 5588578858885b88 R14: ffff9c1d9fa11800 R15:
ffff9c1dc80a9a78
Oct 13 00:49:01 trudex kernel: FS:  00007f0ae6ec4700(0000)
GS:ffff9c2196240000(0000) knlGS:0000000000000000
Oct 13 00:49:01 trudex kernel: CS:  0010 DS: 0000 ES: 0000 CR0:
0000000080050033
Oct 13 00:49:01 trudex kernel: CR2: 00007f8c91a00000 CR3: 00000003a2076001 CR4:
00000000003606e0
Oct 13 00:49:01 trudex kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2:
0000000000000000
Oct 13 00:49:01 trudex kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
0000000000000400
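
Triage note on the log above: the values reported in place of the list pointers (for example 5c865d865f866586) do not look like kernel addresses. The sketch below checks them for x86-64 canonical form, assuming 48-bit virtual addressing (4-level paging); the helper name is hypothetical and the values are copied from the log. A non-canonical value in R08 would be consistent with the general protection fault, since the faulting bytes `49 8b 30` appear to decode to a load through R08.

```python
# Minimal sketch: classify 64-bit values from the log as canonical or not.
# Assumption: 48-bit virtual addresses, so bits 63..47 must all be equal.

def is_canonical_48(addr: int) -> bool:
    """Return True if bits 63..47 of addr are all zeros or all ones."""
    top = addr >> 47  # the upper 17 bits of a 64-bit value
    return top == 0 or top == 0x1FFFF

# Values copied verbatim from the kernel log above.
reported = {
    "expected next->prev": 0xFFFF9C1F4924D918,
    "found next->prev": 0x5C865D865F866586,
    "RDX at the fault": 0x7186738679867A86,
    "R08 at the fault": 0x5C865D865F866586,
}

for label, value in reported.items():
    print(f"{label}: {value:#018x} canonical={is_canonical_48(value)}")
```

Only the expected pointer passes the check. The other values are not valid addresses at all, which suggests the list nodes were overwritten with unrelated data rather than unlinked incorrectly.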

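The user-space register dump shows RSI: 00000000c0186444 as the ioctl request. As a cross-check against the call trace, the sketch below splits that number into its fields using the generic Linux ioctl encoding found on x86 (8-bit number, 8-bit type, 14-bit size, 2-bit direction). The function name is hypothetical. Reading the result as the amdgpu command-submission ioctl assumes DRM driver commands start at 0x40 and that amdgpu's CS command is 0x04.

```python
# Minimal sketch: split a Linux ioctl request number into its fields.
# Assumption: the generic x86 layout (nr: 8 bits, type: 8 bits,
# size: 14 bits, direction: 2 bits).

def decode_ioctl(cmd: int) -> dict:
    return {
        "nr": cmd & 0xFF,                # command number
        "type": chr((cmd >> 8) & 0xFF),  # subsystem "magic" character
        "size": (cmd >> 16) & 0x3FFF,    # argument size in bytes
        "dir": (cmd >> 30) & 0x3,        # 1 = write, 2 = read, 3 = both
    }

fields = decode_ioctl(0xC0186444)  # RSI from the log above
print(fields)
# Offset of the command within the driver-private range (assumed base 0x40).
print("driver command:", hex(fields["nr"] - 0x40))
```

The type character 'd' and a driver command offset of 0x4 are consistent with amdgpu_cs_ioctl appearing in the call trace.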

$ lspci | grep VGA
03:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Vega 10 XT [Radeon RX Vega 64] (rev c3)


$ cat /etc/os-release 
NAME="Clear Linux OS"
VERSION=1
ID=clear-linux-os
ID_LIKE=clear-linux-os
VERSION_ID=31290
PRETTY_NAME="Clear Linux OS"

Base Board Information
        Manufacturer: ASRock
        Product Name: Z170M Pro4S

Processor Information
        Version: Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz


