4.17-rc3 oops and cpu blocked

Dave Airlie airlied at gmail.com
Tue May 1 05:42:38 UTC 2018


I was running the latest drm-next kernel with radv (with SDMA support)
and the Vulkan CTS. The test

dEQP-VK.synchronization.internally_synchronized_objects.pipeline_cache_compute

triggered the BUG below, followed by soft lockups on other CPUs.

Dave.

[ 2119.182156] ------------[ cut here ]------------
[ 2119.182158] kernel BUG at /home/airlied/devel/kernel/dim/src/drivers/dma-buf/reservation.c:234!
[ 2119.182166] invalid opcode: 0000 [#1] SMP PTI
[ 2119.182168] Modules linked in: xt_CHECKSUM ipt_MASQUERADE
nf_nat_masquerade_ipv4 ipt_REJECT nf_reject_ipv4 tun ip6t_REJECT
nf_reject_ipv6 xt_conntrack nfnetlink ebtable_nat ebtable_broute
bridge stp llc ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6
nf_nat_ipv6 ip6table_mangle ip6table_security iptable_nat
nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack
iptable_mangle iptable_raw iptable_security ebtable_filter ebtables
ip6table_filter ip6_tables fuse vsock snd_hda_codec_realtek
snd_hda_codec_generic snd_hda_codec_hdmi snd_hda_intel snd_hda_codec
snd_hwdep snd_hda_core snd_seq x86_pkg_temp_thermal coretemp
snd_seq_device snd_pcm kvm_intel kvm snd_timer snd wmi_bmof soundcore
iTCO_wdt iTCO_vendor_support hp_wmi sparse_keymap lpc_ich i2c_i801
irqbypass crc32_pclmul ghash_clmulni_intel
[ 2119.182209]  wmi amdgpu i915 mfd_core chash gpu_sched ttm video
iosf_mbi i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt
fb_sys_fops drm e1000e crc32c_intel i2c_dev i2c_core
[ 2119.182222] CPU: 7 PID: 3590 Comm: deqp-vk Not tainted 4.17.0-rc3+ #353
[ 2119.182224] Hardware name: Hewlett-Packard HP Z220 CMT
Workstation/1790, BIOS K51 v01.65 09/03/2013
[ 2119.182230] RIP: 0010:reservation_object_add_shared_fence+0x2cc/0x2f0
[ 2119.182232] RSP: 0018:ffff8805eedf7ae0 EFLAGS: 00010246
[ 2119.182234] RAX: 0000000000000004 RBX: ffff8805eedf7c08 RCX: dead000000000200
[ 2119.182235] RDX: ffff8805f9800218 RSI: ffff8805eec85560 RDI: ffff8805f9800218
[ 2119.182236] RBP: ffff88060b510740 R08: ffff8805eed04ce8 R09: ffff8806124eb118
[ 2119.182238] R10: ffff8805eedf7a08 R11: ffff8805f9800800 R12: 0000000000000000
[ 2119.182239] R13: ffff8805eec85560 R14: ffff88061214dd00 R15: ffff8805eec85560
[ 2119.182241] FS:  00007f9927fff700(0000) GS:ffff88062ebc0000(0000)
knlGS:0000000000000000
[ 2119.182243] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2119.182244] CR2: 00007f994d761000 CR3: 00000005fb6b6001 CR4: 00000000001606e0
[ 2119.182245] Call Trace:
[ 2119.182255]  ttm_eu_fence_buffer_objects+0x4e/0x90 [ttm]
[ 2119.182292]  amdgpu_cs_ioctl+0x149f/0x1a70 [amdgpu]
[ 2119.182325]  ? amdgpu_cs_find_mapping+0xe0/0xe0 [amdgpu]
[ 2119.182337]  drm_ioctl_kernel+0x81/0xd0 [drm]
[ 2119.182346]  drm_ioctl+0x2f2/0x3a0 [drm]
[ 2119.182372]  ? amdgpu_cs_find_mapping+0xe0/0xe0 [amdgpu]
[ 2119.182375]  ? __do_fault+0x1e/0xe5
[ 2119.182404]  amdgpu_drm_ioctl+0x49/0x80 [amdgpu]
[ 2119.182407]  do_vfs_ioctl+0x90/0x5e0
[ 2119.182410]  ? security_file_ioctl+0x32/0x50
[ 2119.182412]  ksys_ioctl+0x70/0x80
[ 2119.182415]  __x64_sys_ioctl+0x16/0x20
[ 2119.182418]  do_syscall_64+0x48/0x100
[ 2119.182421]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 2119.182424] RIP: 0033:0x7f994a67e8e7
[ 2119.182425] RSP: 002b:00007f9927ffe158 EFLAGS: 00000246 ORIG_RAX:
0000000000000010
[ 2119.182427] RAX: ffffffffffffffda RBX: 00007f9927ffe460 RCX: 00007f994a67e8e7
[ 2119.182429] RDX: 00007f9927ffe1d0 RSI: 00000000c0186444 RDI: 0000000000000005
[ 2119.182430] RBP: 00007f9927ffe1d0 R08: 00007f9927ffe2b0 R09: 00007f9927ffe1a0
[ 2119.182431] R10: 00007f9927ffe2b0 R11: 0000000000000246 R12: 00000000c0186444
[ 2119.182433] R13: 0000000000000005 R14: 00000000038a3a10 R15: 00007f9927fff9c0
[ 2119.182434] Code: 89 e7 e9 45 ff ff ff 4c 89 ef e8 e0 e5 ff ff 48
8b 54 24 08 8b 0c 24 e9 a3 fd ff ff b8 18 00 00 00 b9 01 00 00 00 e9
17 fe ff ff <0f> 0b 4c 89 7d 18 c7 45 10 01 00 00 00 83 42 28 01 e9 32
ff ff
[ 2119.182462] RIP: reservation_object_add_shared_fence+0x2cc/0x2f0 RSP: ffff8805eedf7ae0
[ 2119.182465] ---[ end trace d7fff2a47575192e ]---
[ 2144.715111] watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [deqp-vk:3585]
[ 2144.715114] Modules linked in: xt_CHECKSUM ipt_MASQUERADE
nf_nat_masquerade_ipv4 ipt_REJECT nf_reject_ipv4 tun ip6t_REJECT
nf_reject_ipv6 xt_conntrack nfnetlink ebtable_nat ebtable_broute
bridge stp llc ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6
nf_nat_ipv6 ip6table_mangle ip6table_security iptable_nat
nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack
iptable_mangle iptable_raw iptable_security ebtable_filter ebtables
ip6table_filter ip6_tables fuse vsock snd_hda_codec_realtek
snd_hda_codec_generic snd_hda_codec_hdmi snd_hda_intel snd_hda_codec
snd_hwdep snd_hda_core snd_seq x86_pkg_temp_thermal coretemp
snd_seq_device snd_pcm kvm_intel kvm snd_timer snd wmi_bmof soundcore
iTCO_wdt iTCO_vendor_support hp_wmi sparse_keymap lpc_ich i2c_i801
irqbypass crc32_pclmul ghash_clmulni_intel
[ 2144.715158]  wmi amdgpu i915 mfd_core chash gpu_sched ttm video
iosf_mbi i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt
fb_sys_fops drm e1000e crc32c_intel i2c_dev i2c_core
[ 2144.715168] CPU: 0 PID: 3585 Comm: deqp-vk Tainted: G      D
   4.17.0-rc3+ #353
[ 2144.715169] Hardware name: Hewlett-Packard HP Z220 CMT
Workstation/1790, BIOS K51 v01.65 09/03/2013
[ 2144.715173] RIP: 0010:queued_spin_lock_slowpath+0xb4/0x170
[ 2144.715174] RSP: 0018:ffff8805fd553b60 EFLAGS: 00000202 ORIG_RAX:
ffffffffffffff13
[ 2144.715176] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff88062ea21800
[ 2144.715177] RDX: 0000000000040101 RSI: 0000000000000101 RDI: ffff88061214dd70
[ 2144.715177] RBP: 0000000000000000 R08: 0000000000040000 R09: 0000000000000000
[ 2144.715178] R10: 000000000007fe80 R11: 0000000000000000 R12: ffff8805eed027f8
[ 2144.715179] R13: ffff8805eed00000 R14: ffff8805fe85f858 R15: ffff8805fe85f800
[ 2144.715181] FS:  00007f994290e700(0000) GS:ffff88062ea00000(0000)
knlGS:0000000000000000
[ 2144.715182] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2144.715183] CR2: 00007f99341ba4f8 CR3: 00000005fb6b6004 CR4: 00000000001606f0
[ 2144.715183] Call Trace:
[ 2144.715213]  amdgpu_bo_do_create+0x38c/0x410 [amdgpu]
[ 2144.715239]  amdgpu_bo_create+0x33/0x240 [amdgpu]
[ 2144.715241]  ? __wake_up_common_lock+0x79/0x90
[ 2144.715263]  amdgpu_gem_object_create+0x63/0xe0 [amdgpu]
[ 2144.715285]  amdgpu_gem_create_ioctl+0x1c6/0x240 [amdgpu]
[ 2144.715306]  ? amdgpu_gem_object_close+0x1f0/0x1f0 [amdgpu]
[ 2144.715316]  drm_ioctl_kernel+0x81/0xd0 [drm]
[ 2144.715322]  drm_ioctl+0x2f2/0x3a0 [drm]
[ 2144.715342]  ? amdgpu_gem_object_close+0x1f0/0x1f0 [amdgpu]
[ 2144.715344]  ? mem_cgroup_commit_charge+0xac/0x170
[ 2144.715346]  ? page_add_new_anon_rmap+0xbc/0x140
[ 2144.715365]  amdgpu_drm_ioctl+0x49/0x80 [amdgpu]
[ 2144.715367]  do_vfs_ioctl+0x90/0x5e0
[ 2144.715369]  ? security_file_ioctl+0x32/0x50
[ 2144.715371]  ksys_ioctl+0x70/0x80
[ 2144.715372]  __x64_sys_ioctl+0x16/0x20
[ 2144.715375]  do_syscall_64+0x48/0x100
[ 2144.715377]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 2144.715379] RIP: 0033:0x7f994a67e8e7
[ 2144.715380] RSP: 002b:00007f994290d488 EFLAGS: 00000246 ORIG_RAX:
0000000000000010
[ 2144.715381] RAX: ffffffffffffffda RBX: 00007f99340055a0 RCX: 00007f994a67e8e7
[ 2144.715382] RDX: 00007f994290d4d0 RSI: 00000000c0206440 RDI: 0000000000000005
[ 2144.715383] RBP: 00007f994290d4d0 R08: 00007f99340055a0 R09: 0000000000000004
[ 2144.715384] R10: ffffffffffffff90 R11: 0000000000000246 R12: 00000000c0206440
[ 2144.715385] R13: 0000000000000005 R14: 00007f994290d568 R15: 00007f994290e9c0
[ 2144.715386] Code: 75 0d ba 01 00 00 00 f0 0f b1 17 85 c0 74 4e 44
89 c0 c1 e8 10 66 87 47 02 89 c2 c1 e2 10 85 d2 75 79 45 31 c9 eb 02
f3 90 8b 17 <66> 85 d2 75 f7 be 01 00 00 00 eb 0c 89 d0 f0 0f b1 37 39
c2 74
[ 2144.717110] watchdog: BUG: soft lockup - CPU#2 stuck for 22s! [kworker/2:1:107]
[ 2144.717112] Modules linked in: xt_CHECKSUM ipt_MASQUERADE
nf_nat_masquerade_ipv4 ipt_REJECT nf_reject_ipv4 tun ip6t_REJECT
nf_reject_ipv6 xt_conntrack nfnetlink ebtable_nat ebtable_broute
bridge stp llc ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6
nf_nat_ipv6 ip6table_mangle ip6table_security iptable_nat
nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack
iptable_mangle iptable_raw iptable_security ebtable_filter ebtables
ip6table_filter ip6_tables fuse vsock snd_hda_codec_realtek
snd_hda_codec_generic snd_hda_codec_hdmi snd_hda_intel snd_hda_codec
snd_hwdep snd_hda_core snd_seq x86_pkg_temp_thermal coretemp
snd_seq_device snd_pcm kvm_intel kvm snd_timer snd wmi_bmof soundcore
iTCO_wdt iTCO_vendor_support hp_wmi sparse_keymap lpc_ich i2c_i801
irqbypass crc32_pclmul ghash_clmulni_intel
[ 2144.717145]  wmi amdgpu i915 mfd_core chash gpu_sched ttm video
iosf_mbi i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt
fb_sys_fops drm e1000e crc32c_intel i2c_dev i2c_core
[ 2144.717155] CPU: 2 PID: 107 Comm: kworker/2:1 Tainted: G      D
 L    4.17.0-rc3+ #353
[ 2144.717156] Hardware name: Hewlett-Packard HP Z220 CMT
Workstation/1790, BIOS K51 v01.65 09/03/2013
[ 2144.717161] Workqueue: events ttm_bo_delayed_workqueue [ttm]
[ 2144.717165] RIP: 0010:queued_spin_lock_slowpath+0x117/0x170
[ 2144.717165] RSP: 0000:ffff8805fe363e20 EFLAGS: 00000202 ORIG_RAX:
ffffffffffffff13
[ 2144.717167] RAX: 0000000000040101 RBX: ffff8805eed027f8 RCX: 0000000000000001
[ 2144.717168] RDX: 0000000000000101 RSI: 0000000000000001 RDI: ffff88061214dd70
[ 2144.717169] RBP: ffff88062eaa0500 R08: 0000000000000101 R09: 0000000000000000
[ 2144.717170] R10: 0000000000000000 R11: 00000000000003e3 R12: ffff8805fe363e48
[ 2144.717171] R13: ffff88061214dd70 R14: 0000000000000000 R15: ffff8805eed02ee0
[ 2144.717172] FS:  0000000000000000(0000) GS:ffff88062ea80000(0000)
knlGS:0000000000000000
[ 2144.717173] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2144.717174] CR2: 00007f9926f9a000 CR3: 000000000200a006 CR4: 00000000001606e0
[ 2144.717175] Call Trace:
[ 2144.717179]  ttm_bo_delayed_delete+0x42/0x1e0 [ttm]
[ 2144.717182]  ttm_bo_delayed_workqueue+0x17/0x40 [ttm]
[ 2144.717184]  process_one_work+0x16f/0x360
[ 2144.717186]  worker_thread+0x2e/0x370
[ 2144.717188]  ? process_one_work+0x360/0x360
[ 2144.717189]  kthread+0x113/0x130
[ 2144.717191]  ? kthread_create_worker_on_cpu+0x50/0x50
[ 2144.717193]  ret_from_fork+0x35/0x40
[ 2144.717194] Code: 7e f3 c3 f3 90 8b 37 81 fe 00 01 00 00 74 f4 e9
11 ff ff ff f3 90 4c 8b 09 4d 85 c9 74 f6 eb d2 83 fa 01 75 04 eb da
f3 90 8b 07 <84> c0 75 f8 b8 01 00 00 00 66 89 07 c3 c1 ea 12 83 e0 03
83 ea

