[Bug 104825] [amdgpu] [drm:gfx_v8_0_hw_fini] *ERROR* KCQ disabled failed (scratch(0xC040)=0x00000000) when unbinding

bugzilla-daemon at freedesktop.org
Wed Jan 31 07:21:14 UTC 2018


https://bugs.freedesktop.org/show_bug.cgi?id=104825

--- Comment #3 from mlen <mlen at mlen.pl> ---
I tested amd-staging-drm-next with HEAD at
f1367d12f5fabb04789c7772594887434c8d9e8b. This time the unbind succeeded, but
errors are still logged and the kernel reports a locking problem in amdgpu:

[   77.098923] [drm] amdgpu: finishing device.
[   77.458614] [drm:gfx_v8_0_hw_fini] *ERROR* KCQ disabled failed (scratch(0xC040)=0x00000000)
[   77.481247] IPv6: ADDRCONF(NETDEV_UP): docker0: link is not ready
[   77.653815] [drm:gfx_v8_0_hw_fini] *ERROR* KCQ disabled failed (scratch(0xC040)=0x00000000)
[   77.845085] [drm:gfx_v8_0_hw_fini] *ERROR* KCQ disabled failed (scratch(0xC040)=0x00000000)
[   77.855055] IPv6: ADDRCONF(NETDEV_CHANGE): virbr10: link becomes ready
[   78.036695] [drm:gfx_v8_0_hw_fini] *ERROR* KCQ disabled failed (scratch(0xC040)=0x00000000)
[   78.233244] [drm:gfx_v8_0_hw_fini] *ERROR* KCQ disabled failed (scratch(0xC040)=0x00000000)
[   78.425058] [drm:gfx_v8_0_hw_fini] *ERROR* KCQ disabled failed (scratch(0xC040)=0x00000000)
[   78.616635] [drm:gfx_v8_0_hw_fini] *ERROR* KCQ disabled failed (scratch(0xC040)=0x00000000)
[   78.808323] [drm:gfx_v8_0_hw_fini] *ERROR* KCQ disabled failed (scratch(0xC040)=0x00000000)
[   78.810659] amdgpu 0000:03:00.0: 00000000a667dd57 unpin not necessary
[   78.810672] amdgpu 0000:03:00.0: 00000000a7594a2b unpin not necessary

[   78.811733] =====================================
[   78.813109] WARNING: bad unlock balance detected!
[   78.813947] 4.15.0-rc4+ #2 Not tainted
[   78.814835] -------------------------------------
[   78.815731] openrc-run.sh/3931 is trying to release lock (&(&mgr->lock)->rlock) at:
[   78.816646] [<000000006fd39549>] amdgpu_gtt_mgr_fini+0x22/0x37
[   78.817531] but there are no more locks to release!
[   78.818446] 
               other info that might help us debug this:
[   78.820208] 5 locks held by openrc-run.sh/3931:
[   78.821127]  #0:  (sb_writers#6){....}, at: [<00000000322e5044>] vfs_write+0x87/0xe2
[   78.822051]  #1:  (&of->mutex){....}, at: [<00000000660270c4>] kernfs_fop_write+0xca/0x156
[   78.823007]  #2:  (kn->count#211){....}, at: [<000000000634dafb>] kernfs_fop_write+0xd2/0x156
[   78.823936]  #3:  (&dev->mutex){....}, at: [<00000000c386f49f>] unbind_store+0x58/0x90
[   78.824912]  #4:  (&dev->mutex){....}, at: [<00000000eefcc37f>] device_release_driver_internal+0x2f/0x1f3
[   78.825861] 
               stack backtrace:
[   78.827764] CPU: 7 PID: 3931 Comm: openrc-run.sh Not tainted 4.15.0-rc4+ #2
[   78.828747] Hardware name: ASUSTeK COMPUTER INC. Z10PE-D16 WS/Z10PE-D16 WS, BIOS 3407 03/10/2017
[   78.829718] Call Trace:
[   78.830717]  dump_stack+0x67/0x8e
[   78.831689]  ? amdgpu_gtt_mgr_fini+0x22/0x37
[   78.832687]  print_unlock_imbalance_bug+0xcc/0xd3
[   78.833657]  lock_release+0x134/0x267
[   78.834646]  ? _raw_spin_unlock+0x2e/0x40
[   78.835605]  _raw_spin_unlock+0x1c/0x40
[   78.836586]  amdgpu_gtt_mgr_fini+0x22/0x37
[   78.837549]  ttm_bo_clean_mm+0x79/0xab
[   78.838544]  amdgpu_ttm_fini+0x75/0x11c
[   78.839507]  amdgpu_bo_fini+0xe/0x2d
[   78.840495]  gmc_v8_0_sw_fini+0x2e/0x49
[   78.841454]  amdgpu_device_ip_fini+0x21f/0x2d3
[   78.842439]  amdgpu_device_fini+0x4c/0x125
[   78.843394]  amdgpu_driver_unload_kms+0x63/0x76
[   78.844373]  drm_dev_unregister+0x49/0xc3
[   78.845318]  amdgpu_pci_remove+0x19/0x37
[   78.846244]  pci_device_remove+0x36/0x86
[   78.847190]  device_release_driver_internal+0x122/0x1f3
[   78.848120]  unbind_store+0x60/0x90
[   78.849069]  kernfs_fop_write+0x10e/0x156
[   78.849997]  __vfs_write+0x31/0xcc
[   78.850937]  ? preempt_count_sub+0x8b/0x94
[   78.851871]  ? __sb_start_write+0xc0/0x180
[   78.852828]  vfs_write+0xa5/0xe2
[   78.853755]  SyS_write+0x5f/0xa3
[   78.854708]  do_syscall_64+0x6c/0x7b
[   78.855630]  entry_SYSCALL64_slow_path+0x25/0x25
[   78.856583] RIP: 0033:0x7fc5804b6408
[   78.857511] RSP: 002b:00007ffd95228060 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[   78.858484] RAX: ffffffffffffffda RBX: 000000000000000d RCX: 00007fc5804b6408
[   78.859438] RDX: 000000000000000d RSI: 000055b229e9e890 RDI: 0000000000000001
[   78.860419] RBP: 000055b229e9e890 R08: 000000000000000a R09: 000055b229ea45f0
[   78.861379] R10: 000000000000009b R11: 0000000000000246 R12: 000000000000000d
[   78.862364] R13: 0000000000000001 R14: 00007fc580783740 R15: 000000000000000d
[   78.863411] [drm] amdgpu: ttm finalized
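
For triage, the repeated KCQ errors in the log above can be tallied with a
short script. This is only a sketch: the LOG string is copied from the excerpt
above with wrapped lines joined, and the names LINE_RE and kcq_error_times are
made up for this illustration, not part of any amdgpu or kernel tooling.

```python
import re

# KCQ error lines copied from the log above (email line wraps joined).
LOG = """\
[   77.458614] [drm:gfx_v8_0_hw_fini] *ERROR* KCQ disabled failed (scratch(0xC040)=0x00000000)
[   77.481247] IPv6: ADDRCONF(NETDEV_UP): docker0: link is not ready
[   77.653815] [drm:gfx_v8_0_hw_fini] *ERROR* KCQ disabled failed (scratch(0xC040)=0x00000000)
[   77.845085] [drm:gfx_v8_0_hw_fini] *ERROR* KCQ disabled failed (scratch(0xC040)=0x00000000)
[   77.855055] IPv6: ADDRCONF(NETDEV_CHANGE): virbr10: link becomes ready
[   78.036695] [drm:gfx_v8_0_hw_fini] *ERROR* KCQ disabled failed (scratch(0xC040)=0x00000000)
[   78.233244] [drm:gfx_v8_0_hw_fini] *ERROR* KCQ disabled failed (scratch(0xC040)=0x00000000)
[   78.425058] [drm:gfx_v8_0_hw_fini] *ERROR* KCQ disabled failed (scratch(0xC040)=0x00000000)
[   78.616635] [drm:gfx_v8_0_hw_fini] *ERROR* KCQ disabled failed (scratch(0xC040)=0x00000000)
[   78.808323] [drm:gfx_v8_0_hw_fini] *ERROR* KCQ disabled failed (scratch(0xC040)=0x00000000)
"""

# Matches a dmesg line: "[  <seconds>] <message>".
LINE_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s(.*)$")


def kcq_error_times(log_text):
    """Return the timestamps (seconds) of every 'KCQ disabled failed' line."""
    times = []
    for line in log_text.splitlines():
        match = LINE_RE.match(line)
        if match and "KCQ disabled failed" in match.group(2):
            times.append(float(match.group(1)))
    return times


times = kcq_error_times(LOG)
# Gap in seconds between each consecutive pair of errors.
gaps = [round(later - earlier, 3) for earlier, later in zip(times, times[1:])]
print("KCQ errors:", len(times))
print("gaps (s):", gaps)
```

On this excerpt the error appears eight times, each roughly 0.19 s after the
previous one, before the "unpin not necessary" messages and the lockdep
warning.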
