[Nouveau] [Bug 100567] Nouveau system freeze fifo: SCHED_ERROR 0a [CTXSW_TIMEOUT]
bugzilla-daemon at freedesktop.org
Sat Jan 5 21:29:37 UTC 2019
https://bugs.freedesktop.org/show_bug.cgi?id=100567
--- Comment #18 from kenorb at gmail.com ---
The same problem on Ubuntu 18.10, kernel 4.18.0-13.
I have 4 GPUs, all NVIDIA GeForce GTX 1080 Ti cards (3584 cores each), with a
3-way SLI connector.
$ uname -a
Linux Ubuntu-PC 4.18.0-13-generic #14-Ubuntu SMP Wed Dec 5 09:04:24 UTC 2018
x86_64 x86_64 x86_64 GNU/Linux
Errors in kern.log file:
nouveau 0000:65:00.0: fifo: SCHED_ERROR 0a [CTXSW_TIMEOUT]
nouveau 0000:65:00.0: fifo: runlist 0: scheduled for recovery
nouveau 0000:65:00.0: fifo: channel 2: killed
nouveau 0000:65:00.0: fifo: engine 0: scheduled for recovery
nouveau 0000:65:00.0: Xorg[5447]: channel 2 killed!
nouveau 0000:65:00.0: systemd-logind[3394]: nv50cal_space: -16
nouveau 0000:65:00.0: systemd-logind[3394]: nv50cal_space: -16
(the same message is repeated about 800 times)
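For anyone triaging a similar kern.log, here is a minimal Python sketch that collapses repeated nouveau lines into counts. It runs on the lines quoted above rather than on a real log file, and it assumes the trailing "-16" is a negated Linux errno value (16 is EBUSY); the helper names summarize and errno_name are made up for illustration.

```python
import errno
import re
from collections import Counter

# Sample lines copied from the kern.log excerpt in this comment.
SAMPLE = """\
nouveau 0000:65:00.0: fifo: SCHED_ERROR 0a [CTXSW_TIMEOUT]
nouveau 0000:65:00.0: fifo: runlist 0: scheduled for recovery
nouveau 0000:65:00.0: fifo: channel 2: killed
nouveau 0000:65:00.0: fifo: engine 0: scheduled for recovery
nouveau 0000:65:00.0: Xorg[5447]: channel 2 killed!
nouveau 0000:65:00.0: systemd-logind[3394]: nv50cal_space: -16
nouveau 0000:65:00.0: systemd-logind[3394]: nv50cal_space: -16
"""


def summarize(lines):
    """Count identical nouveau messages, keeping first-seen order."""
    return Counter(line.strip() for line in lines if "nouveau" in line)


def errno_name(message):
    """Return the errno symbol for a trailing negative code, or None.

    Assumption: a message ending in ': -N' carries a negated errno.
    """
    match = re.search(r": -(\d+)$", message)
    if match is None:
        return None
    return errno.errorcode.get(int(match.group(1)))


if __name__ == "__main__":
    for message, count in summarize(SAMPLE.splitlines()).items():
        name = errno_name(message)
        suffix = f" ({name})" if name else ""
        print(f"{count:4d}x {message}{suffix}")
```

To use it on a real log, pass an open file object (for example open("/var/log/kern.log")) to summarize instead of the sample lines.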
The system froze (no reaction to mouse or keyboard); however, the kernel still
responded to a few Magic SysRq keys, so here are some stack traces:
INFO: task kworker/u72:8:492 blocked for more than 120 seconds.
Tainted: G O 4.18.0-13-generic #14-Ubuntu
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kworker/u72:8 D 0 492 2 0x80000000
Workqueue: events_unbound nv50_disp_atomic_commit_work [nouveau]
Call Trace at 20:25:50:
__schedule+0x29e/0x840
schedule+0x2c/0x80
schedule_timeout+0x258/0x360
? nv50_wndw_atomic_destroy_state+0x1d/0x20 [nouveau]
dma_fence_default_wait+0x1fc/0x260
? dma_fence_release+0xa0/0xa0
dma_fence_wait_timeout+0x3e/0xf0
drm_atomic_helper_wait_for_fences+0x3f/0xc0 [drm_kms_helper]
nv50_disp_atomic_commit_tail+0x78/0x860 [nouveau]
? __switch_to_asm+0x40/0x70
? __switch_to_asm+0x34/0x70
nv50_disp_atomic_commit_work+0x12/0x20 [nouveau]
process_one_work+0x20f/0x3c0
worker_thread+0x34/0x400
kthread+0x120/0x140
? pwq_unbound_release_workfn+0xd0/0xd0
? kthread_bind+0x40/0x40
ret_from_fork+0x35/0x40
The same call trace at 20:29:51 (a few minutes later, while Xorg was still frozen):
Workqueue: events_unbound nv50_disp_atomic_commit_work [nouveau]
Call Trace:
__schedule+0x29e/0x840
? apic_timer_interrupt+0xa/0x20
? __drm_crtc_commit_free+0x12/0x20 [drm]
schedule+0x2c/0x80
schedule_timeout+0x258/0x360
? nv50_wndw_atomic_destroy_state+0x1d/0x20 [nouveau]
dma_fence_default_wait+0x1fc/0x260
? dma_fence_release+0xa0/0xa0
dma_fence_wait_timeout+0x3e/0xf0
drm_atomic_helper_wait_for_fences+0x3f/0xc0 [drm_kms_helper]
nv50_disp_atomic_commit_tail+0x78/0x860 [nouveau]
? __switch_to_asm+0x40/0x70
? __switch_to_asm+0x34/0x70
nv50_disp_atomic_commit_work+0x12/0x20 [nouveau]
process_one_work+0x20f/0x3c0
worker_thread+0x34/0x400
kthread+0x120/0x140
? pwq_unbound_release_workfn+0xd0/0xd0
? kthread_bind+0x40/0x40
ret_from_fork+0x35/0x40
Another one:
INFO: task Xorg:5447 blocked for more than 120 seconds.
Tainted: G O 4.18.0-13-generic #14-Ubuntu
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Xorg D 0 5447 5445 0x00000004
Call Trace:
__schedule+0x29e/0x840
schedule+0x2c/0x80
schedule_preempt_disabled+0xe/0x10
__ww_mutex_lock.isra.6+0x3c1/0x660
__ww_mutex_lock_slowpath+0x16/0x20
ww_mutex_lock+0x34/0x50
drm_modeset_lock+0x6e/0xb0 [drm]
drm_crtc_get_sequence_ioctl+0xbc/0x190 [drm]
? drm_wait_vblank_ioctl+0x610/0x610 [drm]
drm_ioctl_kernel+0xa4/0xf0 [drm]
drm_ioctl+0x227/0x400 [drm]
? drm_wait_vblank_ioctl+0x610/0x610 [drm]
? do_iter_write+0xe1/0x1a0
? do_iter_write+0xe1/0x1a0
nouveau_drm_ioctl+0x73/0xc0 [nouveau]
do_vfs_ioctl+0xa8/0x620
? __sys_recvmsg+0x88/0xa0
ksys_ioctl+0x67/0x90
__x64_sys_ioctl+0x1a/0x20
do_syscall_64+0x5a/0x110
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f3f654b93c7
Code: Bad RIP value.
RSP: 002b:00007ffd57bbf168 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00007ffd57bbf200 RCX: 00007f3f654b93c7
RDX: 00007ffd57bbf1a0 RSI: 00000000c018643b RDI: 000000000000000e
RBP: 00007ffd57bbf1a0 R08: 0000000000000000 R09: 00005646eb8ff7c0
R10: 00005646eb54ad30 R11: 0000000000000246 R12: 00000000c018643b
R13: 000000000000000e R14: 00005646eb54b800 R15: 00005646eb466880
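As a cross-check of the register dump above: under the x86_64 syscall convention RSI holds the second argument, which for ioctl is the request number, and ORIG_RAX 0x10 is syscall 16 (ioctl). The following Python sketch decodes RSI = 0xc018643b, assuming the generic Linux ioctl bit layout from asm-generic/ioctl.h; the function name decode_ioctl is made up for illustration.

```python
def decode_ioctl(request):
    """Split a Linux ioctl request number into its encoded fields.

    Assumes the generic layout used on x86_64:
    bits 0-7 number, 8-15 type, 16-29 size, 30-31 direction.
    """
    directions = {0: "none", 1: "write", 2: "read", 3: "read/write"}
    return {
        "nr": request & 0xFF,                # command number within the type
        "type": chr((request >> 8) & 0xFF),  # subsystem "magic" character
        "size": (request >> 16) & 0x3FFF,    # size of the argument struct
        "dir": directions[(request >> 30) & 0x3],
    }


if __name__ == "__main__":
    # Value of RSI (and R12) in the Xorg register dump.
    print(decode_ioctl(0xC018643B))
```

The decoded type is 'd' (the DRM ioctl base) with number 0x3b, a 24-byte argument, and read/write direction, which appears consistent with drm_crtc_get_sequence_ioctl in the call trace.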
Full log: https://gist.github.com/kenorb/5b95caa1694dbf7f030ccc808a110856