<html>
<head>
<base href="https://bugs.freedesktop.org/">
</head>
<body><span class="vcard"><a class="email" href="mailto:imirkin@alum.mit.edu" title="Ilia Mirkin <imirkin@alum.mit.edu>"> <span class="fn">Ilia Mirkin</span></a>
</span> changed
<a class="bz_bug_link
bz_status_NEW "
title="NEW - INFO: task Xorg:2419 blocked for more than 120 seconds."
href="https://bugs.freedesktop.org/show_bug.cgi?id=91413">bug 91413</a>
<br>
<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>What</th>
<th>Removed</th>
<th>Added</th>
</tr>
<tr>
<td style="text-align:right;">CC</td>
<td>
</td>
<td>imirkin@alum.mit.edu
</td>
</tr></table>
<p>
<div>
<b><a class="bz_bug_link
bz_status_NEW "
title="NEW - INFO: task Xorg:2419 blocked for more than 120 seconds."
href="https://bugs.freedesktop.org/show_bug.cgi?id=91413#c9">Comment # 9</a>
on <a class="bz_bug_link
bz_status_NEW "
title="NEW - INFO: task Xorg:2419 blocked for more than 120 seconds."
href="https://bugs.freedesktop.org/show_bug.cgi?id=91413">bug 91413</a>
from <span class="vcard"><a class="email" href="mailto:imirkin@alum.mit.edu" title="Ilia Mirkin <imirkin@alum.mit.edu>"> <span class="fn">Ilia Mirkin</span></a>
</span></b>
<pre>And now I'm getting the same issue after adding a GK208B and a (fanless) NV34
to my system. khugepaged gets stuck like so:
[83695.847012] NMI watchdog: BUG: soft lockup - CPU#0 stuck for 22s!
[khugepaged:59]
[83695.847015] Modules linked in: rtl8xxxu mac80211 cfg80211 it87 hwmon_vid
nouveau uas usb_storage fbcon video bitblit softcursor font i2c_algo_bit ttm
drm_kms_helper cfbfillrect syscopyarea cfbimgblt sysfillrect sysimgblt
fb_sys_fops cfbcopyarea drm backlight fb fbdev mxm_wmi wmi
[83695.847032] CPU: 0 PID: 59 Comm: khugepaged Tainted: G I L 4.6.0+
#2
[83695.847033] Hardware name: Gigabyte Technology Co., Ltd.
EX58-UD3R/EX58-UD3R, BIOS FB 05/04/2009
[83695.847035] task: ffff8801d822b000 ti: ffff8801d8314000 task.ti:
ffff8801d8314000
[83695.847036] RIP: 0010:[<ffffffff810e9fef>] [<ffffffff810e9fef>]
smp_call_function_many+0x1de/0x1f1
[83695.847042] RSP: 0018:ffff8801d8317be0 EFLAGS: 00000202
[83695.847044] RAX: 0000000000000005 RBX: ffff8801dfc169c8 RCX:
0000000000000005
[83695.847045] RDX: ffff8801dfd59328 RSI: 0000000000000008 RDI:
ffff8801dfc169c8
[83695.847046] RBP: ffff8801d8317c20 R08: 0000000000000005 R09:
0000000000000000
[83695.847047] R10: 000000000000175d R11: 0000000000000009 R12:
ffff8801dfc169c0
[83695.847049] R13: 0000000000016980 R14: 0000000000000001 R15:
0000000000000008
[83695.847050] FS: 0000000000000000(0000) GS:ffff8801dfc00000(0000)
knlGS:0000000000000000
[83695.847051] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[83695.847053] CR2: 00007f2c7ed83000 CR3: 0000000001c07000 CR4:
00000000000006f0
[83695.847054] Stack:
[83695.847055] 0100000000000000 0000000000000000 ffffffff8114d0c8
0000000000000000
[83695.847057] ffffffff8114d0c8 0000000000000000 ffffffff81f31bc8
0000000000000009
[83695.847059] ffff8801d8317c50 ffffffff810ea0a5 0000000000000008
0000000000000000
[83695.847061] Call Trace:
[83695.847065] [<ffffffff8114d0c8>] ? page_alloc_cpu_notify+0x41/0x41
[83695.847067] [<ffffffff8114d0c8>] ? page_alloc_cpu_notify+0x41/0x41
[83695.847068] [<ffffffff810ea0a5>] on_each_cpu_mask+0x28/0x48
[83695.847070] [<ffffffff8114d5f2>] drain_all_pages+0x94/0xbb
[83695.847073] [<ffffffff8114fa07>] __alloc_pages_nodemask+0x5f3/0x8b0
[83695.847075] [<ffffffff81173b64>] ? __page_set_anon_rmap+0x31/0x7d
[83695.847079] [<ffffffff8118a919>] __alloc_pages_node.isra.57+0x12/0x14
[83695.847081] [<ffffffff8118ae09>] khugepaged+0xcc/0x10d1
[83695.847084] [<ffffffff810bdebe>] ? finish_wait+0x62/0x62
[83695.847086] [<ffffffff8118ad3d>] ? maybe_pmd_mkwrite+0x1a/0x1a
[83695.847089] [<ffffffff810a8f93>] kthread+0xa5/0xad
[83695.847093] [<ffffffff81766492>] ret_from_fork+0x22/0x40
[83695.847094] [<ffffffff810a8eee>] ? init_completion+0x24/0x24
[83695.847095] Code: 74 2d 48 89 de 89 c7 e8 f8 f9 ff ff 3b 05 4e c1 c3 00 7d
1b 48 63 c8 49 8b 14 24 48 03 14 cd c0 4a d2 81 f6 42 18 01 74 04 f3 90 <eb> f6
eb d3 48 83 c4 18 5b 41 5c 41 5d 41 5e 41 5f 5d c3 66 66
But looking at all the active CPUs, there's always one with a backtrace like
this:
[83696.530737] NMI backtrace for cpu 5
[83696.530737] CPU: 5 PID: 0 Comm: swapper/5 Tainted: G I L 4.6.0+
#2
[83696.530739] Hardware name: Gigabyte Technology Co., Ltd.
EX58-UD3R/EX58-UD3R, BIOS FB 05/04/2009
[83696.530740] task: ffff8801d8a3b000 ti: ffff8801d8a50000 task.ti:
ffff8801d8a50000
[83696.530741] RIP: 0010:[<ffffffff810bff24>] [<ffffffff810bff24>]
queued_spin_lock_slowpath+0x59/0x173
[83696.530742] RSP: 0018:ffff8801dfd43b98 EFLAGS: 00000002
[83696.530743] RAX: 0000000000000101 RBX: 0000000000000086 RCX:
0000000000000101
[83696.530744] RDX: 0000000000000100 RSI: 0000000000000001 RDI:
ffff8801d75ff108
[83696.530745] RBP: ffff8801dfd43b98 R08: 0000000000000001 R09:
ffff8801dfd43d53
[83696.530746] R10: ffff8801dfd43d27 R11: 0000000000000000 R12:
ffff8801d75ff108
[83696.530747] R13: ffff8801d6ede000 R14: ffff8801d75ff000 R15:
ffff8801d528d800
[83696.530748] FS: 0000000000000000(0000) GS:ffff8801dfd40000(0000)
knlGS:0000000000000000
[83696.530749] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[83696.530750] CR2: 00000000004330c1 CR3: 0000000001c07000 CR4:
00000000000006e0
[83696.530751] Stack:
[83696.530752] ffff8801dfd43bb0 ffffffff817661b4 0000000000000029
ffff8801dfd43c00
[83696.530753] ffffffffc03489b7 0000000000000010 ffff8801d7707400
ffff8801d528d700
[83696.530754] ffff8801d75ff000 0000000000000029 ffff8801d6ede000
0000000000000029
[83696.530755] Call Trace:
[83696.530756] <IRQ> d [<ffffffff817661b4>] _raw_spin_lock_irqsave+0x23/0x29
[83696.530757] [<ffffffffc03489b7>] nvkm_fantog_update+0x43/0x103 [nouveau]
[83696.530758] [<ffffffffc0348ac9>] nvkm_fantog_set+0x38/0x3f [nouveau]
[83696.530759] [<ffffffffc034818d>] nvkm_fan_update+0x12c/0x1a7 [nouveau]
[83696.530760] [<ffffffffc0348255>] nvkm_therm_fan_set+0x19/0x1b [nouveau]
[83696.530761] [<ffffffffc0347c5a>] nvkm_therm_update+0x223/0x230 [nouveau]
[83696.530762] [<ffffffffc0347c7c>] nvkm_therm_alarm+0x15/0x17 [nouveau]
[83696.530763] [<ffffffffc034a837>] nvkm_timer_alarm_trigger+0xde/0xf6
[nouveau]
[83696.530764] [<ffffffffc034a943>] nvkm_timer_alarm+0xaa/0xb3 [nouveau]
[83696.530765] [<ffffffffc0348a5c>] nvkm_fantog_update+0xe8/0x103 [nouveau]
[83696.530766] [<ffffffffc0348a8f>] nvkm_fantog_alarm+0x18/0x1a [nouveau]
[83696.530767] [<ffffffffc034a837>] nvkm_timer_alarm_trigger+0xde/0xf6
[nouveau]
[83696.530768] [<ffffffffc034abb3>] nv04_timer_intr+0x39/0x9f [nouveau]
[83696.530769] [<ffffffffc034a71f>] nvkm_timer_intr+0x14/0x16 [nouveau]
[83696.530770] [<ffffffffc03115d8>] nvkm_subdev_intr+0x17/0x19 [nouveau]
[83696.530771] [<ffffffffc033f5fb>] nvkm_mc_intr+0x81/0xd2 [nouveau]
[83696.530772] [<ffffffffc0342efa>] nvkm_pci_intr+0x4a/0x5c [nouveau]
[83696.530773] [<ffffffff810cbc26>] handle_irq_event_percpu+0x6c/0x196
[83696.530773] [<ffffffff810cbd7b>] handle_irq_event+0x2b/0x4b
[83696.530774] [<ffffffff810ce999>] handle_edge_irq+0xa6/0xc3
[83696.530775] [<ffffffff8105841b>] handle_irq+0x109/0x111
[83696.530776] [<ffffffff8176859b>] do_IRQ+0x4b/0xba
[83696.530777] [<ffffffff81766bbf>] common_interrupt+0x7f/0x7f
[83696.530778] <EOI> d [<ffffffff8157911f>] ? cpuidle_enter_state+0x103/0x15b
[83696.530779] [<ffffffff815791a3>] cpuidle_enter+0x17/0x19
[83696.530780] [<ffffffff810be61e>] cpu_startup_entry+0x192/0x1fd
[83696.530781] [<ffffffff8106f2ff>] start_secondary+0xe0/0xe3
[83696.530782] Code: ff ff 75 33 83 fe 01 89 ca 89 f0 41 0f 45 d0 f0 0f b1 17
39 f0 74 04 89 c6 eb e1 ff ca 0f 84 20 01 00 00 8b 07 84 c0 74 04 f3 90 <eb> f6
66 c7 07 01 00 e9 0c 01 00 00 48 c7 c0 40 65 01 00 65 48
Since CPU 5 is in handle_irq, I assume that path takes some lock which
basically prevents the system from proceeding (except for NMIs). I can ssh into
the system, and even run some things (like these traces), but khugepaged is at
100%, and some other processes tend to hang. My VBIOS is to follow shortly.
The last time I debugged this, I determined that it could happen if some time
interval we computed turned out to be 0, causing the timer alarm to be executed
directly from the interrupt.</pre>
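<pre>To make the suspected sequence concrete, here is a rough user-space model of it
in Python. It is only a sketch and not the nouveau code: ModelSpinLock,
ModelTimer, ModelFanToggle and try_update are hypothetical names, the
therm/fan call chain from the trace is collapsed into a single callback, and
both "the lock is still held while the alarm is scheduled" and "the interval
becomes 0 through integer truncation" are assumptions. Where a real spinlock
would spin forever, the model raises an exception instead.

```python
class ReentryError(RuntimeError):
    """Raised at the point where a real spinlock would spin forever."""


class ModelSpinLock:
    """Non-recursive lock; a second acquire reports the hang instead of spinning."""

    def __init__(self):
        self.held = False

    def __enter__(self):
        if self.held:
            raise ReentryError("lock re-acquired while already held")
        self.held = True
        return self

    def __exit__(self, exc_type, exc, tb):
        self.held = False
        return False  # never swallow exceptions


class ModelTimer:
    """Runs an alarm at once when its delay is 0, otherwise queues it."""

    def __init__(self):
        self.pending = []

    def alarm(self, delay_ns, callback):
        if delay_ns == 0:
            callback()  # fires in the caller's context
        else:
            self.pending.append((delay_ns, callback))


class ModelFanToggle:
    """Hypothetical stand-in for the fan toggle state seen in the trace."""

    def __init__(self, timer, period_us):
        self.timer = timer
        self.period_us = period_us
        self.lock = ModelSpinLock()
        self.percent = 0

    def update(self, percent):
        with self.lock:
            self.percent = percent
            # Assumption: integer division, so a small product truncates to 0.
            next_change_us = (percent * self.period_us) // 100
            # Assumption: the lock is still held while the alarm is scheduled.
            self.timer.alarm(next_change_us * 1000, lambda: self.update(percent))


def try_update(percent, period_us):
    """Return "queued" for a normal interval, "deadlock" for a re-entrant one."""
    fan = ModelFanToggle(ModelTimer(), period_us)
    try:
        fan.update(percent)
    except ReentryError:
        return "deadlock"
    return "queued"
```

With these assumptions, try_update(50, 1000) queues an alarm 500 us out, while
try_update(1, 50) computes a 0 interval, runs the callback immediately, and
re-enters update() with the lock held, which is the shape of the nested
nvkm_fantog_update frames in the CPU 5 trace.</pre>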
</div>
</p>
</body>
</html>