[Intel-gfx] [PATCH] semaphore: might_lock(->pi_lock) in up()
Daniel Vetter
daniel.vetter at ffwll.ch
Tue Mar 20 07:29:53 UTC 2018
In one of our bugs we very rarely see a lockdep splat on boot-up. The
locking loop only happens when the console_sem semaphore is contended
in a specific path, which doesn't happen all too often. Help out
lockdep a bit with a might_lock annotation.
For reference, the full splat:
WARNING: CPU: 0 PID: 216 at lib/debugobjects.c:291 debug_print_object+0x67/0x80
---[ end trace 1dcad5e101ee43df ]---
======================================================
WARNING: possible circular locking dependency detected
4.16.0-rc5-g178cfb9373cc-drmtip_2+ #1 Not tainted
------------------------------------------------------
systemd-udevd/216 is trying to acquire lock:
((console_sem).lock){..-.}, at: [<0000000073293a25>] down_trylock+0xa/0x30
but task is already holding lock:
(&obj_hash[i].lock){-.-.}, at: [<00000000023c4e37>] debug_check_no_obj_freed+0xa4/0x230
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #3 (&obj_hash[i].lock){-.-.}:
__debug_object_init+0x5f/0x430
hrtimer_init+0x1b/0x150
init_dl_task_timer+0x17/0x30
__sched_fork.isra.16+0xa7/0x100
init_idle+0x53/0x270
sched_init+0x42f/0x4a5
start_kernel+0x20e/0x3f0
secondary_startup_64+0xa5/0xb0
-> #2 (&rq->lock){-.-.}:
task_fork_fair+0x36/0x120
sched_fork+0x12f/0x2a0
copy_process.part.7+0x5e4/0x1d90
_do_fork+0xc0/0x6b0
kernel_thread+0x20/0x30
rest_init+0x1d/0x220
start_kernel+0x3e8/0x3f0
secondary_startup_64+0xa5/0xb0
-> #1 (&p->pi_lock){-.-.}:
try_to_wake_up+0x27/0x610
up+0x3b/0x50
__up_console_sem+0x2e/0x50
console_unlock+0x31f/0x650
do_con_write.part.12+0x6fd/0x9a0
con_write+0x4d/0x60
do_output_char+0x168/0x1d0
n_tty_write+0x1f8/0x450
tty_write+0x1b6/0x2e0
do_iter_write+0x140/0x180
vfs_writev+0x84/0xf0
do_writev+0x4c/0xd0
do_syscall_64+0x65/0x1a0
entry_SYSCALL_64_after_hwframe+0x42/0xb7
-> #0 ((console_sem).lock){..-.}:
_raw_spin_lock_irqsave+0x33/0x50
down_trylock+0xa/0x30
__down_trylock_console_sem+0x20/0x80
console_trylock+0xe/0x60
vprintk_emit+0x24f/0x4f0
printk+0x3e/0x46
__warn_printk+0x37/0x70
debug_print_object+0x67/0x80
debug_check_no_obj_freed+0x1ef/0x230
__vunmap+0x5b/0x110
load_module+0x2347/0x2dc0
SyS_finit_module+0xa5/0xe0
do_syscall_64+0x65/0x1a0
entry_SYSCALL_64_after_hwframe+0x42/0xb7
other info that might help us debug this:
Chain exists of:
(console_sem).lock --> &rq->lock --> &obj_hash[i].lock
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(&obj_hash[i].lock);
lock(&rq->lock);
lock(&obj_hash[i].lock);
lock((console_sem).lock);
*** DEADLOCK ***
1 lock held by systemd-udevd/216:
#0: (&obj_hash[i].lock){-.-.}, at: [<00000000023c4e37>] debug_check_no_obj_freed+0xa4/0x230
stack backtrace:
CPU: 0 PID: 216 Comm: systemd-udevd Not tainted 4.16.0-rc5-g178cfb9373cc-drmtip_2+ #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.1-0-g8891697-prebuilt.qemu-project.org 04/01/2014
Call Trace:
dump_stack+0x5f/0x86
print_circular_bug.isra.18+0x1d0/0x2c0
__lock_acquire+0x14ae/0x1b60
? lock_acquire+0xaf/0x200
lock_acquire+0xaf/0x200
? down_trylock+0xa/0x30
? vprintk_emit+0x24f/0x4f0
_raw_spin_lock_irqsave+0x33/0x50
? down_trylock+0xa/0x30
down_trylock+0xa/0x30
__down_trylock_console_sem+0x20/0x80
console_trylock+0xe/0x60
vprintk_emit+0x24f/0x4f0
printk+0x3e/0x46
? clock_was_set_work+0x20/0x20
__warn_printk+0x37/0x70
? debug_check_no_obj_freed+0xa4/0x230
? clock_was_set_work+0x20/0x20
debug_print_object+0x67/0x80
debug_check_no_obj_freed+0x1ef/0x230
__vunmap+0x5b/0x110
load_module+0x2347/0x2dc0
? show_coresize+0x20/0x20
? vfs_read+0x126/0x150
? SyS_finit_module+0xa5/0xe0
SyS_finit_module+0xa5/0xe0
do_syscall_64+0x65/0x1a0
entry_SYSCALL_64_after_hwframe+0x42/0xb7
RIP: 0033:0x7f5037e42839
RSP: 002b:00007fff88b69df8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
RAX: ffffffffffffffda RBX: 00005599ad713550 RCX: 00007f5037e42839
RDX: 0000000000000000 RSI: 00007f5037b210e5 RDI: 000000000000000f
RBP: 00007f5037b210e5 R08: 0000000000000000 R09: 00007fff88b69f10
R10: 000000000000000f R11: 0000000000000246 R12: 0000000000000000
R13: 00005599ad707370 R14: 0000000000020000 R15: 00005599ad713550
References: https://bugs.freedesktop.org/show_bug.cgi?id=105600
Signed-off-by: Daniel Vetter <daniel.vetter at intel.com>
---
For now this is just a quick test patch for CI. I'll resubmit once we
gathered some data.
-Daniel
---
kernel/locking/semaphore.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/kernel/locking/semaphore.c b/kernel/locking/semaphore.c
index 561acdd39960..6a9a671cef33 100644
--- a/kernel/locking/semaphore.c
+++ b/kernel/locking/semaphore.c
@@ -181,6 +181,9 @@ void up(struct semaphore *sem)
unsigned long flags;
raw_spin_lock_irqsave(&sem->lock, flags);
+ /* if we wake up a task we need its ->pi_lock, just pretend to take ours
+ * for the lockdep accounting. */
+ might_lock(¤t->pi_lock);
if (likely(list_empty(&sem->wait_list)))
sem->count++;
else
--
2.16.2
More information about the Intel-gfx
mailing list