nouveau_fan_update: possible circular locking dependency detected

Ilia Mirkin imirkin at alum.mit.edu
Thu Mar 13 06:38:45 PDT 2014


On Sun, Mar 9, 2014 at 10:51 AM, Marcin Slusarz
<marcin.slusarz at gmail.com> wrote:
> [  326.168487] ======================================================
> [  326.168491] [ INFO: possible circular locking dependency detected ]
> [  326.168496] 3.13.6 #1270 Not tainted
> [  326.168500] -------------------------------------------------------
> [  326.168504] ldconfig/22297 is trying to acquire lock:
> [  326.168507]  (&(&priv->fan->lock)->rlock){-.-...}, at: [<ffffffffa00d5363>] nouveau_fan_update+0xeb/0x252 [nouveau]
> [  326.168551]
> but task is already holding lock:
> [  326.168555]  (&(&priv->sensor.alarm_program_lock)->rlock){-.-...}, at: [<ffffffffa00d6a8a>] alarm_timer_callback+0xf1/0x179 [nouveau]
> [  326.168587]
> which lock already depends on the new lock.
>
> [  326.168592]
> the existing dependency chain (in reverse order) is:
> [  326.168596]
> -> #1 (&(&priv->sensor.alarm_program_lock)->rlock){-.-...}:
> [  326.168606]        [<ffffffff900a5656>] lock_acquire+0xce/0x117
> [  326.168615]        [<ffffffff905a367e>] _raw_spin_lock_irqsave+0x3f/0x51
> [  326.168623]        [<ffffffffa00d6a8a>] alarm_timer_callback+0xf1/0x179 [nouveau]
> [  326.168651]        [<ffffffffa00da895>] nv04_timer_alarm_trigger+0x1b1/0x1cb [nouveau]
> [  326.168679]        [<ffffffffa00daaa0>] nv04_timer_alarm+0xb5/0xbe [nouveau]
> [  326.168708]        [<ffffffffa00d54ac>] nouveau_fan_update+0x234/0x252 [nouveau]
> [  326.168735]        [<ffffffffa00d54df>] nouveau_fan_alarm+0x15/0x17 [nouveau]
> [  326.168763]        [<ffffffffa00da895>] nv04_timer_alarm_trigger+0x1b1/0x1cb [nouveau]
> [  326.168790]        [<ffffffffa00da90a>] nv04_timer_intr+0x5b/0x13c [nouveau]
> [  326.168817]        [<ffffffffa00d0e9b>] nouveau_mc_intr+0x2e2/0x3b1 [nouveau]
> [  326.168838]        [<ffffffff900ad359>] handle_irq_event_percpu+0x5c/0x1dc
> [  326.168846]        [<ffffffff900ad515>] handle_irq_event+0x3c/0x5c
> [  326.168852]        [<ffffffff900afa5e>] handle_edge_irq+0xc4/0xeb
> [  326.168860]        [<ffffffff9003a828>] handle_irq+0x120/0x12d
> [  326.168868]        [<ffffffff9003a2f8>] do_IRQ+0x48/0xaf
> [  326.168873]        [<ffffffff905a41ef>] ret_from_intr+0x0/0x13
> [  326.168881]        [<ffffffff90040ea2>] arch_cpu_idle+0x13/0x1d
> [  326.168887]        [<ffffffff900acb2e>] cpu_startup_entry+0x140/0x218
> [  326.168895]        [<ffffffff9005b0a0>] start_secondary+0x1bf/0x1c4
> [  326.168902]
> -> #0 (&(&priv->fan->lock)->rlock){-.-...}:
> [  326.168913]        [<ffffffff900a49cc>] __lock_acquire+0x10be/0x182b
> [  326.168920]        [<ffffffff900a5656>] lock_acquire+0xce/0x117
> [  326.168924]        [<ffffffff905a367e>] _raw_spin_lock_irqsave+0x3f/0x51
> [  326.168931]        [<ffffffffa00d5363>] nouveau_fan_update+0xeb/0x252 [nouveau]
> [  326.168958]        [<ffffffffa00d5508>] nouveau_therm_fan_set+0x14/0x16 [nouveau]
> [  326.168984]        [<ffffffffa00d4c6b>] nouveau_therm_update+0x303/0x312 [nouveau]
> [  326.169011]        [<ffffffffa00d4c8d>] nouveau_therm_alarm+0x13/0x15 [nouveau]
> [  326.169038]        [<ffffffffa00da895>] nv04_timer_alarm_trigger+0x1b1/0x1cb [nouveau]
> [  326.169059]        [<ffffffffa00daaa0>] nv04_timer_alarm+0xb5/0xbe [nouveau]
> [  326.169079]        [<ffffffffa00d6af7>] alarm_timer_callback+0x15e/0x179 [nouveau]
> [  326.169101]        [<ffffffffa00da895>] nv04_timer_alarm_trigger+0x1b1/0x1cb [nouveau]
> [  326.169121]        [<ffffffffa00da90a>] nv04_timer_intr+0x5b/0x13c [nouveau]
> [  326.169142]        [<ffffffffa00d0e9b>] nouveau_mc_intr+0x2e2/0x3b1 [nouveau]
> [  326.169160]        [<ffffffff900ad359>] handle_irq_event_percpu+0x5c/0x1dc
> [  326.169165]        [<ffffffff900ad515>] handle_irq_event+0x3c/0x5c
> [  326.169170]        [<ffffffff900afa5e>] handle_edge_irq+0xc4/0xeb
> [  326.169175]        [<ffffffff9003a828>] handle_irq+0x120/0x12d
> [  326.169179]        [<ffffffff9003a2f8>] do_IRQ+0x48/0xaf
> [  326.169183]        [<ffffffff905a41ef>] ret_from_intr+0x0/0x13
> [  326.169189]
> other info that might help us debug this:
>
> [  326.169193]  Possible unsafe locking scenario:
>
> [  326.169195]        CPU0                    CPU1
> [  326.169197]        ----                    ----
> [  326.169199]   lock(&(&priv->sensor.alarm_program_lock)->rlock);
> [  326.169205]                                lock(&(&priv->fan->lock)->rlock);
> [  326.169211]                                lock(&(&priv->sensor.alarm_program_lock)->rlock);
> [  326.169216]   lock(&(&priv->fan->lock)->rlock);
> [  326.169221]
>  *** DEADLOCK ***
>
>  [  326.169225] 1 lock held by ldconfig/22297:
>  [  326.169229]  #0:  (&(&priv->sensor.alarm_program_lock)->rlock){-.-...}, at: [<ffffffffa00d6a8a>] alarm_timer_callback+0xf1/0x179 [nouveau]
>  [  326.169253]
>  stack backtrace:
>  [  326.169258] CPU: 7 PID: 22297 Comm: ldconfig Not tainted 3.13.6 #1270
>  [  326.169260] Hardware name: System manufacturer System Product Name/P6T SE, BIOS 0603    09/02/2009
>  [  326.169264]  ffffffff90fb6360 ffff8801bfdc3a38 ffffffff9059e369 0000000000000006
>  [  326.169273]  ffffffff90fb61b0 ffff8801bfdc3a88 ffffffff905998cf 0000000000000002
>  [  326.169282]  ffff8800b148dbe0 0000000000000001 ffff8800b148e1e0 0000000000000001
>  [  326.169342] Call Trace:
>  [  326.169344]  <IRQ>  [<ffffffff9059e369>] dump_stack+0x4e/0x71
>  [  326.169352]  [<ffffffff905998cf>] print_circular_bug+0x2ad/0x2be
>  [  326.169356]  [<ffffffff900a49cc>] __lock_acquire+0x10be/0x182b
>  [  326.169360]  [<ffffffff900a3273>] ? check_irq_usage+0x99/0xab
>  [  326.169365]  [<ffffffff900a5656>] lock_acquire+0xce/0x117
>  [  326.169384]  [<ffffffffa00d5363>] ? nouveau_fan_update+0xeb/0x252 [nouveau]
>  [  326.169388]  [<ffffffff905a367e>] _raw_spin_lock_irqsave+0x3f/0x51
>  [  326.169407]  [<ffffffffa00d5363>] ? nouveau_fan_update+0xeb/0x252 [nouveau]
>  [  326.169426]  [<ffffffffa00da871>] ? nv04_timer_alarm_trigger+0x18d/0x1cb [nouveau]
>  [  326.169445]  [<ffffffffa00d5363>] nouveau_fan_update+0xeb/0x252 [nouveau]
>  [  326.169465]  [<ffffffffa00d5508>] nouveau_therm_fan_set+0x14/0x16 [nouveau]
>  [  326.169483]  [<ffffffffa00d4c6b>] nouveau_therm_update+0x303/0x312 [nouveau]
>  [  326.169502]  [<ffffffffa00d4c8d>] nouveau_therm_alarm+0x13/0x15 [nouveau]
>  [  326.169521]  [<ffffffffa00da895>] nv04_timer_alarm_trigger+0x1b1/0x1cb [nouveau]
>  [  326.169541]  [<ffffffffa00daaa0>] nv04_timer_alarm+0xb5/0xbe [nouveau]
>  [  326.169560]  [<ffffffffa00d6af7>] alarm_timer_callback+0x15e/0x179 [nouveau]
>  [  326.169579]  [<ffffffffa00da895>] nv04_timer_alarm_trigger+0x1b1/0x1cb [nouveau]
>  [  326.169598]  [<ffffffffa00da90a>] nv04_timer_intr+0x5b/0x13c [nouveau]
>  [  326.169617]  [<ffffffffa00d0e9b>] nouveau_mc_intr+0x2e2/0x3b1 [nouveau]
>  [  326.169621]  [<ffffffff900ad359>] handle_irq_event_percpu+0x5c/0x1dc
>  [  326.169624]  [<ffffffff900ad515>] handle_irq_event+0x3c/0x5c
>  [  326.169628]  [<ffffffff900afa5e>] handle_edge_irq+0xc4/0xeb
>  [  326.169631]  [<ffffffff9003a828>] handle_irq+0x120/0x12d
>  [  326.169636]  [<ffffffff90073e8c>] ? irq_enter+0x13/0x64
>  [  326.169640]  [<ffffffff9003a2f8>] do_IRQ+0x48/0xaf
>  [  326.169644]  [<ffffffff905a41ef>] common_interrupt+0x6f/0x6f
>  [  326.169646]  <EOI>  [<ffffffff905a428d>] ? retint_swapgs+0xe/0x13

Marcin, how reproducible is this? What hardware was this on? If it's
reasonably reproducible perhaps it makes sense to file a bug in the
fd.o tracker?

Martin, I think this is in code you've written (right?). Perhaps you
can take a look? All that alarm/update/etc code that ends up
immediately dispatching itself seems like a locking nightmare...
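
For illustration only, here is a minimal userspace sketch of the kind of
ordering check that produces a report like the one quoted above. It is not
nouveau code and not how lockdep is implemented; the names order_tracker,
tracker_init, tracker_acquire and tracker_release are hypothetical, and the
two lock classes are simply named after the locks in the trace. It assumes
that, as the trace suggests, one path takes alarm_program_lock while
fan->lock is held and another takes fan->lock while alarm_program_lock is
held.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical lock classes, named after the two locks in the report. */
enum lock_class { ALARM_PROGRAM_LOCK, FAN_LOCK, NR_LOCK_CLASSES };

struct order_tracker {
	/* taken_under[h][c]: class c was once acquired while class h was held */
	bool taken_under[NR_LOCK_CLASSES][NR_LOCK_CLASSES];
	bool held[NR_LOCK_CLASSES];
};

void tracker_init(struct order_tracker *t)
{
	memset(t, 0, sizeof(*t));
}

/*
 * Record that class c is being acquired. Returns true if some currently
 * held class was itself once acquired while c was held, i.e. the two
 * classes have now been seen in both orders. Only direct two-lock
 * inversions are detected; longer cycles are out of scope for this sketch.
 */
bool tracker_acquire(struct order_tracker *t, enum lock_class c)
{
	bool inversion = false;

	assert(c < NR_LOCK_CLASSES);
	for (int h = 0; h < NR_LOCK_CLASSES; h++) {
		if (!t->held[h] || h == (int)c)
			continue;
		if (t->taken_under[c][h])
			inversion = true;
		t->taken_under[h][c] = true;
	}
	t->held[c] = true;
	return inversion;
}

void tracker_release(struct order_tracker *t, enum lock_class c)
{
	assert(c < NR_LOCK_CLASSES);
	t->held[c] = false;
}
```

Replaying the two chains in order (FAN_LOCK then ALARM_PROGRAM_LOCK, release
both, then ALARM_PROGRAM_LOCK then FAN_LOCK) makes the last tracker_acquire()
call return true. No second CPU is needed for the warning: seeing both
orders at any point is enough, which is why a callback that immediately
re-dispatches into another lock-taking callback is easy to get wrong.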

  -ilia

