[Intel-gfx] Lockdep splat on drm-tip

Maarten Lankhorst maarten.lankhorst at linux.intel.com
Mon May 3 12:39:41 UTC 2021


On 03-05-2021 at 13:57, Thomas Hellström wrote:
> Hi, Maarten,
>
> I saw this the other day while working on the TTM conversion:
>
> [ 5925.509765] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:928
> [ 5925.509769] in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 21608, name: kworker/2:1
> [ 5925.509772] INFO: lockdep is turned off.
> [ 5925.509775] irq event stamp: 0
> [ 5925.509777] hardirqs last  enabled at (0): [<0000000000000000>] 0x0
> [ 5925.509781] hardirqs last disabled at (0): [<ffffffff81073aa0>] copy_process+0x850/0x1c70
> [ 5925.509786] softirqs last  enabled at (0): [<ffffffff81073aa0>] copy_process+0x850/0x1c70
> [ 5925.509789] softirqs last disabled at (0): [<0000000000000000>] 0x0
> [ 5925.509792] Preemption disabled at:
> [ 5925.509793] [<ffffffff8107e2e3>] irq_enter_rcu+0x13/0x70
> [ 5925.509798] CPU: 2 PID: 21608 Comm: kworker/2:1 Tainted: G U  W         5.12.0-rc7-dmaresv-thellstr+ #3
> [ 5925.509803] Hardware name: Gigabyte Technology Co., Ltd. GB-Z390 Garuda/GB-Z390 Garuda-CF, BIOS IG1c 11/19/2019
> [ 5925.509807] Workqueue: events engine_retire [i915]
> [ 5925.509874] Call Trace:
> [ 5925.509876]  <IRQ>
> [ 5925.509878]  dump_stack+0x76/0x95
> [ 5925.509882]  ___might_sleep.cold+0xf2/0x103
> [ 5925.509887]  __might_sleep+0x4b/0x80
> [ 5925.509890]  __mutex_lock+0x5b/0x9b0
> [ 5925.509893]  ? lock_release+0x1ec/0x2b0
> [ 5925.509897]  ? debug_object_deactivate+0x137/0x160
> [ 5925.509902]  ? intel_context_post_unpin+0xb2/0x18c [i915]
> [ 5925.509960]  ? wake_up_var+0x37/0x40
> [ 5925.509964]  ? __active_retire+0x12f/0x210 [i915]
> [ 5925.510034]  mutex_lock_nested+0x1b/0x20
> [ 5925.510037]  ? i915_active_release+0x22/0x30 [i915]
> [ 5925.510105]  ? mutex_lock_nested+0x1b/0x20
> [ 5925.510108]  intel_context_post_unpin+0xb2/0x18c [i915]
> [ 5925.510166]  __intel_context_retire+0x26/0x74 [i915]
> [ 5925.510223]  __active_retire+0x11e/0x210 [i915]
> [ 5925.510291]  active_retire+0x2e/0x50 [i915]
> [ 5925.510357]  node_retire+0x23/0x30 [i915]
> [ 5925.510423]  signal_irq_work+0x318/0x6d0 [i915]
> [ 5925.510481]  irq_work_single+0x40/0x70
> [ 5925.510485]  irq_work_run_list+0x2a/0x40
> [ 5925.510488]  irq_work_run+0x2a/0x50
> [ 5925.510491]  __sysvec_irq_work+0x41/0x1b0
> [ 5925.510494]  sysvec_irq_work+0x93/0xb0
> [ 5925.510497]  </IRQ>
> [ 5925.510499]  asm_sysvec_irq_work+0x12/0x20
> [ 5925.510502] RIP: 0010:memchr_inv+0x44/0xd0
> [ 5925.510506] Code: 00 00 40 0f b6 ce 83 e8 01 49 b8 01 01 01 01 01 01 01 01 49 0f af c8 48 8d 44 c7 08 eb 09 48 83 c7 08 48 39 c7 74 64 48 39 0f <74> f2 48 8d 47 08 40 3a 37 75 7a 48 83 c7 01 48 39 f8 75 f2 31 c0
> [ 5925.510513] RSP: 0018:ffffc9000870fd68 EFLAGS: 00000246
> [ 5925.510516] RAX: ffffc90000bb7000 RBX: ffff88811f15e000 RCX: 5a5a5a5a5a5a5a5a
> [ 5925.510519] RDX: 0000000000001000 RSI: 000000000000005a RDI: ffffc90000bb61c0
> [ 5925.510522] RBP: ffffc9000870fd78 R08: 0101010101010101 R09: 0000000000000000
> [ 5925.510525] R10: ffff888105b15aec R11: 0000000000000018 R12: ffff88810980cd00
> [ 5925.510528] R13: ffff88811f15e078 R14: ffff88811f15e000 R15: ffff88811f15e000
> [ 5925.510534]  ? lrc_unpin+0x2f/0x50 [i915]
> [ 5925.510595]  intel_context_unpin+0x23/0xc0 [i915]
> [ 5925.510652]  i915_request_retire+0x21b/0x450 [i915]
> [ 5925.510722]  retire_requests+0x5b/0x80 [i915]
> [ 5925.510782]  engine_retire+0x68/0xa0 [i915]
> [ 5925.510841]  process_one_work+0x232/0x580
> [ 5925.510845]  worker_thread+0x50/0x3b0
> [ 5925.510849]  ? process_one_work+0x580/0x580
> [ 5925.510852]  kthread+0x143/0x180
> [ 5925.510854]  ? kthread_park+0x90/0x90
> [ 5925.510857]  ret_from_fork+0x1f/0x30
>
>
>
Weird backtrace. What is mutex_lock_nested doing inside the interrupt? Is the backtrace correct?

[ 5925.510037]  ? i915_active_release+0x22/0x30 [i915]
[ 5925.510105]  ? mutex_lock_nested+0x1b/0x20

Are you using PREEMPT_RT or something?


The backtrace doesn't seem to make sense as-is. I don't see any mutexes being used inside the unpin code. Maybe I'm missing something.

~Maarten


