[PATCH v2] drm/i915: Fix ref->mutex deadlock in i915_active_wait()

Sultan Alsawaf sultan at kerneltoast.com
Tue Apr 14 14:35:18 UTC 2020


On Tue, Apr 14, 2020 at 09:15:07AM +0100, Chris Wilson wrote:
> The patch does not fix a deadlock. Greg, this patch is not a backport of
> a bugfix, why is it in stable?
> -Chris

Here's the deadlock this supposedly doesn't fix:
INFO: task kswapd0:178 blocked for more than 122 seconds.
      Tainted: G     U            5.4.28-00014-gd1e04f91d2c5 #4
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kswapd0         D    0   178      2 0x80004000
Call Trace:
 ? __schedule+0x2f3/0x750
 schedule+0x39/0xa0
 schedule_preempt_disabled+0xa/0x10
 __mutex_lock.isra.0+0x19b/0x500
 ? i915_request_wait+0x25b/0x370
 active_retire+0x26/0x30
 i915_active_wait+0xa3/0x1a0
 i915_vma_unbind+0xe2/0x1c0
 i915_gem_object_unbind+0x111/0x140
 i915_gem_shrink+0x21b/0x530
 i915_gem_shrinker_scan+0xfd/0x120
 do_shrink_slab+0x154/0x2c0
 shrink_slab+0xd0/0x2f0
 shrink_node+0xdf/0x420
 balance_pgdat+0x2e3/0x540
 kswapd+0x200/0x3c0
 ? __wake_up_common_lock+0xc0/0xc0
 kthread+0xfb/0x130
 ? balance_pgdat+0x540/0x540
 ? __kthread_parkme+0x60/0x60
 ret_from_fork+0x1f/0x40
INFO: task kworker/u32:5:222 blocked for more than 122 seconds.
      Tainted: G     U            5.4.28-00014-gd1e04f91d2c5 #4
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kworker/u32:5   D    0   222      2 0x80004000
Workqueue: i915 idle_work_handler
Call Trace:
 ? __schedule+0x2f3/0x750
 schedule+0x39/0xa0
 schedule_preempt_disabled+0xa/0x10
 __mutex_lock.isra.0+0x19b/0x500
 idle_work_handler+0x34/0x120
 process_one_work+0x1ea/0x3a0
 worker_thread+0x4d/0x3f0
 kthread+0xfb/0x130
 ? process_one_work+0x3a0/0x3a0
 ? __kthread_parkme+0x60/0x60
 ret_from_fork+0x1f/0x40
INFO: task mpv:1535 blocked for more than 122 seconds.
      Tainted: G     U            5.4.28-00014-gd1e04f91d2c5 #4
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
mpv             D    0  1535      1 0x00000000
Call Trace:
 ? __schedule+0x2f3/0x750
 schedule+0x39/0xa0
 schedule_preempt_disabled+0xa/0x10
 __mutex_lock.isra.0+0x19b/0x500
 __i915_gem_free_objects+0x68/0x190
 i915_gem_create_ioctl+0x18/0x30
 ? i915_gem_dumb_create+0xa0/0xa0
 drm_ioctl_kernel+0xb2/0x100
 drm_ioctl+0x209/0x360
 ? i915_gem_dumb_create+0xa0/0xa0
 do_vfs_ioctl+0x43f/0x6c0
 ksys_ioctl+0x5e/0x90
 __x64_sys_ioctl+0x16/0x20
 do_syscall_64+0x4e/0x140
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7fb49f1b32eb
Code: Bad RIP value.
RSP: 002b:00007ffef9eb0948 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00007ffef9eb09c0 RCX: 00007fb49f1b32eb
RDX: 00007ffef9eb09c0 RSI: 00000000c010645b RDI: 0000000000000008
RBP: 00000000c010645b R08: 000055fdb80c1370 R09: 000055fdb80c14e0
R10: 0000000000000000 R11: 0000000000000246 R12: 00007fb4781e56b0
R13: 0000000000000008 R14: 00007fb4781e5560 R15: 00007fb4781e56b0
INFO: task kswapd0:178 blocked for more than 245 seconds.
      Tainted: G     U            5.4.28-00014-gd1e04f91d2c5 #4
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kswapd0         D    0   178      2 0x80004000
Call Trace:
 ? __schedule+0x2f3/0x750
 schedule+0x39/0xa0
 schedule_preempt_disabled+0xa/0x10
 __mutex_lock.isra.0+0x19b/0x500
 ? i915_request_wait+0x25b/0x370
 active_retire+0x26/0x30
 i915_active_wait+0xa3/0x1a0
 i915_vma_unbind+0xe2/0x1c0
 i915_gem_object_unbind+0x111/0x140
 i915_gem_shrink+0x21b/0x530
 i915_gem_shrinker_scan+0xfd/0x120
 do_shrink_slab+0x154/0x2c0
 shrink_slab+0xd0/0x2f0
 shrink_node+0xdf/0x420
 balance_pgdat+0x2e3/0x540
 kswapd+0x200/0x3c0
 ? __wake_up_common_lock+0xc0/0xc0
 kthread+0xfb/0x130
 ? balance_pgdat+0x540/0x540
 ? __kthread_parkme+0x60/0x60
 ret_from_fork+0x1f/0x40
INFO: task kworker/u32:5:222 blocked for more than 245 seconds.
      Tainted: G     U            5.4.28-00014-gd1e04f91d2c5 #4
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kworker/u32:5   D    0   222      2 0x80004000
Workqueue: i915 idle_work_handler
Call Trace:
 ? __schedule+0x2f3/0x750
 schedule+0x39/0xa0
 schedule_preempt_disabled+0xa/0x10
 __mutex_lock.isra.0+0x19b/0x500
 idle_work_handler+0x34/0x120
 process_one_work+0x1ea/0x3a0
 worker_thread+0x4d/0x3f0
 kthread+0xfb/0x130
 ? process_one_work+0x3a0/0x3a0
 ? __kthread_parkme+0x60/0x60
 ret_from_fork+0x1f/0x40
INFO: task mpv:1535 blocked for more than 245 seconds.
      Tainted: G     U            5.4.28-00014-gd1e04f91d2c5 #4
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
mpv             D    0  1535      1 0x00000000
Call Trace:
 ? __schedule+0x2f3/0x750
 schedule+0x39/0xa0
 schedule_preempt_disabled+0xa/0x10
 __mutex_lock.isra.0+0x19b/0x500
 __i915_gem_free_objects+0x68/0x190
 i915_gem_create_ioctl+0x18/0x30
 ? i915_gem_dumb_create+0xa0/0xa0
 drm_ioctl_kernel+0xb2/0x100
 drm_ioctl+0x209/0x360
 ? i915_gem_dumb_create+0xa0/0xa0
 do_vfs_ioctl+0x43f/0x6c0
 ksys_ioctl+0x5e/0x90
 __x64_sys_ioctl+0x16/0x20
 do_syscall_64+0x4e/0x140
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7fb49f1b32eb
Code: Bad RIP value.
RSP: 002b:00007ffef9eb0948 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00007ffef9eb09c0 RCX: 00007fb49f1b32eb
RDX: 00007ffef9eb09c0 RSI: 00000000c010645b RDI: 0000000000000008
RBP: 00000000c010645b R08: 000055fdb80c1370 R09: 000055fdb80c14e0
R10: 0000000000000000 R11: 0000000000000246 R12: 00007fb4781e56b0
R13: 0000000000000008 R14: 00007fb4781e5560 R15: 00007fb4781e56b0

Deadlocked inside the shrinker, and very easy to reproduce.

Sultan

