i915: intel_gt_reset() deadlock

Sergey Senozhatsky senozhatsky at chromium.org
Mon Oct 28 04:46:21 UTC 2024


Hi,

I'm currently looking at an i915 deadlock report. The report
is for 5.15, but I don't see any significant difference in the
linux-next code, so it looks relevant to current upstream
code.

Basically, intel_gt_reset() grabs gt->reset.mutex and then sleeps
in flush_work() (reached via __cancel_work_timer() from
intel_guc_submission_reset_prepare()).  The intel_wedge_me worker,
meanwhile, cannot make progress because it sleeps on gt->reset.mutex
in intel_gt_set_wedged().
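
To make the pattern concrete, here is a minimal userspace sketch using
pthreads. It is not i915 code: reset_mutex, wedge_worker(),
flush_with_timeout() and reset_path_stalls() are hypothetical names, and
the model assumes the work being flushed is the same work that needs the
mutex. The flush uses a timeout only so that the demo terminates; the
kernel's flush_work() has no such timeout.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

/* Stand-in for gt->reset.mutex (hypothetical userspace model). */
static pthread_mutex_t reset_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Completion used to emulate "has the work item finished?". */
static pthread_mutex_t done_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t done_cond = PTHREAD_COND_INITIALIZER;
static bool work_done;

/* Models the worker: it needs reset_mutex before it can complete. */
static void *wedge_worker(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&reset_mutex);	/* blocks while the reset path holds it */
	pthread_mutex_unlock(&reset_mutex);

	pthread_mutex_lock(&done_lock);
	work_done = true;
	pthread_cond_signal(&done_cond);
	pthread_mutex_unlock(&done_lock);
	return NULL;
}

/* Models flush_work(), bounded by a timeout. Returns true if the work finished. */
static bool flush_with_timeout(long timeout_ms)
{
	struct timespec deadline;
	bool finished;
	int err = 0;

	clock_gettime(CLOCK_REALTIME, &deadline);
	deadline.tv_sec += timeout_ms / 1000;
	deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
	if (deadline.tv_nsec >= 1000000000L) {
		deadline.tv_sec++;
		deadline.tv_nsec -= 1000000000L;
	}

	pthread_mutex_lock(&done_lock);
	while (!work_done && err != ETIMEDOUT)
		err = pthread_cond_timedwait(&done_cond, &done_lock, &deadline);
	finished = work_done;
	pthread_mutex_unlock(&done_lock);
	return finished;
}

/* Models the reset path: take the mutex, then wait for the worker. */
bool reset_path_stalls(void)
{
	pthread_t worker;
	bool finished;
	int rc;

	work_done = false;
	pthread_mutex_lock(&reset_mutex);
	rc = pthread_create(&worker, NULL, wedge_worker, NULL);
	assert(rc == 0);
	(void)rc;

	/* The worker cannot finish while we hold reset_mutex, so this times out. */
	finished = flush_with_timeout(200);

	/* Only the demo can back out like this; release the lock so the worker exits. */
	pthread_mutex_unlock(&reset_mutex);
	pthread_join(worker, NULL);
	return !finished;
}
```

Calling reset_path_stalls() from a main() and building with -pthread
should return true: the flush never sees the worker complete, because
the worker is waiting for the very mutex the flusher holds.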

INFO: task kworker/2:1:68 blocked for more than 122 seconds.
Tainted: G U W 5.15.135-lockdep-20721-gd26c0f5bff55 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:kworker/2:1 state:D stack:26184 pid: 68 ppid: 2 flags:0x00004000
Workqueue: events intel_wedge_me
Call Trace:
<TASK>
__schedule+0xe1b/0x3bae
schedule+0xc8/0x247
schedule_preempt_disabled+0x18/0x28
__mutex_lock_common+0x99f/0x1532
mutex_lock_nested+0x20/0x2a
intel_gt_set_wedged+0xbf/0x122
process_one_work+0x8f0/0x157c
worker_thread+0x4c2/0xa4a
kthread+0x32b/0x442
ret_from_fork+0x1f/0x30
</TASK>

INFO: task kworker/2:1H:156 blocked for more than 122 seconds.
Tainted: G U W 5.15.135-lockdep-20721-gd26c0f5bff55 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:kworker/2:1H state:D stack:27112 pid: 156 ppid: 2 flags:0x00004000
Workqueue: events_highpri heartbeat
Call Trace:
<TASK>
__schedule+0xe1b/0x3bae
schedule+0xc8/0x247
schedule_timeout+0x15e/0x215
do_wait_for_common+0x2d3/0x3f9
wait_for_completion+0x51/0x5d
__flush_work+0xd9/0x131
__cancel_work_timer+0x247/0x544
intel_guc_submission_reset_prepare+0xbf/0xb01
intel_uc_reset_prepare+0x11c/0x1e0
reset_prepare+0x35/0x20d
intel_gt_reset+0x3c3/0xa3d
intel_gt_handle_error+0xb4b/0xf24
heartbeat+0xaa7/0xce5
process_one_work+0x8f0/0x157c
worker_thread+0x4c2/0xa4a
kthread+0x32b/0x442
ret_from_fork+0x1f/0x30
</TASK>

