[PATCH v4 0/9] drm/i915: PREEMPT_RT related fixups.
Sebastian Andrzej Siewior
bigeasy at linutronix.de
Thu Aug 21 11:13:48 UTC 2025
On 2025-07-21 14:06:48 [+0900], Romain Guyard wrote:
> Hello,
Hi,
> [ 2349.629427] Hardware name: ADLINK TECHNOLOGY Inc. -612X/-612X, BIOS
> [ 2349.629454] </TASK>
> [ 2412.634282] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
> [ 2412.634284] rcu: Tasks blocked on level-0 rcu_node (CPUs 0-15): P12083/1:b..l P12724/1:b..l P12725/1:b..l P4057/3:b..l
> [ 2412.634289] rcu: (detected by 14, t=147008 jiffies, g=355917, q=9582 ncpus=16)
> [ 2412.634290] task:Xorg state:D stack:0 pid:4057 tgid:4057 ppid:4055 task_flags:0x400100 flags:0x00004000
> [ 2412.634292] Call Trace:
> [ 2412.634293] <TASK>
> [ 2412.634295] __schedule+0x44c/0xad0
> [ 2412.634302] schedule_rtlock+0x25/0x40
> [ 2412.634303] rtlock_slowlock_locked+0x20d/0xe00
> [ 2412.634307] rt_spin_lock+0x7a/0xd0
> [ 2412.634309] execlists_submission_tasklet+0x143/0x14d0
> [ 2412.634354] tasklet_action_common+0xc1/0x230
> [ 2412.634356] handle_softirqs.constprop.0+0xce/0x280
> [ 2412.634358] __local_bh_enable_ip+0xa0/0xd0
> [ 2412.634359] i915_gem_do_execbuffer+0x1a73/0x2920
This blocks on a lock and waits to make progress. I have not found out
who is holding that lock, though.
…
> [ 2412.634511] </TASK>
> [ 2412.634511] task:kworker/14:1 state:R running task stack:0 pid:12083 tgid:12083 ppid:2 task_flags:0x4208060 flags:0x00004000
> [ 2412.634513] Workqueue: i915-unordered engine_retire
> [ 2412.634515] Call Trace:
> [ 2412.634516] <TASK>
> [ 2412.634516] __schedule+0x44c/0xad0
> [ 2412.634520] preempt_schedule_common+0x31/0x80
> [ 2412.634521] preempt_schedule_thunk+0x16/0x30
> [ 2412.634523] migrate_enable+0xe6/0x100
> [ 2412.634525] rt_spin_unlock+0x12/0x40
> [ 2412.634526] remove_from_engine+0x76/0xc0
> [ 2412.634528] i915_request_retire.part.0+0x7c/0x220
> [ 2412.634530] engine_retire+0xc3/0x100
> [ 2412.634531] process_one_work+0x166/0x390
> [ 2412.634533] worker_thread+0x29d/0x3c0
This might be the one. The task is in the running state, so I don't
understand what is holding the scheduler back from putting it back on
the CPU. There is at least one idle CPU available, but even though this
workqueue is called i915-unordered, the work must complete on the same
CPU (it can't migrate). So what is CPU14 doing? It should schedule
something and not be idle.
> Looks like there are some i915 locking stuff in those BTs.
>
> I am not very knowledgeable about i915 and RT, so my help is quite limited,
> but since this is easily reproduced (always crash or hangs after <1H), I can
> try things.
I don't know what you can retrieve from the kdump but CPU14 should be
spinning on something I guess. RCU complains about not making progress.
If RCU-boost is enabled then the kworker should have one more reason to
be on the CPU.
Could you try v6.17-rc? I didn't add anything i915 related.
Could you please enable CONFIG_PROVE_LOCKING,
CONFIG_DEBUG_ATOMIC_SLEEP and check if the kernel complains? Maybe there
is something new I haven't noticed.
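For reference, a minimal sketch of the matching .config fragment. Both
options depend on CONFIG_DEBUG_KERNEL, which is listed here on the
assumption that it is not already set; running `make olddefconfig`
afterwards should pull in whatever else they select (lockdep, for
example):

```
# Assumption: appended to the existing .config of the kernel under test.
CONFIG_DEBUG_KERNEL=y
CONFIG_PROVE_LOCKING=y
CONFIG_DEBUG_ATOMIC_SLEEP=y
```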
> Thank you!
>
> Romain Guyard
Sebastian