[PATCH v4 0/9] drm/i915: PREEMPT_RT related fixups.

Sebastian Andrzej Siewior bigeasy at linutronix.de
Thu Aug 21 11:13:48 UTC 2025


On 2025-07-21 14:06:48 [+0900], Romain Guyard wrote:
> Hello,
Hi,

> [ 2349.629427] Hardware name: ADLINK TECHNOLOGY Inc. -612X/-612X, BIOS
> [ 2349.629454]  </TASK>
> [ 2412.634282] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
> [ 2412.634284] rcu:     Tasks blocked on level-0 rcu_node (CPUs 0-15): P12083/1:b..l P12724/1:b..l P12725/1:b..l P4057/3:b..l
> [ 2412.634289] rcu:     (detected by 14, t=147008 jiffies, g=355917, q=9582 ncpus=16)
> [ 2412.634290] task:Xorg            state:D stack:0     pid:4057 tgid:4057 ppid:4055   task_flags:0x400100 flags:0x00004000
> [ 2412.634292] Call Trace:
> [ 2412.634293]  <TASK>
> [ 2412.634295]  __schedule+0x44c/0xad0
> [ 2412.634302]  schedule_rtlock+0x25/0x40
> [ 2412.634303]  rtlock_slowlock_locked+0x20d/0xe00
> [ 2412.634307]  rt_spin_lock+0x7a/0xd0
> [ 2412.634309]  execlists_submission_tasklet+0x143/0x14d0
> [ 2412.634354]  tasklet_action_common+0xc1/0x230
> [ 2412.634356]  handle_softirqs.constprop.0+0xce/0x280
> [ 2412.634358]  __local_bh_enable_ip+0xa0/0xd0
> [ 2412.634359]  i915_gem_do_execbuffer+0x1a73/0x2920

This blocks on a lock and waits to make progress. I did not find out who
is holding that lock, though.

…

> [ 2412.634511]  </TASK>
> [ 2412.634511] task:kworker/14:1    state:R  running task stack:0    pid:12083 tgid:12083 ppid:2      task_flags:0x4208060 flags:0x00004000
> [ 2412.634513] Workqueue: i915-unordered engine_retire
> [ 2412.634515] Call Trace:
> [ 2412.634516]  <TASK>
> [ 2412.634516]  __schedule+0x44c/0xad0
> [ 2412.634520]  preempt_schedule_common+0x31/0x80
> [ 2412.634521]  preempt_schedule_thunk+0x16/0x30
> [ 2412.634523]  migrate_enable+0xe6/0x100
> [ 2412.634525]  rt_spin_unlock+0x12/0x40
> [ 2412.634526]  remove_from_engine+0x76/0xc0
> [ 2412.634528]  i915_request_retire.part.0+0x7c/0x220
> [ 2412.634530]  engine_retire+0xc3/0x100
> [ 2412.634531]  process_one_work+0x166/0x390
> [ 2412.634533]  worker_thread+0x29d/0x3c0

This might be the one. The task is in running state, so I don't
understand what is keeping the scheduler from putting it back on the CPU.
There is at least one idle CPU available, but although this workqueue is
called i915-unordered, the work must complete on the same CPU (it can't
migrate). So what is CPU14 doing? It should schedule something and not
be idle.

> Looks like there are some i915 locking stuff in those BTs.
> 
> I am not very knowledgeable about i915 and RT, so my help is quite limited,
> but since this is easily reproduced (always crash or hangs after <1H), I can
> try things.

I don't know what you can retrieve from the kdump but CPU14 should be
spinning on something I guess. RCU complains about not making progress.
If RCU-boost is enabled then the kworker should have one more reason to
be on the CPU.
Could you try v6.17-rc? I didn't add anything i915 related.
Could you please enable CONFIG_PROVE_LOCKING and
CONFIG_DEBUG_ATOMIC_SLEEP and check if the kernel complains? Maybe there
is something new I haven't noticed.
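
For reference, a .config fragment along these lines should turn both
checks on. This is only a sketch: CONFIG_DEBUG_KERNEL is listed because
both options depend on it, and I am assuming the remaining dependencies
(lockdep and friends) get selected when you run "make olddefconfig"
afterwards.

```
# Both debug options below depend on this one.
CONFIG_DEBUG_KERNEL=y
# Lockdep-based checking of lock ordering and lock usage.
CONFIG_PROVE_LOCKING=y
# Warns when code sleeps in a context where it must not.
CONFIG_DEBUG_ATOMIC_SLEEP=y
```

Both options add noticeable overhead, so the timing of the hang may
change with them enabled.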

> Thank you!
> 
> Romain Guyard

Sebastian