[PATCH] dma-buf: Take a breath during dma-fence-chain subtests
Janusz Krzysztofik
janusz.krzysztofik at linux.intel.com
Tue Jul 8 08:56:41 UTC 2025
On Monday, 7 July 2025 15:14:11 CEST Christian König wrote:
> On 07.07.25 14:25, Janusz Krzysztofik wrote:
> > Hi Christian,
> >
> > I've taken over that old issue and have a few questions to you.
>
> Thanks a lot for that, something really fishy seems to be going on here.
>
> > On Thursday, 27 February 2025 15:11:39 CEST Christian König wrote:
> ...
> >> The question is why the test is taking 26 (busy?) seconds to complete.
> >> That sounds really long even for a very old CPU.
> >
> > The failing test case operates on a dma fence chain built of 4096 dma fences.
> > Am I right that a single dma_fence_signal() call may result in up to 4096
> > levels of nested sub-calls to dma_fence_chain_cb()?
>
> No, that could only happen if the test case is broken.
>
> See the test is to make sure that we don't overflow the kernel stack with many elements when enable_signaling is called.
>
> The callback is always installed only on the first unsignaled fence. So when the last fence is signaled, the dma_fence_chain_cb function is called exactly once, installs itself on the second to last fence, waits for that one to signal, and so on.
>
>
> What could be happening is that the test case calls enable_signaling on each of the 4096 dma_fence_chain elements individually, so that each of the elements registers a dma_fence_chain_cb function; but then the test case would be rather broken and should probably be fixed.
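
Concretely, I guess the suspect pattern you mean would be something like
this hypothetical loop over the fence_chains fixture of the selftest (an
illustration, not a quote of the actual test code):

	/*
	 * Each enable_signaling walks the chain from that element down
	 * to the first unsignaled fence and installs a
	 * dma_fence_chain_cb there, so with n elements the walks add up
	 * to n * (n + 1) / 2 link hops.
	 */
	for (i = 0; i < fc.chain_length; i++)
		dma_fence_enable_sw_signaling(fc.chains[i]);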
>
> > The test case signals
> > each fence of the chain in a loop, starting from the last one. Then
> > dma_fence_chain_cb() is called 4096 * (4096 + 1) / 2 = 8,390,656 times
> > (an arithmetic series), i.e. roughly 8.4 million times, isn't it?
>
> When the test case indeed enables signaling on all 4096 elements then yes.
>
> >
> > On the most powerful gen12 machines used in CI, that test case takes
> > slightly less than 3 seconds; on low-end machines a few years old,
> > around 10 seconds. Should we be surprised that it takes over 20 seconds
> > on the least powerful one (gen3 Pineview Atom)? And while reproducing
> > the issue, I've never seen any unrecoverable deadlocks. It just takes
> > time to complete the loop.
> >
> > Does that address your doubts?
>
> No, not really.
>
> > Assuming the commit message will be extended to provide the above
> > explanation, I can still imagine a few ways of "fixing" that issue. We
> > can just limit the chain length and still execute all those
> > dma_fence_signal() calls without taking a breath, but why? Or we can
> > somehow measure the expected processing speed before running the
> > exercise and limit the chain size accordingly at runtime, which seems
> > an overcomplication to me. Or we can agree that there is no point in
> > preventing this process from potentially being scheduled out of the
> > CPU, and take the approach proposed by Nitin, perhaps limited to this
> > particular problematic test case. And I can see cond_resched() still
> > being used in recent patches.
> >
> > Would any of those options be acceptable for you?
>
> No, if the test case really does what you describe here, then the test is just broken and should probably be fixed instead.
>
> There is no reason to test enabling signaling on each of the elements in a loop. So there should be something like 4096 calls to the dma_fence_chain_cb function, each jumping to the next unsignaled fence and re-installing the callback.
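If I read you correctly, the intended pattern is something like this
minimal sketch (simplified and with made-up names, not the in-tree
implementation, which defers the re-arming to irq_work):

	struct chain_waiter {
		struct dma_fence_cb cb;
		struct dma_fence **fences;	/* oldest first */
		unsigned int next;		/* index the cb is armed on */
		struct dma_fence *head;		/* signaled when all done */
	};

	static void chain_cb(struct dma_fence *f, struct dma_fence_cb *cb)
	{
		struct chain_waiter *w = container_of(cb, typeof(*w), cb);

		/* Hop backwards until an unsignaled fence accepts the cb ... */
		while (w->next--) {
			/* ... then wait for that one to signal in turn. */
			if (!dma_fence_add_callback(w->fences[w->next],
						    &w->cb, chain_cb))
				return;
		}

		/* No unsignaled fence left, the whole chain is done. */
		dma_fence_signal(w->head);
	}

With that, n fences cost n callback invocations in total, not
n * (n + 1) / 2.
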
So how should building a chain look in real use cases? When a user
builds a chain link of her fence with another fence, may she then enable
signaling on the new chain link? If that other fence happens to be a
chain link, who should take care of disabling signaling on it, so that
signaling is enabled only on the last link of the chain and we don't end
up in a situation similar to what we now have in the test case? IOW,
what's the correct usage pattern of dma_fence_chain? I can't find that
documented anywhere, neither in inline docs nor in commit descriptions.
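
For concreteness, the pattern I'm asking about would be roughly this
hypothetical sketch based on dma_fence_chain_alloc() /
dma_fence_chain_init(), where the last step is exactly what I'm unsure
about:

	struct dma_fence_chain *link = dma_fence_chain_alloc();

	/* Append my_fence behind the current chain tail. */
	dma_fence_chain_init(link, dma_fence_get(chain_tail),
			     dma_fence_get(my_fence), seqno);
	chain_tail = &link->base;

	/* Should the user enable signaling on the new link or not? */
	dma_fence_enable_sw_signaling(chain_tail);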
Thanks,
Janusz
>
> Regards,
> Christian.
>
> >
> > Thanks,
> > Janusz
> >
> >>
> >> Do we maybe have a udelay() here which should have been a usleep() or
> >> similar?
> >>
> >> Regards,
> >> Christian.
> >>
> >>>
> >>> Andi
> >>>
> >>>> ---
> >>>> Hi,
> >>>>
> >>>> For reviewer reference, adding here the watchdog issue seen on old
> >>>> machines during dma-fence-chain subtest runs. This log was retrieved
> >>>> from the device pstore log while testing dma-buf at all-tests:
> >>>>
> >>>> dma-buf: Running dma_fence_chain
> >>>> Panic#1 Part7
> >>>> <6> sizeof(dma_fence_chain)=184
> >>>> <6> dma-buf: Running dma_fence_chain/sanitycheck
> >>>> <6> dma-buf: Running dma_fence_chain/find_seqno
> >>>> <6> dma-buf: Running dma_fence_chain/find_signaled
> >>>> <6> dma-buf: Running dma_fence_chain/find_out_of_order
> >>>> <6> dma-buf: Running dma_fence_chain/find_gap
> >>>> <6> dma-buf: Running dma_fence_chain/find_race
> >>>> <6> Completed 4095 cycles
> >>>> <6> dma-buf: Running dma_fence_chain/signal_forward
> >>>> <6> dma-buf: Running dma_fence_chain/signal_backward
> >>>> <6> dma-buf: Running dma_fence_chain/wait_forward
> >>>> <6> dma-buf: Running dma_fence_chain/wait_backward
> >>>> <0> watchdog: BUG: soft lockup - CPU#2 stuck for 26s! [dmabuf:2263]
> >>>> Panic#1 Part6
> >>>> <4> irq event stamp: 415735
> >>>> <4> hardirqs last enabled at (415734): [<ffffffff813d3a1b>] handle_softirqs+0xab/0x4d0
> >>>> <4> hardirqs last disabled at (415735): [<ffffffff827c7e31>] sysvec_apic_timer_interrupt+0x11/0xc0
> >>>> <4> softirqs last enabled at (415728): [<ffffffff813d3f8f>] __irq_exit_rcu+0x13f/0x160
> >>>> <4> softirqs last disabled at (415733): [<ffffffff813d3f8f>] __irq_exit_rcu+0x13f/0x160
> >>>> <4> CPU: 2 UID: 0 PID: 2263 Comm: dmabuf Not tainted 6.14.0-rc2-drm-next_483-g7b91683e7de7+ #1
> >>>> <4> Hardware name: Intel corporation NUC6CAYS/NUC6CAYB, BIOS AYAPLCEL.86A.0056.2018.0926.1100 09/26/2018
> >>>> <4> RIP: 0010:handle_softirqs+0xb1/0x4d0
> >>>> <4> RSP: 0018:ffffc90000154f60 EFLAGS: 00000246
> >>>> <4> RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000
> >>>> <4> RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
> >>>> <4> RBP: ffffc90000154fb8 R08: 0000000000000000 R09: 0000000000000000
> >>>> <4> R10: 0000000000000000 R11: 0000000000000000 R12: 000000000000000a
> >>>> <4> R13: 0000000000000200 R14: 0000000000000200 R15: 0000000000400100
> >>>> <4> FS: 000077521c5cd940(0000) GS:ffff888277900000(0000) knlGS:0000000000000000
> >>>> Panic#1 Part5
> >>>> <4> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> >>>> <4> CR2: 00005dbfee8c00c4 CR3: 0000000133d38000 CR4: 00000000003526f0
> >>>> <4> Call Trace:
> >>>> <4> <IRQ>
> >>>> <4> ? show_regs+0x6c/0x80
> >>>> <4> ? watchdog_timer_fn+0x247/0x2d0
> >>>> <4> ? __pfx_watchdog_timer_fn+0x10/0x10
> >>>> <4> ? __hrtimer_run_queues+0x1d0/0x420
> >>>> <4> ? hrtimer_interrupt+0x116/0x290
> >>>> <4> ? __sysvec_apic_timer_interrupt+0x70/0x1e0
> >>>> <4> ? sysvec_apic_timer_interrupt+0x47/0xc0
> >>>> <4> ? asm_sysvec_apic_timer_interrupt+0x1b/0x20
> >>>> <4> ? handle_softirqs+0xb1/0x4d0
> >>>> <4> __irq_exit_rcu+0x13f/0x160
> >>>> <4> irq_exit_rcu+0xe/0x20
> >>>> <4> sysvec_irq_work+0xa0/0xc0
> >>>> <4> </IRQ>
> >>>> <4> <TASK>
> >>>> <4> asm_sysvec_irq_work+0x1b/0x20
> >>>> <4> RIP: 0010:_raw_spin_unlock_irqrestore+0x57/0x80
> >>>> <4> RSP: 0018:ffffc9000292b8f0 EFLAGS: 00000246
> >>>> <4> RAX: 0000000000000000 RBX: ffff88810f235480 RCX: 0000000000000000
> >>>> <4> RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
> >>>> <4> RBP: ffffc9000292b900 R08: 0000000000000000 R09: 0000000000000000
> >>>> <4> R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000246
> >>>> <4> R13: 0000000000000000 R14: 0000000000000246 R15: 000000000003828c
> >>>> Panic#1 Part4
> >>>> <4> dma_fence_signal+0x49/0xb0
> >>>> <4> wait_backward+0xf8/0x140 [dmabuf_selftests]
> >>>> <4> __subtests+0x75/0x1f0 [dmabuf_selftests]
> >>>> <4> dma_fence_chain+0x94/0xe0 [dmabuf_selftests]
> >>>> <4> st_init+0x6a/0xff0 [dmabuf_selftests]
> >>>> <4> ? __pfx_st_init+0x10/0x10 [dmabuf_selftests]
> >>>> <4> do_one_initcall+0x79/0x400
> >>>> <4> do_init_module+0x97/0x2a0
> >>>> <4> load_module+0x2c23/0x2f60
> >>>> <4> init_module_from_file+0x97/0xe0
> >>>> <4> ? init_module_from_file+0x97/0xe0
> >>>> <4> idempotent_init_module+0x134/0x350
> >>>> <4> __x64_sys_finit_module+0x77/0x100
> >>>> <4> x64_sys_call+0x1f37/0x2650
> >>>> <4> do_syscall_64+0x91/0x180
> >>>> <4> ? trace_hardirqs_off+0x5d/0xe0
> >>>> <4> ? syscall_exit_to_user_mode+0x95/0x260
> >>>> <4> ? do_syscall_64+0x9d/0x180
> >>>> <4> ? do_syscall_64+0x9d/0x180
> >>>> <4> ? irqentry_exit+0x77/0xb0
> >>>> <4> ? sysvec_apic_timer_interrupt+0x57/0xc0
> >>>> <4> entry_SYSCALL_64_after_hwframe+0x76/0x7e
> >>>> <4> RIP: 0033:0x77521e72725d
> >>>>
> >>>>
> >>>> drivers/dma-buf/st-dma-fence-chain.c | 14 +++++++++++---
> >>>> 1 file changed, 11 insertions(+), 3 deletions(-)
> >>>>
> >>>> diff --git a/drivers/dma-buf/st-dma-fence-chain.c b/drivers/dma-buf/st-dma-fence-chain.c
> >>>> index ed4b323886e4..328a66ed59e5 100644
> >>>> --- a/drivers/dma-buf/st-dma-fence-chain.c
> >>>> +++ b/drivers/dma-buf/st-dma-fence-chain.c
> >>>> @@ -505,6 +505,7 @@ static int signal_forward(void *arg)
> >>>>
> >>>> for (i = 0; i < fc.chain_length; i++) {
> >>>> dma_fence_signal(fc.fences[i]);
> >>>> + cond_resched();
> >>>>
> >>>> if (!dma_fence_is_signaled(fc.chains[i])) {
> >>>> pr_err("chain[%d] not signaled!\n", i);
> >>>> @@ -537,6 +538,7 @@ static int signal_backward(void *arg)
> >>>>
> >>>> for (i = fc.chain_length; i--; ) {
> >>>> dma_fence_signal(fc.fences[i]);
> >>>> + cond_resched();
> >>>>
> >>>> if (i > 0 && dma_fence_is_signaled(fc.chains[i])) {
> >>>> pr_err("chain[%d] is signaled!\n", i);
> >>>> @@ -587,8 +589,10 @@ static int wait_forward(void *arg)
> >>>> get_task_struct(tsk);
> >>>> yield_to(tsk, true);
> >>>>
> >>>> - for (i = 0; i < fc.chain_length; i++)
> >>>> + for (i = 0; i < fc.chain_length; i++) {
> >>>> dma_fence_signal(fc.fences[i]);
> >>>> + cond_resched();
> >>>> + }
> >>>>
> >>>> err = kthread_stop_put(tsk);
> >>>>
> >>>> @@ -616,8 +620,10 @@ static int wait_backward(void *arg)
> >>>> get_task_struct(tsk);
> >>>> yield_to(tsk, true);
> >>>>
> >>>> - for (i = fc.chain_length; i--; )
> >>>> + for (i = fc.chain_length; i--; ) {
> >>>> dma_fence_signal(fc.fences[i]);
> >>>> + cond_resched();
> >>>> + }
> >>>>
> >>>> err = kthread_stop_put(tsk);
> >>>>
> >>>> @@ -663,8 +669,10 @@ static int wait_random(void *arg)
> >>>> get_task_struct(tsk);
> >>>> yield_to(tsk, true);
> >>>>
> >>>> - for (i = 0; i < fc.chain_length; i++)
> >>>> + for (i = 0; i < fc.chain_length; i++) {
> >>>> dma_fence_signal(fc.fences[i]);
> >>>> + cond_resched();
> >>>> + }
> >>>>
> >>>> err = kthread_stop_put(tsk);
> >>>>
> >>
> >>
> >
> >
> >
> >
>
>