[PATCH] dma-buf: Take a breath during dma-fence-chain subtests

Janusz Krzysztofik janusz.krzysztofik at linux.intel.com
Tue Jul 8 08:56:41 UTC 2025


On Monday, 7 July 2025 15:14:11 CEST Christian König wrote:
> On 07.07.25 14:25, Janusz Krzysztofik wrote:
> > Hi Christian,
> > 
> > I've taken over that old issue and have a few questions to you.
> 
> Thanks a lot for that, something really fishy seems to be going on here.
> 
> > On Thursday, 27 February 2025 15:11:39 CEST Christian König wrote:
> ...
> >> Question is why is the test taking 26 (busy?) seconds to complete? That
> >> sounds really long even for a very old CPU.
> > 
> > The failing test case operates on a dma fence chain built of 4096 dma fences.   
> > Am I right that a single dma_fence_signal() call may result in up to 4096 
> > levels of nested sub-calls to dma_fence_chain_cb()?
> 
> No, that could only happen if the test case is broken.
> 
> See, the test is there to make sure that we don't overflow the kernel stack when enable_signaling is called on a chain with many elements.
> 
> The callback is always installed only on the first unsignaled fence. So when the last fence is signaled, the dma_fence_chain_cb function is called exactly once; it installs itself on the second to last fence, waits for that one to signal, and so on.
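> 
> From memory, the relevant part of drivers/dma-buf/dma-fence-chain.c looks
> roughly like this (a simplified sketch, not the exact upstream code):
> 
>     static bool dma_fence_chain_enable_signaling(struct dma_fence *fence)
>     {
>             struct dma_fence_chain *head = to_dma_fence_chain(fence);
> 
>             dma_fence_get(&head->base);
>             /* walk the chain from the head towards older links */
>             dma_fence_chain_for_each(fence, &head->base) {
>                     struct dma_fence *f = dma_fence_chain_contained(fence);
> 
>                     /* install the one and only callback on the first
>                      * fence which is not signaled yet */
>                     dma_fence_get(f);
>                     if (!dma_fence_add_callback(f, &head->cb,
>                                                 dma_fence_chain_cb)) {
>                             dma_fence_put(fence);
>                             return true;
>                     }
>                     dma_fence_put(f);
>             }
>             dma_fence_put(&head->base);
>             return false;
>     }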
> 
> 
> What could be happening is that the test case calls enable_signaling on each of the 4096 dma_fence_chain elements individually, so each of the elements registers a dma_fence_chain_cb function; but then the test case would be rather broken and should probably be fixed.
> 
> > The test case signals
> > each fence of the chain in a loop, starting from the last one.  Then
> > dma_fence_chain_cb() is called 4096 * (4096 + 1) / 2 = 8,390,656 times
> > (an arithmetic series), i.e. ~8.4 million times, isn't it?
> 
> When the test case indeed enables signaling on all 4096 elements, then yes.
> 
> > 
> > On the most powerful gen12 machines used in CI, that test case takes
> > slightly less than 3 s; on low-end machines a few years old, around 10 s.
> > Should we be surprised that it takes over 20 s on the least powerful one
> > (a gen3 PineView Atom)?  And while reproducing the issue, I've never seen
> > any unrecoverable deadlocks.  It just takes time to complete the loop.
> > 
> > Does that address your doubts?
> 
> No, not really.
> 
> > Assuming the commit message will be extended to provide the above
> > explanation, I can still imagine a few ways of "fixing" that issue.  We
> > can just limit the chain length and still execute all those
> > dma_fence_signal() calls without taking a breath, but why?  Or we can
> > somehow measure the expected processing speed before running the exercise
> > and limit the chain size accordingly at runtime, which in turn seems an
> > overcomplication to me.  Or we can agree that there is no point in
> > preventing the process from potentially being swapped out of the CPU, and
> > take the approach proposed by Nitin, perhaps limited to this particular
> > problematic test case.  And I can see cond_resched() still used in recent
> > patches.
> > 
> > Would any of those options be acceptable for you?
> 
> No, if the test case really does what you describe here then the test is just broken and should potentially be fixed instead.
> 
> There is no reason to test enabling signaling on each of the elements in a loop. So there should be something like 4096 calls to the dma_fence_chain_cb function, each jumping to the next unsignaled fence and re-installing the callback.
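> 
> And the callback side then re-arms itself. Again roughly from memory
> (simplified; the re-walk is deferred to an irq_work because the fence
> lock is held while signaling):
> 
>     static void dma_fence_chain_irq_work(struct irq_work *work)
>     {
>             struct dma_fence_chain *chain = container_of(work, typeof(*chain), work);
> 
>             /* try to re-install the callback on the next unsignaled fence */
>             if (!dma_fence_chain_enable_signaling(&chain->base))
>                     /* no unsignaled fences left, signal the chain head */
>                     dma_fence_signal(&chain->base);
>             dma_fence_put(&chain->base);
>     }
> 
>     static void dma_fence_chain_cb(struct dma_fence *f, struct dma_fence_cb *cb)
>     {
>             struct dma_fence_chain *chain = container_of(cb, typeof(*chain), cb);
> 
>             /* defer the chain walk out of the signaling path */
>             init_irq_work(&chain->work, dma_fence_chain_irq_work);
>             irq_work_queue(&chain->work);
>             dma_fence_put(f);
>     }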

So how should building a chain look in real use cases?  When a user builds 
a new chain link from her fence and another fence, may she then enable 
signaling on the new chain link?  If that other fence happens to be a chain 
link itself, who should take care of disabling signaling on it, so that 
signaling is enabled only on the last link of the chain and we don't end up 
in a situation similar to what we have now in the test case?  IOW, what's 
the correct usage pattern of dma_fence_chain?  I can't find it documented 
anywhere, neither in the inline docs nor in commit descriptions.
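
For instance, is the pattern used by drm_syncobj_add_point() the one to 
follow?  Here is my current understanding as a simplified sketch (locking 
and error handling omitted; I'm assuming dma_fence_chain_init() consumes 
both fence references):

	/* append `fence` to a timeline as its new point `point` */
	struct dma_fence_chain *chain = dma_fence_chain_alloc();
	struct dma_fence *prev;

	if (!chain)
		return -ENOMEM;

	/* current head of the chain, reference taken */
	prev = drm_syncobj_fence_get(syncobj);
	dma_fence_chain_init(chain, prev, dma_fence_get(fence), point);
	rcu_assign_pointer(syncobj->fence, &chain->base);

And should consumers then only ever wait on (and thereby enable signaling 
on) the current head of the chain, never on the intermediate links?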

Thanks,
Janusz

> 
> Regards,
> Christian.
> 
> > 
> > Thanks,
> > Janusz
> > 
> >>
> >> Do we maybe have a udelay() here which should have been a usleep() or
> >> similar?
> >>
> >> Regards,
> >> Christian.
> >>
> >>>
> >>> Andi
> >>>
> >>>> ---
> >>>> Hi,
> >>>>
> >>>> For reviewer reference, adding here the watchdog issue seen on old machines
> >>>> during dma-fence-chain subtests testing. This log was retrieved from the
> >>>> device pstore log while testing dma-buf at all-tests:
> >>>>
> >>>> dma-buf: Running dma_fence_chain
> >>>> Panic#1 Part7
> >>>> <6> sizeof(dma_fence_chain)=184
> >>>> <6> dma-buf: Running dma_fence_chain/sanitycheck
> >>>> <6> dma-buf: Running dma_fence_chain/find_seqno
> >>>> <6> dma-buf: Running dma_fence_chain/find_signaled
> >>>> <6> dma-buf: Running dma_fence_chain/find_out_of_order
> >>>> <6> dma-buf: Running dma_fence_chain/find_gap
> >>>> <6> dma-buf: Running dma_fence_chain/find_race
> >>>> <6> Completed 4095 cycles
> >>>> <6> dma-buf: Running dma_fence_chain/signal_forward
> >>>> <6> dma-buf: Running dma_fence_chain/signal_backward
> >>>> <6> dma-buf: Running dma_fence_chain/wait_forward
> >>>> <6> dma-buf: Running dma_fence_chain/wait_backward
> >>>> <0> watchdog: BUG: soft lockup - CPU#2 stuck for 26s! [dmabuf:2263]
> >>>> Panic#1 Part6
> >>>> <4> irq event stamp: 415735
> >>>> <4> hardirqs last  enabled at (415734): [<ffffffff813d3a1b>] handle_softirqs+0xab/0x4d0
> >>>> <4> hardirqs last disabled at (415735): [<ffffffff827c7e31>] sysvec_apic_timer_interrupt+0x11/0xc0
> >>>> <4> softirqs last  enabled at (415728): [<ffffffff813d3f8f>] __irq_exit_rcu+0x13f/0x160
> >>>> <4> softirqs last disabled at (415733): [<ffffffff813d3f8f>] __irq_exit_rcu+0x13f/0x160
> >>>> <4> CPU: 2 UID: 0 PID: 2263 Comm: dmabuf Not tainted 6.14.0-rc2-drm-next_483-g7b91683e7de7+ #1
> >>>> <4> Hardware name: Intel corporation NUC6CAYS/NUC6CAYB, BIOS AYAPLCEL.86A.0056.2018.0926.1100 09/26/2018
> >>>> <4> RIP: 0010:handle_softirqs+0xb1/0x4d0
> >>>> <4> RSP: 0018:ffffc90000154f60 EFLAGS: 00000246
> >>>> <4> RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000
> >>>> <4> RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
> >>>> <4> RBP: ffffc90000154fb8 R08: 0000000000000000 R09: 0000000000000000
> >>>> <4> R10: 0000000000000000 R11: 0000000000000000 R12: 000000000000000a
> >>>> <4> R13: 0000000000000200 R14: 0000000000000200 R15: 0000000000400100
> >>>> <4> FS:  000077521c5cd940(0000) GS:ffff888277900000(0000) knlGS:0000000000000000
> >>>> Panic#1 Part5
> >>>> <4> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> >>>> <4> CR2: 00005dbfee8c00c4 CR3: 0000000133d38000 CR4: 00000000003526f0
> >>>> <4> Call Trace:
> >>>> <4>  <IRQ>
> >>>> <4>  ? show_regs+0x6c/0x80
> >>>> <4>  ? watchdog_timer_fn+0x247/0x2d0
> >>>> <4>  ? __pfx_watchdog_timer_fn+0x10/0x10
> >>>> <4>  ? __hrtimer_run_queues+0x1d0/0x420
> >>>> <4>  ? hrtimer_interrupt+0x116/0x290
> >>>> <4>  ? __sysvec_apic_timer_interrupt+0x70/0x1e0
> >>>> <4>  ? sysvec_apic_timer_interrupt+0x47/0xc0
> >>>> <4>  ? asm_sysvec_apic_timer_interrupt+0x1b/0x20
> >>>> <4>  ? handle_softirqs+0xb1/0x4d0
> >>>> <4>  __irq_exit_rcu+0x13f/0x160
> >>>> <4>  irq_exit_rcu+0xe/0x20
> >>>> <4>  sysvec_irq_work+0xa0/0xc0
> >>>> <4>  </IRQ>
> >>>> <4>  <TASK>
> >>>> <4>  asm_sysvec_irq_work+0x1b/0x20
> >>>> <4> RIP: 0010:_raw_spin_unlock_irqrestore+0x57/0x80
> >>>> <4> RSP: 0018:ffffc9000292b8f0 EFLAGS: 00000246
> >>>> <4> RAX: 0000000000000000 RBX: ffff88810f235480 RCX: 0000000000000000
> >>>> <4> RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
> >>>> <4> RBP: ffffc9000292b900 R08: 0000000000000000 R09: 0000000000000000
> >>>> <4> R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000246
> >>>> <4> R13: 0000000000000000 R14: 0000000000000246 R15: 000000000003828c
> >>>> Panic#1 Part4
> >>>> <4> dma_fence_signal+0x49/0xb0
> >>>> <4> wait_backward+0xf8/0x140 [dmabuf_selftests]
> >>>> <4> __subtests+0x75/0x1f0 [dmabuf_selftests]
> >>>> <4> dma_fence_chain+0x94/0xe0 [dmabuf_selftests]
> >>>> <4> st_init+0x6a/0xff0 [dmabuf_selftests]
> >>>> <4> ? __pfx_st_init+0x10/0x10 [dmabuf_selftests]
> >>>> <4> do_one_initcall+0x79/0x400
> >>>> <4> do_init_module+0x97/0x2a0
> >>>> <4> load_module+0x2c23/0x2f60
> >>>> <4> init_module_from_file+0x97/0xe0
> >>>> <4> ? init_module_from_file+0x97/0xe0
> >>>> <4> idempotent_init_module+0x134/0x350
> >>>> <4> __x64_sys_finit_module+0x77/0x100
> >>>> <4> x64_sys_call+0x1f37/0x2650
> >>>> <4> do_syscall_64+0x91/0x180
> >>>> <4> ? trace_hardirqs_off+0x5d/0xe0
> >>>> <4> ? syscall_exit_to_user_mode+0x95/0x260
> >>>> <4> ? do_syscall_64+0x9d/0x180
> >>>> <4> ? do_syscall_64+0x9d/0x180
> >>>> <4> ? irqentry_exit+0x77/0xb0
> >>>> <4> ? sysvec_apic_timer_interrupt+0x57/0xc0
> >>>> <4> entry_SYSCALL_64_after_hwframe+0x76/0x7e
> >>>> <4> RIP: 0033:0x77521e72725d
> >>>>
> >>>>
> >>>>  drivers/dma-buf/st-dma-fence-chain.c | 14 +++++++++++---
> >>>>  1 file changed, 11 insertions(+), 3 deletions(-)
> >>>>
> >>>> diff --git a/drivers/dma-buf/st-dma-fence-chain.c b/drivers/dma-buf/st-dma-fence-chain.c
> >>>> index ed4b323886e4..328a66ed59e5 100644
> >>>> --- a/drivers/dma-buf/st-dma-fence-chain.c
> >>>> +++ b/drivers/dma-buf/st-dma-fence-chain.c
> >>>> @@ -505,6 +505,7 @@ static int signal_forward(void *arg)
> >>>>  
> >>>>  	for (i = 0; i < fc.chain_length; i++) {
> >>>>  		dma_fence_signal(fc.fences[i]);
> >>>> +		cond_resched();
> >>>>  
> >>>>  		if (!dma_fence_is_signaled(fc.chains[i])) {
> >>>>  			pr_err("chain[%d] not signaled!\n", i);
> >>>> @@ -537,6 +538,7 @@ static int signal_backward(void *arg)
> >>>>  
> >>>>  	for (i = fc.chain_length; i--; ) {
> >>>>  		dma_fence_signal(fc.fences[i]);
> >>>> +		cond_resched();
> >>>>  
> >>>>  		if (i > 0 && dma_fence_is_signaled(fc.chains[i])) {
> >>>>  			pr_err("chain[%d] is signaled!\n", i);
> >>>> @@ -587,8 +589,10 @@ static int wait_forward(void *arg)
> >>>>  	get_task_struct(tsk);
> >>>>  	yield_to(tsk, true);
> >>>>  
> >>>> -	for (i = 0; i < fc.chain_length; i++)
> >>>> +	for (i = 0; i < fc.chain_length; i++) {
> >>>>  		dma_fence_signal(fc.fences[i]);
> >>>> +		cond_resched();
> >>>> +	}
> >>>>  
> >>>>  	err = kthread_stop_put(tsk);
> >>>>  
> >>>> @@ -616,8 +620,10 @@ static int wait_backward(void *arg)
> >>>>  	get_task_struct(tsk);
> >>>>  	yield_to(tsk, true);
> >>>>  
> >>>> -	for (i = fc.chain_length; i--; )
> >>>> +	for (i = fc.chain_length; i--; ) {
> >>>>  		dma_fence_signal(fc.fences[i]);
> >>>> +		cond_resched();
> >>>> +	}
> >>>>  
> >>>>  	err = kthread_stop_put(tsk);
> >>>>  
> >>>> @@ -663,8 +669,10 @@ static int wait_random(void *arg)
> >>>>  	get_task_struct(tsk);
> >>>>  	yield_to(tsk, true);
> >>>>  
> >>>> -	for (i = 0; i < fc.chain_length; i++)
> >>>> +	for (i = 0; i < fc.chain_length; i++) {
> >>>>  		dma_fence_signal(fc.fences[i]);
> >>>> +		cond_resched();
> >>>> +	}
> >>>>  
> >>>>  	err = kthread_stop_put(tsk);
> >>>>  