[RFC] dma-buf/fence: avoid holding lock while calling cb

Chris Wilson chris at chris-wilson.co.uk
Sun Oct 16 17:38:46 UTC 2016


On Sun, Oct 16, 2016 at 12:03:53PM -0400, Rob Clark wrote:
> Currently with fence-array, we have a potential deadlock situation.  If we
> fence_add_callback() on an array-fence, the array-fence's lock is acquired
> first, and in its ->enable_signaling() callback, it will install cbs on
> its array-member fences, so the array-member's lock is acquired second.
> 
> But in the signal path, the array-member's lock is acquired first, and the
> array-fence's lock acquired second.
> 
> One approach to deal with this is to avoid holding the fence's lock when
> calling the cb.  It is a bit intrusive, and I haven't fixed up all the
> other drivers that call fence_signal_locked() directly or indirectly.
> 
> I guess the other option would be to introduce a workqueue for
> array-fence?  Or??

Not keen on either option yet, still hoping to strike gold.
 
> lockdep splat:
> 
>  ======================================================
> [ INFO: possible circular locking dependency detected ]
> 4.7.0-rc7+ #489 Not tainted
> -------------------------------------------------------
> surfaceflinger/2034 is trying to acquire lock:
>  (&(&array->lock)->rlock){......}, at: [<ffff00000858cddc>] fence_signal+0x5c/0xf8
> 
> but task is already holding lock:
>  (&(&obj->child_list_lock)->rlock){......}, at: [<ffff0000085902f8>] sw_sync_ioctl+0x228/0x3b0
> 
> which lock already depends on the new lock.
> 
> the existing dependency chain (in reverse order) is:
> 
> -> #1 (&(&obj->child_list_lock)->rlock){......}:
>        [<ffff000008108924>] __lock_acquire+0x173c/0x18d8
>        [<ffff000008108e0c>] lock_acquire+0x4c/0x68
>        [<ffff000008ac6a6c>] _raw_spin_lock_irqsave+0x54/0x70
>        [<ffff00000858d05c>] fence_add_callback+0x3c/0x100
>        [<ffff00000858f100>] fence_array_enable_signaling+0x80/0x170
>        [<ffff00000858d0d8>] fence_add_callback+0xb8/0x100
>        [<ffff00000858f504>] sync_file_poll+0xd4/0xf0
>        [<ffff0000081fd3a0>] do_sys_poll+0x220/0x438
>        [<ffff0000081fd8d0>] SyS_ppoll+0x1b0/0x1d8
>        [<ffff000008084f30>] el0_svc_naked+0x24/0x28
> 
> -> #0 (&(&array->lock)->rlock){......}:
>        [<ffff000008104768>] print_circular_bug+0x80/0x2e0
>        [<ffff0000081089ac>] __lock_acquire+0x17c4/0x18d8
>        [<ffff000008108e0c>] lock_acquire+0x4c/0x68
>        [<ffff000008ac6a6c>] _raw_spin_lock_irqsave+0x54/0x70
>        [<ffff00000858cddc>] fence_signal+0x5c/0xf8
>        [<ffff00000858f268>] fence_array_cb_func+0x78/0x88
>        [<ffff00000858cb28>] fence_signal_locked+0x80/0xe0
>        [<ffff0000085903c8>] sw_sync_ioctl+0x2f8/0x3b0
>        [<ffff0000081faf6c>] do_vfs_ioctl+0xa4/0x790
>        [<ffff0000081fb6e4>] SyS_ioctl+0x8c/0xa0
>        [<ffff000008084f30>] el0_svc_naked+0x24/0x28
> 
> other info that might help us debug this:
> 
>  Possible unsafe locking scenario:
> 
>        CPU0                    CPU1
>        ----                    ----
>   lock(&(&obj->child_list_lock)->rlock);
>                                lock(&(&array->lock)->rlock);
>                                lock(&(&obj->child_list_lock)->rlock);
>   lock(&(&array->lock)->rlock);
> 
>  *** DEADLOCK ***
> 
> 1 lock held by surfaceflinger/2034:
>  #0:  (&(&obj->child_list_lock)->rlock){......}, at: [<ffff0000085902f8>] sw_sync_ioctl+0x228/0x3b0
> ---
>  drivers/dma-buf/fence.c      | 22 ++++++++++++++--------
>  drivers/dma-buf/sw_sync.c    |  2 +-
>  drivers/dma-buf/sync_debug.c | 16 +++++++++-------
>  include/linux/fence.h        |  6 +++---
>  4 files changed, 27 insertions(+), 19 deletions(-)
> 
> diff --git a/drivers/dma-buf/fence.c b/drivers/dma-buf/fence.c
> index cc05ddd..917f905 100644
> --- a/drivers/dma-buf/fence.c
> +++ b/drivers/dma-buf/fence.c
> @@ -63,9 +63,9 @@ EXPORT_SYMBOL(fence_context_alloc);
>   *
>   * Unlike fence_signal, this function must be called with fence->lock held.
>   */
> -int fence_signal_locked(struct fence *fence)
> +int fence_signal_locked(struct fence *fence, unsigned long flags)
>  {
> -	struct fence_cb *cur, *tmp;
> +	struct fence_cb *cur;
>  	int ret = 0;
>  
>  	lockdep_assert_held(fence->lock);
> @@ -88,9 +88,12 @@ int fence_signal_locked(struct fence *fence)
>  	} else
>  		trace_fence_signaled(fence);
>  
> -	list_for_each_entry_safe(cur, tmp, &fence->cb_list, node) {
> +	while (!list_empty(&fence->cb_list)) {
> +		cur = list_first_entry(&fence->cb_list, struct fence_cb, node);
>  		list_del_init(&cur->node);
> +		spin_unlock_irqrestore(fence->lock, flags);

We don't need to pass down flags here, as we don't have to change the
context of the lock (i.e. we just leave irqs disabled if they have been
disabled by the caller). That spares us the issue of the flags being
changed here and the changed flags not being propagated back.

We only need to grab the list under the lock once, as once signaled no
new elements will be added to the list (fence_add_callback() checks for
the signal after taking the lock).

i.e.

LIST_HEAD(cb_list);

list_splice_init(&fence->cb_list, &cb_list);
spin_unlock(fence->lock);

list_for_each_entry_safe(cur, tmp, &cb_list, node)
	cur->func(fence, cur);

spin_lock(fence->lock);
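
Fleshed out, the whole function might look something like this (an
untested sketch, leaving out the timestamp bookkeeping the current
fence.c does):

int fence_signal_locked(struct fence *fence)
{
	struct fence_cb *cur, *tmp;
	LIST_HEAD(cb_list);
	int ret = 0;

	lockdep_assert_held(fence->lock);

	if (test_and_set_bit(FENCE_FLAG_SIGNALED_BIT, &fence->flags))
		ret = -EINVAL; /* raced with another signaler, still run cbs */
	else
		trace_fence_signaled(fence);

	/* Detach the whole list in one go; fence_add_callback() checks
	 * the signaled bit under the lock, so nothing new can be queued
	 * behind our back.
	 */
	list_splice_init(&fence->cb_list, &cb_list);

	spin_unlock(fence->lock);
	list_for_each_entry_safe(cur, tmp, &cb_list, node) {
		list_del_init(&cur->node);
		cur->func(fence, cur);
	}
	spin_lock(fence->lock);

	return ret;
}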

Though I'm not convinced that dropping the lock is going to be safe for
everyone, since that lock may be communal and guarding more than just
the cb_list, e.g. virtio_gpu_fence_event_process().
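
To make that concern concrete, a sketch of the pattern (names invented,
loosely modeled on virtio_gpu_fence_event_process(): one driver spinlock
doubles as ->lock for every fence and also protects the driver's list of
pending fences):

struct example_fence {
	struct fence base;
	u64 seq;
	struct list_head node;
};

struct example_driver {
	spinlock_t lock;		/* also each fence's ->lock */
	struct list_head fences;
};

static void example_fence_event_process(struct example_driver *drv,
					u64 last_seq)
{
	struct example_fence *f, *tmp;
	unsigned long flags;

	spin_lock_irqsave(&drv->lock, flags);
	list_for_each_entry_safe(f, tmp, &drv->fences, node) {
		if (last_seq < f->seq)
			continue;
		/*
		 * If fence_signal_locked() dropped drv->lock internally,
		 * another CPU could modify drv->fences and corrupt this
		 * walk.
		 */
		fence_signal_locked(&f->base);
		list_del(&f->node);
		fence_put(&f->base);
	}
	spin_unlock_irqrestore(&drv->lock, flags);
}
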
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

