[PATCH] dma-buf/fence-array: fix deadlock in fence-array

Rob Clark robdclark at gmail.com
Mon Oct 17 18:40:12 UTC 2016


Currently with fence-array, we have a potential deadlock situation.  If we
fence_add_callback() on an array-fence, the array-fence's lock is acquired
first, and in it's ->enable_signaling() callback, it will install cb's on
it's array-member fences, so the array-member's lock is acquired second.

But in the signal path, the array-member's lock is acquired first, and the
array-fence's lock acquired second.

To solve that, always enabling signaling up-front (in the fence_array
constructor) without the fence_array's lock held.

lockdep splat:

 ======================================================
[ INFO: possible circular locking dependency detected ]
4.7.0-rc7+ #489 Not tainted
-------------------------------------------------------
surfaceflinger/2034 is trying to acquire lock:
 (&(&array->lock)->rlock){......}, at: [<ffff00000858cddc>] fence_signal+0x5c/0xf8

but task is already holding lock:
 (&(&obj->child_list_lock)->rlock){......}, at: [<ffff0000085902f8>] sw_sync_ioctl+0x228/0x3b0

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #1 (&(&obj->child_list_lock)->rlock){......}:
       [<ffff000008108924>] __lock_acquire+0x173c/0x18d8
       [<ffff000008108e0c>] lock_acquire+0x4c/0x68
       [<ffff000008ac6a6c>] _raw_spin_lock_irqsave+0x54/0x70
       [<ffff00000858d05c>] fence_add_callback+0x3c/0x100
       [<ffff00000858f100>] fence_array_enable_signaling+0x80/0x170
       [<ffff00000858d0d8>] fence_add_callback+0xb8/0x100
       [<ffff00000858f504>] sync_file_poll+0xd4/0xf0
       [<ffff0000081fd3a0>] do_sys_poll+0x220/0x438
       [<ffff0000081fd8d0>] SyS_ppoll+0x1b0/0x1d8
       [<ffff000008084f30>] el0_svc_naked+0x24/0x28

-> #0 (&(&array->lock)->rlock){......}:
       [<ffff000008104768>] print_circular_bug+0x80/0x2e0
       [<ffff0000081089ac>] __lock_acquire+0x17c4/0x18d8
       [<ffff000008108e0c>] lock_acquire+0x4c/0x68
       [<ffff000008ac6a6c>] _raw_spin_lock_irqsave+0x54/0x70
       [<ffff00000858cddc>] fence_signal+0x5c/0xf8
       [<ffff00000858f268>] fence_array_cb_func+0x78/0x88
       [<ffff00000858cb28>] fence_signal_locked+0x80/0xe0
       [<ffff0000085903c8>] sw_sync_ioctl+0x2f8/0x3b0
       [<ffff0000081faf6c>] do_vfs_ioctl+0xa4/0x790
       [<ffff0000081fb6e4>] SyS_ioctl+0x8c/0xa0
       [<ffff000008084f30>] el0_svc_naked+0x24/0x28

other info that might help us debug this:

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&(&obj->child_list_lock)->rlock);
                               lock(&(&array->lock)->rlock);
                               lock(&(&obj->child_list_lock)->rlock);
  lock(&(&array->lock)->rlock);

 *** DEADLOCK ***

1 lock held by surfaceflinger/2034:
 #0:  (&(&obj->child_list_lock)->rlock){......}, at: [<ffff0000085902f8>] sw_sync_ioctl+0x228/0x3b0

Suggested-by: Maarten Lankhorst <maarten.lankhorst at linux.intel.com>
Signed-off-by: Rob Clark <robdclark at gmail.com>
---
 drivers/dma-buf/fence-array.c |  8 ++++++++
 drivers/dma-buf/fence.c       | 21 +++++++++++++++++++++
 include/linux/fence.h         |  1 +
 3 files changed, 30 insertions(+)

diff --git a/drivers/dma-buf/fence-array.c b/drivers/dma-buf/fence-array.c
index f1989fc..b5821e9 100644
--- a/drivers/dma-buf/fence-array.c
+++ b/drivers/dma-buf/fence-array.c
@@ -140,6 +140,14 @@ struct fence_array *fence_array_create(int num_fences, struct fence **fences,
 	atomic_set(&array->num_pending, signal_on_any ? 1 : num_fences);
 	array->fences = fences;
 
+	/* enable signaling without our lock being held while callbacks
+	 * are installed on the array-member fences.  Otherwise there is
+	 * a potential deadlock between enabling signaling (our lock
+	 * acquired first) and the array-members getting signalled (their
+	 * lock aquired first).
+	 */
+	fence_enable_sw_signaling_nolock(&array->base);
+
 	return array;
 }
 EXPORT_SYMBOL(fence_array_create);
diff --git a/drivers/dma-buf/fence.c b/drivers/dma-buf/fence.c
index cc05ddd..ff6b410 100644
--- a/drivers/dma-buf/fence.c
+++ b/drivers/dma-buf/fence.c
@@ -219,6 +219,27 @@ void fence_enable_sw_signaling(struct fence *fence)
 EXPORT_SYMBOL(fence_enable_sw_signaling);
 
 /**
+ * fence_enable_sw_signaling_nolock - enable signaling on fence without taking lock
+ * @fence:	[in]	the fence to enable
+ *
+ * WARNING this is only safe to use when you know you have the only reference
+ * to the fence there is no possibility for race conditions as a result of
+ * calling ops->enable_signaling() without lock.  Ie. it is probably only safe
+ * to use from the fence's construction function.
+ */
+void fence_enable_sw_signaling_nolock(struct fence *fence)
+{
+	if (!test_and_set_bit(FENCE_FLAG_ENABLE_SIGNAL_BIT, &fence->flags) &&
+	    !test_bit(FENCE_FLAG_SIGNALED_BIT, &fence->flags)) {
+		trace_fence_enable_signal(fence);
+
+		if (!fence->ops->enable_signaling(fence))
+			fence_signal(fence);
+	}
+}
+EXPORT_SYMBOL(fence_enable_sw_signaling_nolock);
+
+/**
  * fence_add_callback - add a callback to be called when the fence
  * is signaled
  * @fence:	[in]	the fence to wait on
diff --git a/include/linux/fence.h b/include/linux/fence.h
index 0d76305..076ac29 100644
--- a/include/linux/fence.h
+++ b/include/linux/fence.h
@@ -226,6 +226,7 @@ int fence_add_callback(struct fence *fence, struct fence_cb *cb,
 		       fence_func_t func);
 bool fence_remove_callback(struct fence *fence, struct fence_cb *cb);
 void fence_enable_sw_signaling(struct fence *fence);
+void fence_enable_sw_signaling_nolock(struct fence *fence);
 
 /**
  * fence_is_signaled_locked - Return an indication if the fence is signaled yet.
-- 
2.7.4



More information about the dri-devel mailing list