[PATCH 1/2] drm/i915/active: Serialize preallocation of idle barriers
Janusz Krzysztofik
janusz.krzysztofik at linux.intel.com
Fri Mar 3 13:00:57 UTC 2023
When we collect barriers for preallocation, we reuse either idle or
non-idle ones, whichever we find. In the case of non-idle barriers, we
rely on their successful deletion from their engines' barrier tasks
lists as an indication that they have not been claimed by another
thread. In the case of idle barriers, however, we neither perform any
similar check nor take any countermeasure against unexpected races with
other threads. We may then end up adding the same barrier to two
independent preallocated lists, and then adding it twice back to the
same engine's barrier tasks list, effectively creating a loop of llist
nodes. As a result, searches through that barrier tasks llist start
spinning indefinitely.
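To see how the loop forms, consider a minimal user-space sketch of the
failure mode (the struct and helper below are illustrative, not the
driver's):

	struct node { struct node *next; };

	/* simplified, non-atomic push in the spirit of llist_add() */
	static void push(struct node **head, struct node *n)
	{
		n->next = *head;
		*head = n;
	}

Two racing threads that both claimed the same idle barrier end up
calling push() twice with the same node: the second call sets n->next
to *head, which by then is n itself, so the node points back at itself
and every subsequent walk along the next pointers spins forever.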
Occurrences of that issue have never been observed in CI nor reported
by users. Deep analysis revealed a silent, most probably unintended
workaround that actively breaks those loops: our local implementation
of llist node deletion rebuilds the barrier tasks llist in reverse
order. A simple patch that replaces that rebuild with a mere update of
the next pointer of the node preceding the one to be deleted makes the
race pop up frequently [1]. There is a plan to apply that update as
soon as we have the race fixed.
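For reference, a simplified sketch of the two deletion strategies
(variable names such as victim and prev are illustrative only, not the
driver's):

	struct llist_node *head = NULL, *tail = NULL, *pos, *next;

	/* Current: drain the list and rebuild it in reverse, skipping
	 * the victim; rewriting every next pointer is what silently
	 * breaks an accidental loop.
	 */
	llist_for_each_safe(pos, next, llist_del_all(&engine->barrier_tasks)) {
		if (pos == victim)
			continue;
		pos->next = head;
		head = pos;
		if (!tail)
			tail = pos;
	}
	if (head)
		llist_add_batch(head, tail, &engine->barrier_tasks);

	/* Proposed in [1]: unlink the victim in place; the untouched
	 * remainder of the list preserves any loop.
	 */
	prev->next = victim->next;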
For this patch, whenever an idle barrier is selected for preallocation,
set it up immediately as non-idle, i.e., mark it with our
ERR_PTR(-EAGAIN) barrier mark, so that other threads can no longer
claim it as a node supposedly not on any barrier tasks list. Also,
increase the active count of its composite tracker host immediately,
while we still know the barrier was idle. Serialize that claim
operation against other potential concurrent updates of the active
fence pointer, and allocate a new node if the idle one we found turns
out to have been claimed by a competing thread.
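In code, the claim is a single compare-and-exchange on the fence slot,
condensed here from the hunks below:

	/* Install the proto-barrier marker only if the fence slot is
	 * still NULL; losing the race means a competing thread owns
	 * the node and we must allocate a fresh one instead.
	 */
	if (is_idle_barrier(node, idx) &&
	    !cmpxchg(__active_fence_slot(&node->base), NULL,
		     ERR_PTR(-EAGAIN))) {
		__i915_active_acquire(ref);	/* count it while still idle */
		goto match;
	}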
While at it, fortify the still racy check for the preallocated_barriers
llist being empty when we populate it with collected proto-barriers
(assuming we need that check).
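Note that llist_add_batch() returns true only if the list was empty
before the insertion, so folding the emptiness check into the add
closes the window between a separate llist_empty() test and the add
itself:

	/* was: GEM_BUG_ON(!llist_empty(&ref->preallocated_barriers));
	 *      llist_add_batch(first, last, &ref->preallocated_barriers);
	 */
	GEM_BUG_ON(!llist_add_batch(first, last, &ref->preallocated_barriers));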
[1] https://patchwork.freedesktop.org/series/114591/
Fixes: 9ff33bbcda25 ("drm/i915: Reduce locking around i915_active_acquire_preallocate_barrier()")
Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik at linux.intel.com>
Cc: stable at vger.kernel.org # v5.10
---
drivers/gpu/drm/i915/i915_active.c | 50 +++++++++++++++++-------------
1 file changed, 29 insertions(+), 21 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_active.c b/drivers/gpu/drm/i915/i915_active.c
index a9fea115f2d26..b2f79f5c257a8 100644
--- a/drivers/gpu/drm/i915/i915_active.c
+++ b/drivers/gpu/drm/i915/i915_active.c
@@ -788,8 +788,13 @@ static struct active_node *reuse_idle_barrier(struct i915_active *ref, u64 idx)
* node kept alive (as we reuse before parking). We prefer to reuse
* completely idle barriers (less hassle in manipulating the llists),
* but otherwise any will do.
+ *
+ * We reuse the request field to mark this as being our proto-node.
*/
- if (ref->cache && is_idle_barrier(ref->cache, idx)) {
+ if (ref->cache && is_idle_barrier(ref->cache, idx) &&
+ !cmpxchg(__active_fence_slot(&ref->cache->base), NULL,
+ ERR_PTR(-EAGAIN))) {
+ __i915_active_acquire(ref);
p = &ref->cache->node;
goto match;
}
@@ -800,8 +805,12 @@ static struct active_node *reuse_idle_barrier(struct i915_active *ref, u64 idx)
struct active_node *node =
rb_entry(p, struct active_node, node);
- if (is_idle_barrier(node, idx))
+ if (is_idle_barrier(node, idx) &&
+ !cmpxchg(__active_fence_slot(&node->base), NULL,
+ ERR_PTR(-EAGAIN))) {
+ __i915_active_acquire(ref);
goto match;
+ }
prev = p;
if (node->timeline < idx)
@@ -827,8 +836,12 @@ static struct active_node *reuse_idle_barrier(struct i915_active *ref, u64 idx)
if (node->timeline < idx)
continue;
- if (is_idle_barrier(node, idx))
+ if (is_idle_barrier(node, idx) &&
+ !cmpxchg(__active_fence_slot(&node->base), NULL,
+ ERR_PTR(-EAGAIN))) {
+ __i915_active_acquire(ref);
goto match;
+ }
/*
* The list of pending barriers is protected by the
@@ -889,29 +902,24 @@ int i915_active_acquire_preallocate_barrier(struct i915_active *ref,
if (!node)
goto unwind;
- RCU_INIT_POINTER(node->base.fence, NULL);
+ /* Mark this as being our unconnected proto-node */
+ RCU_INIT_POINTER(node->base.fence, ERR_PTR(-EAGAIN));
node->base.cb.func = node_retire;
node->timeline = idx;
node->ref = ref;
- }
-
- if (!i915_active_fence_isset(&node->base)) {
- /*
- * Mark this as being *our* unconnected proto-node.
- *
- * Since this node is not in any list, and we have
- * decoupled it from the rbtree, we can reuse the
- * request to indicate this is an idle-barrier node
- * and then we can use the rb_node and list pointers
- * for our tracking of the pending barrier.
- */
- RCU_INIT_POINTER(node->base.fence, ERR_PTR(-EAGAIN));
- node->base.cb.node.prev = (void *)engine;
__i915_active_acquire(ref);
+ } else {
+ GEM_BUG_ON(rcu_access_pointer(node->base.fence) !=
+ ERR_PTR(-EAGAIN));
}
- GEM_BUG_ON(rcu_access_pointer(node->base.fence) != ERR_PTR(-EAGAIN));
- GEM_BUG_ON(barrier_to_engine(node) != engine);
+ /*
+ * Since this node is not in any list and we have decoupled it
+ * from the rbtree, we can reuse the request to indicate this is
+ * a barrier node, and then use the list pointers for our
+ * tracking of the pending barrier.
+ */
+ node->base.cb.node.prev = (void *)engine;
first = barrier_to_ll(node);
first->next = prev;
if (!last)
@@ -920,7 +928,7 @@ int i915_active_acquire_preallocate_barrier(struct i915_active *ref,
}
GEM_BUG_ON(!llist_empty(&ref->preallocated_barriers));
- llist_add_batch(first, last, &ref->preallocated_barriers);
+ GEM_BUG_ON(!llist_add_batch(first, last, &ref->preallocated_barriers));
return 0;
--
2.25.1