[Intel-gfx] [PATCH 06/20] drm/i915/gt: Remove virtual breadcrumb before transfer
Chris Wilson
chris at chris-wilson.co.uk
Mon Dec 7 19:38:10 UTC 2020
The issue with stale virtual breadcrumbs remain. Now we have the problem
that if the irq-signaler is still referencing the stale breadcrumb as we
transfer it to a new sibling, the list becomes spaghetti. This is a very
small window, but that doesn't stop it being hit infrequently. To
prevent the lists being tangled (the iterator starting on one engine's
b->signalers but walking onto another list), always decouple the virtual
breadcrumb on schedule-out and make sure that the walker has stepped out
of the lists.
Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
---
drivers/gpu/drm/i915/gt/intel_breadcrumbs.c | 5 +++--
drivers/gpu/drm/i915/gt/intel_lrc.c | 15 +++++++++++++++
2 files changed, 18 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
index 00918300f53f..63900edbde88 100644
--- a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
+++ b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
@@ -454,15 +454,16 @@ void i915_request_cancel_breadcrumb(struct i915_request *rq)
{
struct intel_breadcrumbs *b = READ_ONCE(rq->engine)->breadcrumbs;
struct intel_context *ce = rq->context;
+ unsigned long flags;
bool release;
if (!test_and_clear_bit(I915_FENCE_FLAG_SIGNAL, &rq->fence.flags))
return;
- spin_lock(&ce->signal_lock);
+ spin_lock_irqsave(&ce->signal_lock, flags);
list_del_rcu(&rq->signal_link);
release = remove_signaling_context(b, ce);
- spin_unlock(&ce->signal_lock);
+ spin_unlock_irqrestore(&ce->signal_lock, flags);
if (release)
intel_context_put(ce);
diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
index 0b7e86240f3b..b3db16b2a5a4 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -1388,6 +1388,21 @@ static inline void execlists_schedule_in(struct i915_request *rq, int idx)
static void kick_siblings(struct i915_request *rq, struct intel_context *ce)
{
struct virtual_engine *ve = container_of(ce, typeof(*ve), context);
+ struct intel_engine_cs *engine = rq->engine;
+
+ /* Flush concurrent rcu iterators in signal_irq_work */
+ if (test_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT, &rq->fence.flags)) {
+ /*
+ * After this point, the rq may be transferred to a new
+ * sibling, so before we clear ce->inflight make sure that
+ * the context has been removed from the b->signalers and
+ * furthermore we need to make sure that the concurrent
+ * iterator in signal_irq_work is no longer following
+ * ce->signal_link.
+ */
+ i915_request_cancel_breadcrumb(rq);
+ irq_work_sync(&engine->breadcrumbs->irq_work);
+ }
if (READ_ONCE(ve->request))
tasklet_hi_schedule(&ve->base.execlists.tasklet);
--
2.20.1
More information about the Intel-gfx
mailing list