[Intel-gfx] [PATCH] drm/i915/execlists: Reclaim hanging virtual request

Chris Wilson chris at chris-wilson.co.uk
Tue Jan 21 10:09:27 UTC 2020


If we encounter a hang on a virtual engine, as we process the hang the
request may already have been moved back to the virtual engine (we are
processing the hang on the physical engine). We need to reclaim the
request from the virtual engine so that the locking is consistent and
local to the real engine on which we will hold the request for error
state capturing.

Fixes: 748317386afb ("drm/i915/execlists: Offline error capture")
Testcase: igt/gem_exec_balancer/hang
Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala at linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
---
 drivers/gpu/drm/i915/gt/intel_lrc.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
index 3a30767ff0c4..a0acf1898c1e 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -2582,6 +2582,26 @@ static void execlists_capture(struct intel_engine_cs *engine)
 	cap->rq = active_request(cap->rq->context->timeline, cap->rq);
 	GEM_BUG_ON(!cap->rq);
 
+	if (cap->rq->engine != engine) { /* preempted virtual engine */
+		struct virtual_engine *ve = to_virtual_engine(cap->rq->engine);
+		unsigned long flags;
+
+		/*
+		 * An unsubmitted request along a virtual engine will
+		 * remain on the active (this) engine until we are able
+		 * to process the context switch away (and so mark the
+		 * context as no longer in flight). That cannot have happened
+		 * yet, otherwise we would not be hanging!
+		 */
+		spin_lock_irqsave(&ve->base.active.lock, flags);
+		GEM_BUG_ON(intel_context_inflight(cap->rq->context) != engine);
+		GEM_BUG_ON(ve->request != cap->rq);
+		ve->request = NULL;
+		spin_unlock_irqrestore(&ve->base.active.lock, flags);
+
+		cap->rq->engine = engine;
+	}
+
 	/*
 	 * Remove the request from the execlists queue, and take ownership
 	 * of the request. We pass it to our worker who will _slowly_ compress
-- 
2.25.0



More information about the Intel-gfx mailing list