[Intel-gfx] [CI] drm/i915/execlists: Workaround switching back to a complete context
Chris Wilson
chris at chris-wilson.co.uk
Fri Mar 27 20:14:33 UTC 2020
In what seems remarkably similar to the w/a required to not reload an
idle context with HEAD==TAIL, it appears we must prevent the HW from
switching to an idle context in ELSP[1], while simultaneously trying to
preempt the HW to run another context and a continuation of the idle
context (which is no longer idle).
We can achieve this by preventing the context from completing while we
reload a new ELSP (by applying ring_set_paused(1) across the whole of
dequeue), except this eventually fails due to a lite-restore into a
waiting semaphore does not generate an ACK. Instead, we try to avoid
making the GPU do anything too challenging and not submit a new ELSP
while the interrupts + CSB events appear to have fallen behind the
completed contexts. We expect it to catch up shortly so we queue another
tasklet execution and hope for the best.
Closes: https://gitlab.freedesktop.org/drm/intel/issues/1501
Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
Cc: Mika Kuoppala <mika.kuoppala at linux.intel.com>
---
drivers/gpu/drm/i915/gt/intel_lrc.c | 26 +++++++++++++++++++++++---
1 file changed, 23 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
index b12355048501..5f17ece07858 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -1915,11 +1915,26 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
* of trouble.
*/
active = READ_ONCE(execlists->active);
- while ((last = *active) && i915_request_completed(last))
- active++;
- if (last) {
+ /*
+ * In theory we can skip over completed contexts that have not
+ * yet been processed by events (as those events are in flight):
+ *
+ * while ((last = *active) && i915_request_completed(last))
+ * active++;
+ *
+ * However, the GPU is cannot handle this as it will ultimately
+ * find itself trying to jump back into a context it has just
+ * completed and barf.
+ */
+
+ if ((last = *active)) {
if (need_preempt(engine, last, rb)) {
+ if (i915_request_completed(last)) {
+ tasklet_hi_schedule(&execlists->tasklet);
+ return;
+ }
+
ENGINE_TRACE(engine,
"preempting last=%llx:%lld, prio=%d, hint=%d\n",
last->fence.context,
@@ -1947,6 +1962,11 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
last = NULL;
} else if (need_timeslice(engine, last) &&
timer_expired(&engine->execlists.timer)) {
+ if (i915_request_completed(last)) {
+ tasklet_hi_schedule(&execlists->tasklet);
+ return;
+ }
+
ENGINE_TRACE(engine,
"expired last=%llx:%lld, prio=%d, hint=%d\n",
last->fence.context,
--
2.20.1
More information about the Intel-gfx
mailing list