[Intel-gfx] [RFCv2 07/12] drm/i915: Fake lost context interrupts through forced CSB check.

Tue Jul 21 06:58:50 PDT 2015

A recurring issue during long-duration operations testing of concurrent
rendering tasks with intermittent hangs is that context completion interrupts
following engine resets are sometimes lost. This becomes a real problem since
the hardware might have completed a previously hung context following a
per-engine hang recovery and then gone idle somehow without sending an
interrupt telling the driver about this. At this point the driver would be
stuck waiting for context completion, thinking that the context is still active,
even though the hardware would be idle and waiting for more work.

The way this is solved is by periodically checking for context submission
status inconsistencies. What this means is that the ID of the currently running
context on a given engine is compared against the context ID in the
EXECLIST_STATUS register of the respective engine. If the two do not match and
if the state does not change over time it is assumed that an interrupt was
missed and that the driver is now stuck in an inconsistent state.

Following the decision that the driver and the hardware are irreversibly stuck
in an inconsistent state on a certain engine, the presumably lost interrupt is
faked by simply calling the execlist interrupt handler from a non-interrupt
context. Even though interrupts might be lost that does not mean that the
hardware does not always update the context status buffer (CSB) when
appropriate, which means that any context state transitions would be captured
there regardless of the interrupt being sent or not. By faking the lost
interrupt the interrupt handler could act on the outstanding context status
transition events in the CSB, e.g. a context completion event. In the case
where the hardware would be idle but the driver would be waiting for
completion, faking an interrupt and finding a context completion status event
would cause the driver to remove the currently active request from the execlist
queue and go idle - thereby reestablishing a consistent context submission
status between the hardware and the driver.

The way this is implemented is that the hang checker will always keep alive as
long as there is outstanding work. Even if the enable_hangcheck flag is
disabled one part of the hang checker will always keep alive and reschedule
itself, only to scan for inconsistent context submission states on all engines.
As long as the context submission status of the currently running context on a
given engine is consistent the hang checker works as normal and schedules hang
recoveries as expected. If the status is not consistent no hang recoveries will
be scheduled since no context resubmission will be possible anyway, so there is
no point in trying until the status becomes consistent again. Of course, if
enough hangs on the same engine are detected without any change in consistency
the hang checker will go straight for the full GPU reset so there is no chance
of getting stuck in this state.

It's worth keeping in mind that the watchdog timeout hang detection mechanism
relies entirely on the per-engine hang recovery path. So if we have an
inconsistent context submission status on the engine that the watchdog timeout
has detected a hang there is no way to recover from that hang if the period
hangchecker is turned off since the per-engine hang recovery cannot do its
final context resubmission if the context submission status is inconsistent.
That's why we need to make sure that there is always a thread alive that keeps
an eye out for inconsistent context submission states, not only for the
periodic hang checker but also for watchdog timeout.

Finally, since a non-interrupt context thread could end up in the interrupt
handler as part of the forced CSB checking there's the chance of a race
condition between the interrupt handler and the ring init code since both
update ring->next_context_status_buffer. Therefore we've had to update the
interrupt handler so that it grabs the execlist spinlock before updating
the variable. We've also had to make sure that the ring init code
grabs the execlist spinlock before initing this variable.

* v2: (Chris Wilson)
Remove context submission status consistency pre-check from
i915_hangcheck_elapsed() and turn it into a pre-check to per-engine reset.

The following describes the change in philosphy in how context submission state
inconsistencies are detected:

Previously we would let the periodic hang checker ensure that there were no
context submission status inconsistencies on any engine, at any point. If an
inconsistency was detected in the per-engine hang recovery path we would back
off and defer to the next hang check since per-engine hang recovery is not
effective during inconsistent context submission states.

What we do in this new version is to move the consistency pre-check from the
hang checker to the earliest point in the per-engine hang recovery path. If we
detect an inconsistency at that point we fake a potentially lost context event
interrupt by forcing a CSB check. If there are outstanding events in the CSB
these will be acted upon and hopefully that will bring the driver up to speed
with the hardware. If the CSB check did not amount to anything it is concluded
that the inconsistency is unresolvable and the per-engine hang recovery fails
and promotes to full GPU reset instead.

In the hang checker-based consistency checking we would check the inconsistency
for a number of times to make sure the detected state was stable before
attempting to rectify the situation. This is possible since hang checking
is a recurring event. Having moved the consistency checking to the recovery
path instead (i.e. a one-time, fire & forget-style event) it is assumed
that the hang detection that brought on the hang recovery has detected a
stable hang and therefore, if an inconsistency is detected at that point,
the inconsistency must be stable and not the result of a momentary context
state transition. Therefore, unlike in the hang checker case, at the very
first indication of an inconsistent context submission status the interrupt
is faked speculatively. If outstanding CSB events are found it is
determined that the hang was in fact just a context submission status
inconsistency and no hang recovery is done. If the inconsistency cannot be
resolved the per-engine hang recovery is failed and the hang is promoted to
full GPU reset instead.

Signed-off-by: Tomas Elf <tomas.elf at intel.com>
Cc: Chris Wilson <chris at chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala at linux.intel.com>
---
 drivers/gpu/drm/i915/i915_drv.c      | 118 ++++++++++++++++++++++++++---------
 drivers/gpu/drm/i915/i915_irq.c      |  32 ++--------
 drivers/gpu/drm/i915/intel_lrc.c     |  67 ++++++++++++++++++--
 drivers/gpu/drm/i915/intel_lrc.h     |   2 +-
 drivers/gpu/drm/i915/intel_lrc_tdr.h |   3 +
 5 files changed, 159 insertions(+), 63 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index c7ba64e..2ccb2e8 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -917,6 +917,71 @@ int i915_reset(struct drm_device *dev)
 	return 0;
 }
 
+static bool i915_gem_reset_engine_CSSC_precheck(
+		struct drm_i915_private *dev_priv,
+		struct intel_engine_cs *engine,
+		struct drm_i915_gem_request **req,
+		int *ret)
+{
+	bool precheck_ok = true;
+	enum context_submission_status status;
+
+	WARN_ON(!ret);
+
+	*ret = 0;
+
+	status = intel_execlists_TDR_get_current_request(engine, req);
+
+	/*
+	 * If the hardware and driver states do not coincide
+	 * or if there for some reason is no current context
+	 * in the process of being submitted then bail out and
+	 * try again. Do not proceed unless we have reliable
+	 * current context state information.
+	 */
+	if (status == CONTEXT_SUBMISSION_STATUS_NONE_SUBMITTED) {
+		/*
+		 * No work in flight. This state is possible to get
+		 * into if calling the error handler directly from
+		 * debugfs.  Just do early exit and forget it happened.
+		 */
+		WARN(1, "No work in flight! Aborting recovery on %s\n",
+			engine->name);
+
+		 precheck_ok = false;
+		 *ret = 0;
+
+	} else if (status == CONTEXT_SUBMISSION_STATUS_INCONSISTENT) {
+		if (!intel_execlists_TDR_force_CSB_check(dev_priv, engine)) {
+			/*
+			 * Context submission state is inconsistent and
+			 * faking a context event IRQ did not help.
+			 * Fail and promote to higher level of
+			 * recovery!
+			 */
+			precheck_ok = false;
+			*ret = -EINVAL;
+		} else {
+			/*
+			 * Rectifying the inconsistent context
+			 * submission status helped! No reset required,
+			 * just exit and move on!
+			 */
+			 precheck_ok = false;
+			 *ret = 0;
+		}
+
+	} else if (status != CONTEXT_SUBMISSION_STATUS_OK) {
+		WARN(1, "Unexpected context submission status (%u) on %s\n",
+			status, engine->name);
+
+		precheck_ok = false;
+		*ret = -EINVAL;
+	}
+
+	return precheck_ok;
+}
+
 /**
  * i915_reset_engine - reset GPU engine after a hang
  * @engine: engine to reset
@@ -958,20 +1023,17 @@ int i915_reset_engine(struct intel_engine_cs *engine)
 	i915_gem_reset_ring_status(dev_priv, engine);
 
 	if (i915.enable_execlists) {
-		enum context_submission_status status =
-			intel_execlists_TDR_get_current_request(engine, NULL);
-
 		/*
-		 * If the hardware and driver states do not coincide
-		 * or if there for some reason is no current context
-		 * in the process of being submitted then bail out and
-		 * try again. Do not proceed unless we have reliable
-		 * current context state information.
+		 * Check context submission status consistency (CSSC) before
+		 * moving on. If the driver and hardware have different
+		 * opinions about what is going on just fail and escalate to a
+		 * higher form of hang recovery.
 		 */
-		if (status != CONTEXT_SUBMISSION_STATUS_OK) {
-			ret = -EAGAIN;
+		 if (!i915_gem_reset_engine_CSSC_precheck(dev_priv,
+							  engine,
+							  NULL,
+							  &ret))
 			goto reset_engine_error;
-		}
 	}
 
 	ret = intel_ring_disable(engine);
@@ -981,27 +1043,21 @@ int i915_reset_engine(struct intel_engine_cs *engine)
 	}
 
 	if (i915.enable_execlists) {
-		enum context_submission_status status;
-		bool inconsistent;
-
-		status = intel_execlists_TDR_get_current_request(engine,
-				&current_request);
-
-		inconsistent = (status != CONTEXT_SUBMISSION_STATUS_OK);
-		if (inconsistent) {
-			/*
-			 * If we somehow have reached this point with
-			 * an inconsistent context submission status then
-			 * back out of the previously requested reset and
-			 * retry later.
-			 */
-			WARN(inconsistent,
-			     "Inconsistent context status on %s: %u\n",
-			     engine->name, status);
-
-			ret = -EAGAIN;
+		/*
+		 * Get a hold of the currently executing context.
+		 *
+		 * Context submission status consistency is done implicitly so
+		 * we might as well check it post-engine disablement since we
+		 * get that option for free. Also, it's conceivable that the
+		 * context submission state might have changed as part of the
+		 * reset request on gen8+ so it's not completely devoid of
+		 * value to do this.
+		 */
+		 if (!i915_gem_reset_engine_CSSC_precheck(dev_priv,
+							  engine,
+							  &current_request,
+							  &ret))
 			goto reenable_reset_engine_error;
-		}
 	}
 
 	/* Sample the current ring head position */
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 5672a2c..22dd96c 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -36,6 +36,7 @@
 #include "i915_drv.h"
 #include "i915_trace.h"
 #include "intel_drv.h"
+#include "intel_lrc_tdr.h"
 
 /**
  * DOC: interrupt handling
@@ -1286,7 +1287,7 @@ static irqreturn_t gen8_gt_irq_handler(struct drm_i915_private *dev_priv,
 			ret = IRQ_HANDLED;
 
 			if (tmp & (GT_CONTEXT_SWITCH_INTERRUPT << GEN8_RCS_IRQ_SHIFT))
-				intel_lrc_irq_handler(&dev_priv->ring[RCS]);
+				intel_lrc_irq_handler(&dev_priv->ring[RCS], true);
 			if (tmp & (GT_RENDER_USER_INTERRUPT << GEN8_RCS_IRQ_SHIFT))
 				notify_ring(&dev_priv->ring[RCS]);
 			if (tmp & (GT_GEN8_RCS_WATCHDOG_INTERRUPT << GEN8_RCS_IRQ_SHIFT)) {
@@ -1303,7 +1304,7 @@ static irqreturn_t gen8_gt_irq_handler(struct drm_i915_private *dev_priv,
 			}
 
 			if (tmp & (GT_CONTEXT_SWITCH_INTERRUPT << GEN8_BCS_IRQ_SHIFT))
-				intel_lrc_irq_handler(&dev_priv->ring[BCS]);
+				intel_lrc_irq_handler(&dev_priv->ring[BCS], true);
 			if (tmp & (GT_RENDER_USER_INTERRUPT << GEN8_BCS_IRQ_SHIFT))
 				notify_ring(&dev_priv->ring[BCS]);
 		} else
@@ -1317,7 +1318,7 @@ static irqreturn_t gen8_gt_irq_handler(struct drm_i915_private *dev_priv,
 			ret = IRQ_HANDLED;
 
 			if (tmp & (GT_CONTEXT_SWITCH_INTERRUPT << GEN8_VCS1_IRQ_SHIFT))
-				intel_lrc_irq_handler(&dev_priv->ring[VCS]);
+				intel_lrc_irq_handler(&dev_priv->ring[VCS], true);
 			if (tmp & (GT_RENDER_USER_INTERRUPT << GEN8_VCS1_IRQ_SHIFT))
 				notify_ring(&dev_priv->ring[VCS]);
 			if (tmp & (GT_GEN8_VCS_WATCHDOG_INTERRUPT << GEN8_VCS1_IRQ_SHIFT)) {
@@ -1334,7 +1335,7 @@ static irqreturn_t gen8_gt_irq_handler(struct drm_i915_private *dev_priv,
 			}
 
 			if (tmp & (GT_CONTEXT_SWITCH_INTERRUPT << GEN8_VCS2_IRQ_SHIFT))
-				intel_lrc_irq_handler(&dev_priv->ring[VCS2]);
+				intel_lrc_irq_handler(&dev_priv->ring[VCS2], true);
 			if (tmp & (GT_RENDER_USER_INTERRUPT << GEN8_VCS2_IRQ_SHIFT))
 				notify_ring(&dev_priv->ring[VCS2]);
 			if (tmp & (GT_GEN8_VCS_WATCHDOG_INTERRUPT << GEN8_VCS2_IRQ_SHIFT)) {
@@ -1360,7 +1361,7 @@ static irqreturn_t gen8_gt_irq_handler(struct drm_i915_private *dev_priv,
 			ret = IRQ_HANDLED;
 
 			if (tmp & (GT_CONTEXT_SWITCH_INTERRUPT << GEN8_VECS_IRQ_SHIFT))
-				intel_lrc_irq_handler(&dev_priv->ring[VECS]);
+				intel_lrc_irq_handler(&dev_priv->ring[VECS], true);
 			if (tmp & (GT_RENDER_USER_INTERRUPT << GEN8_VECS_IRQ_SHIFT))
 				notify_ring(&dev_priv->ring[VECS]);
 		} else
@@ -2381,27 +2382,6 @@ static void i915_error_work_func(struct work_struct *work)
 
 			ret = i915_reset_engine(ring);
 
-			/*
-			 * Execlist mode only:
-			 *
-			 * -EAGAIN means that between detecting a hang (and
-			 * also determining that the currently submitted
-			 * context is stable and valid) and trying to recover
-			 * from the hang the current context changed state.
-			 * This means that we are probably not completely hung
-			 * after all. Just fail and retry by exiting all the
-			 * way back and wait for the next hang detection. If we
-			 * have a true hang on our hands then we will detect it
-			 * again, otherwise we will continue like nothing
-			 * happened.
-			 */
-			if (ret == -EAGAIN) {
-				DRM_ERROR("Reset of %s aborted due to " \
-					  "change in context submission " \
-					  "state - retrying!", ring->name);
-				ret = 0;
-			}
-
 			if (ret) {
 				DRM_ERROR("Reset of %s failed! (%d)", ring->name, ret);
 
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index ff9d27cb..476eff0 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -672,7 +672,7 @@ static bool execlists_check_remove_request(struct intel_engine_cs *ring,
  * Check the unread Context Status Buffers and manage the submission of new
  * contexts to the ELSP accordingly.
  */
-void intel_lrc_irq_handler(struct intel_engine_cs *ring)
+int intel_lrc_irq_handler(struct intel_engine_cs *ring, bool do_lock)
 {
 	struct drm_i915_private *dev_priv = ring->dev->dev_private;
 	u32 status_pointer;
@@ -684,13 +684,14 @@ void intel_lrc_irq_handler(struct intel_engine_cs *ring)
 
 	status_pointer = I915_READ(RING_CONTEXT_STATUS_PTR(ring));
 
+	if (do_lock)
+		spin_lock(&ring->execlist_lock);
+
 	read_pointer = ring->next_context_status_buffer;
 	write_pointer = status_pointer & 0x07;
 	if (read_pointer > write_pointer)
 		write_pointer += 6;
 
-	spin_lock(&ring->execlist_lock);
-
 	while (read_pointer < write_pointer) {
 		read_pointer++;
 		status = I915_READ(RING_CONTEXT_STATUS_BUF(ring) +
@@ -716,13 +717,16 @@ void intel_lrc_irq_handler(struct intel_engine_cs *ring)
 	if (submit_contexts != 0)
 		execlists_context_unqueue(ring, false);
 
-	spin_unlock(&ring->execlist_lock);
-
 	WARN(submit_contexts > 2, "More than two context complete events?\n");
 	ring->next_context_status_buffer = write_pointer % 6;
 
+	if (do_lock)
+		spin_unlock(&ring->execlist_lock);
+
 	I915_WRITE(RING_CONTEXT_STATUS_PTR(ring),
 		   ((u32)ring->next_context_status_buffer & 0x07) << 8);
+
+	return submit_contexts;
 }
 
 static int execlists_context_queue(struct intel_engine_cs *ring,
@@ -1356,6 +1360,7 @@ static int gen8_init_common_ring(struct intel_engine_cs *ring)
 {
 	struct drm_device *dev = ring->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
+	unsigned long flags;
 
 	I915_WRITE_IMR(ring, ~(ring->irq_enable_mask | ring->irq_keep_mask));
 	I915_WRITE(RING_HWSTAM(ring->mmio_base), 0xffffffff);
@@ -1364,7 +1369,11 @@ static int gen8_init_common_ring(struct intel_engine_cs *ring)
 		   _MASKED_BIT_DISABLE(GFX_REPLAY_MODE) |
 		   _MASKED_BIT_ENABLE(GFX_RUN_LIST_ENABLE));
 	POSTING_READ(RING_MODE_GEN7(ring));
+
+	spin_lock_irqsave(&ring->execlist_lock, flags);
 	ring->next_context_status_buffer = 0;
+	spin_unlock_irqrestore(&ring->execlist_lock, flags);
+
 	DRM_DEBUG_DRIVER("Execlists enabled for %s\n", ring->name);
 
 	i915_hangcheck_reinit(ring);
@@ -2586,3 +2595,51 @@ intel_execlists_TDR_get_current_request(struct intel_engine_cs *ring,
 
 	return status;
 }
+
+/**
+ * execlists_TDR_force_CSB_check() - check CSB manually to act on pending
+ * context status events.
+ *
+ * @dev_priv: ...
+ * @engine: engine whose CSB is to be checked.
+ *
+ * Return:
+ * 	True: Consistency restored.
+ * 	False: Failed trying to restore consistency.
+ *
+ * In case we missed a context event interrupt we can fake this interrupt by
+ * acting on pending CSB events manually by calling this function. This is
+ * normally what would happen in interrupt context but that does not prevent us
+ * from calling it from a user thread.
+ */
+bool intel_execlists_TDR_force_CSB_check(struct drm_i915_private *dev_priv,
+					 struct intel_engine_cs *engine)
+{
+	unsigned long flags;
+	bool hw_active;
+	int was_effective;
+
+	hw_active =
+		(I915_READ(RING_EXECLIST_STATUS(engine)) &
+			EXECLIST_STATUS_CURRENT_ACTIVE_ELEMENT_STATUS) ?
+				true : false;
+	if (hw_active) {
+		u32 hw_context;
+
+		hw_context = I915_READ(RING_EXECLIST_STATUS_CTX_ID(engine));
+		WARN(hw_active, "Context (%x) executing on %s - " \
+				"No need for faked IRQ!\n",
+				hw_context, engine->name);
+		return false;
+	}
+
+	spin_lock_irqsave(&engine->execlist_lock, flags);
+	if (!(was_effective = intel_lrc_irq_handler(engine, false)))
+		DRM_ERROR("Forced CSB check of %s ineffective!\n", engine->name);
+	spin_unlock_irqrestore(&engine->execlist_lock, flags);
+
+	wake_up_all(&engine->irq_queue);
+
+	return !!was_effective;
+}
+
diff --git a/drivers/gpu/drm/i915/intel_lrc.h b/drivers/gpu/drm/i915/intel_lrc.h
index d2f497c..6fae3c8 100644
--- a/drivers/gpu/drm/i915/intel_lrc.h
+++ b/drivers/gpu/drm/i915/intel_lrc.h
@@ -88,7 +88,7 @@ int intel_execlists_submission(struct drm_device *dev, struct drm_file *file,
 			       u64 exec_start, u32 dispatch_flags);
 u32 intel_execlists_ctx_id(struct drm_i915_gem_object *ctx_obj);
 
-void intel_lrc_irq_handler(struct intel_engine_cs *ring);
+int intel_lrc_irq_handler(struct intel_engine_cs *ring, bool do_lock);
 void intel_execlists_retire_requests(struct intel_engine_cs *ring);
 
 int intel_execlists_read_tail(struct intel_engine_cs *ring,
diff --git a/drivers/gpu/drm/i915/intel_lrc_tdr.h b/drivers/gpu/drm/i915/intel_lrc_tdr.h
index 4520753..041c808 100644
--- a/drivers/gpu/drm/i915/intel_lrc_tdr.h
+++ b/drivers/gpu/drm/i915/intel_lrc_tdr.h
@@ -32,5 +32,8 @@ enum context_submission_status
 intel_execlists_TDR_get_current_request(struct intel_engine_cs *ring,
 		struct drm_i915_gem_request **req);
 
+bool intel_execlists_TDR_force_CSB_check(struct drm_i915_private *dev_priv,
+					 struct intel_engine_cs *engine);
+
 #endif /* _INTEL_LRC_TDR_H_ */
 
-- 
1.9.1