[Intel-gfx] [PATCH 1/2] drm/i915: Reset hangcheck score if we successfully kick a ring

Chris Wilson chris at chris-wilson.co.uk
Thu Jun 6 10:37:49 CEST 2013


After we kick a ring, it should be free to make progress again, so it
should not be accused of being stuck until hangcheck fires once more.
Reset the ring's hangcheck score whenever a kick succeeds.

This should address part of Ben's justified criticism of

commit 05407ff889ceebe383aa5907219f86582ef96b72
Author: Mika Kuoppala <mika.kuoppala at linux.intel.com>
Date:   Thu May 30 09:04:29 2013 +0300

    drm/i915: detect hang using per ring hangcheck_score

"There's also another corner case on the kick. If the seqno = 2
(though not stuck), and on the 3rd hangcheck, the ring is stuck, and
we try to kick it... we don't actually try to find out if the kick
helped"

References: https://bugs.freedesktop.org/show_bug.cgi?id=65394
Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala at linux.intel.com>
Cc: Ben Widawsky <ben at bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_irq.c |   47 ++++++++++++++++++++-------------------
 1 file changed, 24 insertions(+), 23 deletions(-)
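
Note (not part of the commit message): the scoring rule this patch
introduces for a busy ring can be modelled in isolation. The sketch
below is a userspace illustration only, not driver code. The names
ring_model, model_ring_hung() and model_hangcheck() and the "kickable"
flag are hypothetical stand-ins for ring->hangcheck,
i915_hangcheck_ring_hung() and the RING_WAIT/semaphore checks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-ring hangcheck state. */
struct ring_model {
	uint32_t last_seqno;	/* seqno seen at the previous hangcheck */
	int score;		/* consecutive hangchecks without progress */
};

/*
 * Hypothetical stand-in for i915_hangcheck_ring_hung(): returns false
 * when a kick could be applied (the ring gets another chance), true
 * when there was nothing to kick and the ring is considered hung.
 */
static bool model_ring_hung(bool kickable)
{
	return !kickable;
}

/* One hangcheck pass over a busy ring, following the patched logic. */
static void model_hangcheck(struct ring_model *ring, uint32_t seqno,
			    bool stuck, bool kickable)
{
	bool err = false;

	assert(ring);

	if (ring->last_seqno == seqno) {
		/* No progress since the last check. */
		err = true;
		if (stuck)
			err = model_ring_hung(kickable);
	}

	if (err)
		ring->score++;
	else
		ring->score = 0;	/* progress, or a successful kick */

	ring->last_seqno = seqno;
}
```

With this model, a ring whose seqno stays unchanged accumulates score on
each pass, and a single successful kick drops the score back to zero, so
the ring has to fail hangcheck again before it is blamed.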

diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 85694d7..3b3f080 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -2372,16 +2372,26 @@ static bool semaphore_passed(struct intel_ring_buffer *ring)
 				 ioread32(ring->virtual_start+acthd+4)+1);
 }
 
-static bool kick_ring(struct intel_ring_buffer *ring)
+static bool i915_hangcheck_ring_hung(struct intel_ring_buffer *ring)
 {
 	struct drm_device *dev = ring->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	u32 tmp = I915_READ_CTL(ring);
+	u32 tmp;
+
+	if (IS_GEN2(dev))
+		return true;
+
+	/* Is the chip hanging on a WAIT_FOR_EVENT?
+	 * If so we can simply poke the RB_WAIT bit
+	 * and break the hang. This should work on
+	 * all but the second generation chipsets.
+	 */
+	tmp = I915_READ_CTL(ring);
 	if (tmp & RING_WAIT) {
 		DRM_ERROR("Kicking stuck wait on %s\n",
 			  ring->name);
 		I915_WRITE_CTL(ring, tmp);
-		return true;
+		return false;
 	}
 
 	if (INTEL_INFO(dev)->gen >= 6 &&
@@ -2390,22 +2400,10 @@ static bool kick_ring(struct intel_ring_buffer *ring)
 		DRM_ERROR("Kicking stuck semaphore on %s\n",
 			  ring->name);
 		I915_WRITE_CTL(ring, tmp);
-		return true;
-	}
-	return false;
-}
-
-static bool i915_hangcheck_ring_hung(struct intel_ring_buffer *ring)
-{
-	if (IS_GEN2(ring->dev))
 		return false;
+	}
 
-	/* Is the chip hanging on a WAIT_FOR_EVENT?
-	 * If so we can simply poke the RB_WAIT bit
-	 * and break the hang. This should work on
-	 * all but the second generation chipsets.
-	 */
-	return !kick_ring(ring);
+	return true;
 }
 
 /**
@@ -2445,15 +2443,18 @@ void i915_hangcheck_elapsed(unsigned long data)
 		} else {
 			busy_count++;
 
+			err = false;
 			if (ring->hangcheck.seqno == seqno) {
-				ring->hangcheck.score++;
-
-				/* Kick ring if stuck*/
+				/* Kick ring if stuck */
+				err = true;
 				if (stuck[i])
-					i915_hangcheck_ring_hung(ring);
-			} else {
-				ring->hangcheck.score = 0;
+					err = i915_hangcheck_ring_hung(ring);
 			}
+
+			if (err)
+				ring->hangcheck.score++;
+			else
+				ring->hangcheck.score = 0;
 		}
 
 		ring->hangcheck.seqno = seqno;
-- 
1.7.10.4



