[Intel-gfx] [PATCH v2] drm/i915: Prevent TLB error on first execution on SNB

Chris Wilson chris at chris-wilson.co.uk
Fri Feb 13 06:35:59 PST 2015


Long ago I found that I was getting sporadic errors when booting SNB,
with the symptom being that the first batch died with IPEHR != *ACTHD,
typically caused by the TLB being invalid. These magically disappeared
if I held the forcewake during the entire ring initialisation sequence.
(It can probably be shortened to a short critical section, but the whole
initialisation is full of register writes and so we would be taking and
releasing forcewake almost continually, and so holding it over the
entire sequence will probably be a net win!)

Note some of the kernels I encounted the issue already had the deferred
forcewake release, so it is still relevant.

I know that there have been a few other reports with similar failure
conditions on SNB, I think such as
References: https://bugs.freedesktop.org/show_bug.cgi?id=80913

v2: Wrap i915_gem_init_hw() with its own security blanket as we take
that path following resume and reset.

Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_gem.c | 18 ++++++++++++++++--
 1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 8d15c8110962..08450922f373 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4783,6 +4783,9 @@ i915_gem_init_hw(struct drm_device *dev)
 	if (INTEL_INFO(dev)->gen < 6 && !intel_enable_gtt())
 		return -EIO;
 
+	/* Double layer security blanket, see i915_gem_init() */
+	intel_uncore_forcewake_get(dev_priv, FORCEWAKE_ALL);
+
 	if (dev_priv->ellc_size)
 		I915_WRITE(HSW_IDICR, I915_READ(HSW_IDICR) | IDIHASHMSK(0xf));
 
@@ -4815,7 +4818,7 @@ i915_gem_init_hw(struct drm_device *dev)
 	for_each_ring(ring, dev_priv, i) {
 		ret = ring->init_hw(ring);
 		if (ret)
-			return ret;
+			goto out;
 	}
 
 	for (i = 0; i < NUM_L3_SLICES(dev); i++)
@@ -4832,9 +4835,11 @@ i915_gem_init_hw(struct drm_device *dev)
 		DRM_ERROR("Context enable failed %d\n", ret);
 		i915_gem_cleanup_ringbuffer(dev);
 
-		return ret;
+		goto out;
 	}
 
+out:
+	intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
 	return ret;
 }
 
@@ -4868,6 +4873,14 @@ int i915_gem_init(struct drm_device *dev)
 		dev_priv->gt.stop_ring = intel_logical_ring_stop;
 	}
 
+	/* This is just a security blanket to placate dragons.
+	 * On some systems, we very sporadically observe that the first TLBs
+	 * used by the CS may be stale, despite us poking the TLB reset. If
+	 * we hold the forcewake during initialisation these problems
+	 * just magically go away.
+	 */
+	intel_uncore_forcewake_get(dev_priv, FORCEWAKE_ALL);
+
 	ret = i915_gem_init_userptr(dev);
 	if (ret)
 		goto out_unlock;
@@ -4894,6 +4907,7 @@ int i915_gem_init(struct drm_device *dev)
 	}
 
 out_unlock:
+	intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
 	mutex_unlock(&dev->struct_mutex);
 
 	return ret;
-- 
2.1.4



More information about the Intel-gfx mailing list