[Intel-gfx] [PATCH 1/2] drm/i915: Fix incoherence with fence updates on Sandybridge+

Chris Wilson chris at chris-wilson.co.uk
Tue Jul 9 18:54:39 CEST 2013


This hopefully fixes the root cause behind the workaround from

commit 25ff1195f8a0b3724541ae7bbe331b4296de9c06
Author: Chris Wilson <chris at chris-wilson.co.uk>
Date:   Thu Apr 4 21:31:03 2013 +0100

    drm/i915: Workaround incoherence between fences and LLC across multiple CPUs

Thanks to further investigation by Jon Bloomfield, he realised that
the 64-bit register might be broken up by the hardware into two 32-bit
writes (a problem we have encountered elsewhere). This non-atomicity
would then cause an issue where a second thread would see a random
content of its thread register, and so read/write into a fairly random
tiled location. Breaking the operation into 3 explicit 32-bit updates
(first disable the fence, poke the upper bits, then poke the lower bits
and enable) ensures that, given proper serialisation between the
32-bit register write and the memory transfer, that the fence value is
always consistent.

Note to self: this implies that we have multiple threads hitting the
fence whilst it is being updated, but the sequence of steps we take to
quiesce the memory accesses from the old and new fenced objects across
the update should prevent anyone using the fence register during the
update.

However, Daniel points out that the intermediate fence value may be seen
by some other random thread as the intermediate value conflicts with its
own fence register.

Signed-off-by: Jon Bloomfield <jon.bloomfield at intel.com>
Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter at ffwll.ch>
Cc: Carsten Emde <C.Emde at osadl.org>
Cc: stable at vger.kernel.org
---
 drivers/gpu/drm/i915/i915_gem.c | 14 +++++++++-----
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 3406c76..ce46e777 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2777,7 +2777,6 @@ static void i965_write_fence_reg(struct drm_device *dev, int reg,
 	drm_i915_private_t *dev_priv = dev->dev_private;
 	int fence_reg;
 	int fence_pitch_shift;
-	uint64_t val;
 
 	if (INTEL_INFO(dev)->gen >= 6) {
 		fence_reg = FENCE_REG_SANDYBRIDGE_0;
@@ -2787,8 +2786,11 @@ static void i965_write_fence_reg(struct drm_device *dev, int reg,
 		fence_pitch_shift = I965_FENCE_PITCH_SHIFT;
 	}
 
+	fence_reg += reg * 8;
+
 	if (obj) {
 		u32 size = i915_gem_obj_ggtt_size(obj);
+		uint64_t val;
 
 		val = (uint64_t)((i915_gem_obj_ggtt_offset(obj) + size - 4096) &
 				 0xfffff000) << 32;
@@ -2796,12 +2798,14 @@ static void i965_write_fence_reg(struct drm_device *dev, int reg,
 		val |= (uint64_t)((obj->stride / 128) - 1) << fence_pitch_shift;
 		if (obj->tiling_mode == I915_TILING_Y)
 			val |= 1 << I965_FENCE_TILING_Y_SHIFT;
-		val |= I965_FENCE_REG_VALID;
+
+		/* W/a incoherency with non-atomic 64-bit register updates */
+		I915_WRITE(fence_reg + 0, (uint32_t)val); /* first we disable */
+		I915_WRITE(fence_reg + 4, (uint32_t)(val >> 32));
+		I915_WRITE(fence_reg + 0, (uint32_t)(val | I965_FENCE_REG_VALID));
 	} else
-		val = 0;
+		I915_WRITE64(fence_reg, 0);
 
-	fence_reg += reg * 8;
-	I915_WRITE64(fence_reg, val);
 	POSTING_READ(fence_reg);
 }
 
-- 
1.8.3.2




More information about the Intel-gfx mailing list