[Intel-gfx] [PATCH] drm/i915/hsw: enable atomic in L3 for some steppings.

Zhigang Gong zhigang.gong at intel.com
Sun Jan 4 17:05:50 PST 2015


According to the bspec, bit 6 of ROW_CHICKEN3, which disables the move
of cacheable global atomics to L3, is needed for the GT3 D stepping.

I enabled atomics in L3 and tested on HSW GT2 D stepping and GT3 E stepping.
Atomics work fine in beignet, and some workloads run more than 10x faster.
For example, the "splat" kernel in darktable takes 50 seconds to process one
large raw picture without this patch, and less than 1 second with it.

Signed-off-by: Zhigang Gong <zhigang.gong at intel.com>
---
 drivers/gpu/drm/i915/intel_pm.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)
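
For readers unfamiliar with masked registers, here is a minimal standalone
sketch of what the gated write amounts to. It is a model, not driver code:
MASKED_BIT_ENABLE is assumed to behave like the i915 macro of similar name
(upper 16 bits act as a write mask for the lower 16), the bit position comes
from the commit message, and needs_l3_atomics_disable() is a hypothetical
helper that only mirrors the condition used in the diff.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Masked-register encoding (assumed to match the i915 macro): the upper
 * 16 bits select which of the lower 16 bits the hardware updates, so a
 * single write can set one bit without a read-modify-write cycle.
 */
#define MASKED_BIT_ENABLE(a) ((uint32_t)(((a) << 16) | (a)))

/* Bit 6 of ROW_CHICKEN3, as described in the commit message. */
#define L3_GLOBAL_ATOMICS_DISABLE (1u << 6)

/*
 * Hypothetical helper mirroring the condition in the diff:
 * IS_HSW_GT3(dev) && dev->pdev->revision <= 6.
 */
bool needs_l3_atomics_disable(bool is_gt3, unsigned int revision)
{
	return is_gt3 && revision <= 6;
}

/*
 * Value that would be written to ROW_CHICKEN3 for a given part, or 0 when
 * the write is skipped and L3 atomics stay enabled.
 */
uint32_t row_chicken3_value(bool is_gt3, unsigned int revision)
{
	if (!needs_l3_atomics_disable(is_gt3, revision))
		return 0;
	return MASKED_BIT_ENABLE(L3_GLOBAL_ATOMICS_DISABLE);
}
```

With this model, a GT3 part at revision 6 gets the disable bit written, while
any GT2 part or a GT3 part at a later revision skips the write entirely.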

diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 7d99a9c..8a27802 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -5938,10 +5938,12 @@ static void haswell_init_clock_gating(struct drm_device *dev)
 
 	ilk_init_lp_watermarks(dev);
 
-	/* L3 caching of data atomics doesn't work -- disable it. */
-	I915_WRITE(HSW_SCRATCH1, HSW_SCRATCH1_L3_DATA_ATOMICS_DISABLE);
-	I915_WRITE(HSW_ROW_CHICKEN3,
-		   _MASKED_BIT_ENABLE(HSW_ROW_CHICKEN3_L3_GLOBAL_ATOMICS_DISABLE));
+	if (IS_HSW_GT3(dev) && dev->pdev->revision <= 6) {
+		/* L3 caching of data atomics doesn't work -- disable it. */
+		I915_WRITE(HSW_SCRATCH1, HSW_SCRATCH1_L3_DATA_ATOMICS_DISABLE);
+		I915_WRITE(HSW_ROW_CHICKEN3,
+			   _MASKED_BIT_ENABLE(HSW_ROW_CHICKEN3_L3_GLOBAL_ATOMICS_DISABLE));
+	}
 
 	/* This is required by WaCatErrorRejectionIssue:hsw */
 	I915_WRITE(GEN7_SQ_CHICKEN_MBCUNIT_CONFIG,
-- 
1.8.3.2


