[Intel-gfx] [PATCH] drm/i915/hsw: enable atomic in L3 for some steppings.

Francisco Jerez currojerez at riseup.net
Sun Jan 4 19:03:16 PST 2015


Zhigang Gong <zhigang.gong at intel.com> writes:

> According to bspec, bit 6 of ROW_CHICKEN3, which disables moving
> cacheable global atomics to L3, is needed for GT3 D
> stepping.
>
> I enabled it and tested it with HSW GT2 D stepping and GT3 E stepping.
> Atomics work fine in beignet, and some workloads get more than a 10x
> performance improvement. For example, the "splat" kernel in darktable takes
> 50 seconds to process one large raw picture without this patch, but less
> than 1 second with it.
>

I tried this already (on HSW GT2 D as well) and I don't think it's
enough to get L3 atomics working reliably.  Even though they seemed to
work OK at first glance, I observed some corruption issues (e.g. atomic
writes not landing in system memory) when doing atomic writes to
adjacent locations in memory (i.e. within the same cache line).  The
"unused" ARB_shader_image_load_store test [1] I sent to the Piglit
mailing list some time ago exposes this IIRC, and probably a couple of
other tests do too.

Also, this change is going to cause an instant lock-up any time Mesa uses
atomics, because Mesa doesn't change the default L3 way allocation for
the DC, which turns out to be 0 on HSW.

[1] http://lists.freedesktop.org/archives/piglit/2014-December/013571.html

> Signed-off-by: Zhigang Gong <zhigang.gong at intel.com>
> ---
>  drivers/gpu/drm/i915/intel_pm.c | 10 ++++++----
>  1 file changed, 6 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> index 7d99a9c..8a27802 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -5938,10 +5938,12 @@ static void haswell_init_clock_gating(struct drm_device *dev)
>  
>  	ilk_init_lp_watermarks(dev);
>  
> -	/* L3 caching of data atomics doesn't work -- disable it. */
> -	I915_WRITE(HSW_SCRATCH1, HSW_SCRATCH1_L3_DATA_ATOMICS_DISABLE);
> -	I915_WRITE(HSW_ROW_CHICKEN3,
> -		   _MASKED_BIT_ENABLE(HSW_ROW_CHICKEN3_L3_GLOBAL_ATOMICS_DISABLE));
> +	if (IS_HSW_GT3(dev) && dev->pdev->revision <= 6) {
> +		/* L3 caching of data atomics doesn't work -- disable it. */
> +		I915_WRITE(HSW_SCRATCH1, HSW_SCRATCH1_L3_DATA_ATOMICS_DISABLE);
> +		I915_WRITE(HSW_ROW_CHICKEN3,
> +			   _MASKED_BIT_ENABLE(HSW_ROW_CHICKEN3_L3_GLOBAL_ATOMICS_DISABLE));
> +	}
>  
>  	/* This is required by WaCatErrorRejectionIssue:hsw */
>  	I915_WRITE(GEN7_SQ_CHICKEN_MBCUNIT_CONFIG,
> -- 
> 1.8.3.2