[Intel-gfx] [PATCH] drm/i915: Remove WaDisableLSQCROPERFforOCL KBL workaround.
chris at chris-wilson.co.uk
Thu Jan 12 10:56:38 UTC 2017
On Thu, Jan 12, 2017 at 12:44:54PM +0200, Mika Kuoppala wrote:
> From: Francisco Jerez <currojerez at riseup.net>
> The WaDisableLSQCROPERFforOCL workaround has the side effect of
> disabling an L3SQ optimization that has huge performance implications
> and is unlikely to be necessary for the correct functioning of usual
> graphic workloads. Userspace is free to re-enable the workaround on
> demand, and is generally in a better position to determine whether the
> workaround is necessary than the DRM is (e.g. only during the
> execution of compute kernels that rely on both L3 fences and HDC R/W
> requests).
> The same workaround seems to apply to BDW (at least to production
> stepping G1) and SKL as well (the internal workaround database claims
> that it does for all steppings, while the BSpec workaround table only
> mentions pre-production steppings), but the DRM doesn't do anything
> beyond whitelisting the L3SQCREG4 register so userspace can enable it
> when it sees fit. Do the same on KBL platforms.
> Improves performance of the GFXBench4 gl_manhattan31 benchmark by 60%,
> and gl_4 (AKA car chase) by 14% on a KBL GT2 running Mesa master --
> This is followed by a regression of 35% and 10% respectively for the
> same benchmarks and platform caused by my recent patch series
> switching userspace to use the dataport constant cache instead of the
> sampler to implement uniform pull constant loads, which caused us to
> hit the L3 cache more heavily (and on platforms other than KBL had the
> opposite effect of improving performance of the same two benchmarks).
> The overall effect on KBL of this change combined with the recent
> userspace change is respectively 4.6% and 2.6%. SynMark2 OglShMapPcf
> was affected by the constant cache changes (though it improved as it
> did on other platforms rather than regressing), but is not
> significantly affected by this patch (with statistical significance of
> 5% and sample size 20).
> v2: Drop some more code to avoid unused variable warning.
> Fixes: Fixes: 738fa1b3123f ("drm/i915/kbl: Add WaDisableLSQCROPERFforOCL")
Once is enough :)
Chris Wilson, Intel Open Source Technology Centre
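
As a rough sanity check on the quoted benchmark figures, the two changes can be composed multiplicatively. This is only a sketch: it assumes each percentage is relative to the score immediately before that change, and it uses the rounded numbers from the message, so it will not reproduce the quoted 4.6% exactly. The helper name combined_change is made up for illustration.

```python
def combined_change(*changes_pct):
    """Compose successive relative changes, each given in percent.

    Assumes every change is relative to the score just before it,
    so the scaling factors simply multiply.
    """
    factor = 1.0
    for pct in changes_pct:
        factor *= 1.0 + pct / 100.0
    return (factor - 1.0) * 100.0


# Rounded figures from the quoted commit message (KBL GT2):
# userspace constant-cache change, then this kernel patch.
manhattan = combined_change(-35.0, 60.0)  # gl_manhattan31
car_chase = combined_change(-10.0, 14.0)  # gl_4 (car chase)

print(f"gl_manhattan31: {manhattan:+.1f}%")
print(f"gl_4:           {car_chase:+.1f}%")
```

With the rounded inputs this gives roughly +4% and +2.6%, in line with the overall effect quoted above. The gap on the first benchmark is within what rounding of the 35% and 60% figures would explain.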