[Beignet] [PATCH] drm/i915: Remove WaDisableLSQCROPERFforOCL KBL workaround.
Francisco Jerez
currojerez at riseup.net
Tue Jan 31 05:11:43 UTC 2017
Greg KH <gregkh at linuxfoundation.org> writes:
> On Tue, Jan 31, 2017 at 06:11:25AM +0100, Greg KH wrote:
>> On Mon, Jan 30, 2017 at 12:24:45PM -0800, Francisco Jerez wrote:
>> > The WaDisableLSQCROPERFforOCL workaround has the side effect of
>> > disabling an L3SQ optimization that has huge performance implications
>> > and is unlikely to be necessary for the correct functioning of usual
>> > graphics workloads. Userspace is free to re-enable the workaround on
>> > demand, and is generally in a better position to determine whether the
>> > workaround is necessary than the DRM is (e.g. only during the
>> > execution of compute kernels that rely on both L3 fences and HDC R/W
>> > requests).
>> >
>> > The same workaround seems to apply to BDW (at least to production
>> > stepping G1) and SKL as well (the internal workaround database claims
>> > that it does for all steppings, while the BSpec workaround table only
>> > mentions pre-production steppings), but the DRM doesn't do anything
>> > beyond whitelisting the L3SQCREG4 register so userspace can enable it
>> > when it sees fit. Do the same on KBL platforms.
>> >
>> > Improves performance of the GFXBench4 gl_manhattan31 benchmark by 60%,
>> > and gl_4 (AKA car chase) by 14% on a KBL GT2 running Mesa master --
>> > This is followed by a regression of 35% and 10% respectively for the
>> > same benchmarks and platform caused by my recent patch series
>> > switching userspace to use the dataport constant cache instead of the
>> > sampler to implement uniform pull constant loads, which caused us to
>> > hit the L3 cache more heavily (and on platforms other than KBL had the
>> > opposite effect of improving performance of the same two benchmarks).
>> > The overall effect on KBL of this change combined with the recent
>> > userspace change is respectively 4.6% and 2.6%. SynMark2 OglShMapPcf
>> > was affected by the constant cache changes (though it improved as it
>> > did on other platforms rather than regressing), but is not
>> > significantly affected by this patch (with statistical significance of
>> > 5% and sample size 20).
>> >
>> > v2: Drop some more code to avoid unused variable warning.
>> >
>> > Fixes: 738fa1b3123f ("drm/i915/kbl: Add WaDisableLSQCROPERFforOCL")
>> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99256
>> > Signed-off-by: Francisco Jerez <currojerez at riseup.net>
>> > Cc: Matthew Auld <matthew.william.auld at gmail.com>
>> > Cc: Eero Tamminen <eero.t.tamminen at intel.com>
>> > Cc: Jani Nikula <jani.nikula at intel.com>
>> > Cc: Mika Kuoppala <mika.kuoppala at intel.com>
>> > Cc: beignet at lists.freedesktop.org
>> > Cc: <stable at vger.kernel.org> # v4.7+
>> > Reviewed-by: Mika Kuoppala <mika.kuoppala at intel.com>
>> > [Removed double Fixes tag]
>> > Signed-off-by: Mika Kuoppala <mika.kuoppala at intel.com>
>> > Link: http://patchwork.freedesktop.org/patch/msgid/1484217894-20505-1-git-send-email-mika.kuoppala@intel.com
>> > (cherry picked from commit 8726f2faa371514fba2f594d799db95203dfeee0)
>> > Signed-off-by: Jani Nikula <jani.nikula at intel.com>
>> > [ Francisco Jerez: Rebase on v4.9 branch. ]
>> > Signed-off-by: Francisco Jerez <currojerez at riseup.net>
>> > ---
>> > drivers/gpu/drm/i915/intel_lrc.c | 3 +--
>> > drivers/gpu/drm/i915/intel_ringbuffer.c | 8 --------
>> > 2 files changed, 1 insertion(+), 10 deletions(-)
>>
>> What is the commit id of this patch in Linus's tree?
>
> Ah, never mind, it's 4fc020d864647ea3ae8cb8f17d63e48e87ebd0bf, right?
>
Oops, yes, that's right. Thanks!

> thanks,
>
> greg k-h
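
As a sanity check on the benchmark figures quoted in the commit message
above, here is a minimal sketch of how successive percentage changes
compose. It assumes the quoted gains and regressions multiply, and the
helper name combined_change is purely illustrative. With the rounded
inputs from the message it yields roughly 4.0% and 2.6%; the 4.6% quoted
for gl_manhattan31 presumably reflects unrounded measurements.

```python
def combined_change(*changes_pct):
    """Compose successive percentage changes multiplicatively.

    Each argument is a signed percentage, e.g. +60.0 for a 60% gain
    or -35.0 for a 35% regression. Returns the net change in percent.
    """
    factor = 1.0
    for pct in changes_pct:
        factor *= 1.0 + pct / 100.0
    return (factor - 1.0) * 100.0


# Rounded figures taken from the commit message (KBL GT2).
manhattan = combined_change(+60.0, -35.0)  # kernel patch, userspace change
car_chase = combined_change(+14.0, -10.0)

print(f"gl_manhattan31 net change: {manhattan:+.1f}%")
print(f"gl_4 (car chase) net change: {car_chase:+.1f}%")
```

The point of the sketch is only that a 60% gain and a 35% regression do
not cancel to 25%: the net effect is the product of the two factors.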