[Mesa-dev] [Bug 97549] [SNB, BXT] up to 40% perf drop from "loader/dri3: Overhaul dri3_update_num_back" commit
bugzilla-daemon at freedesktop.org
Wed Aug 31 13:55:01 UTC 2016
https://bugs.freedesktop.org/show_bug.cgi?id=97549
Bug ID: 97549
Summary: [SNB, BXT] up to 40% perf drop from "loader/dri3:
Overhaul dri3_update_num_back" commit
Product: Mesa
Version: git
Hardware: x86-64 (AMD64)
OS: All
Status: NEW
Severity: normal
Priority: high
Component: Other
Assignee: mesa-dev at lists.freedesktop.org
Reporter: eero.t.tamminen at intel.com
QA Contact: mesa-dev at lists.freedesktop.org
CC: michel at daenzer.net
The following commit causes a large performance regression with DRI3 in
synthetic benchmarks on both Sandybridge and Broxton.
commit 1e3218bc5ba2b739261f0c0bacf4eb662d377236
Author: Michel Dänzer <michel.daenzer at amd.com>
AuthorDate: Wed Aug 17 17:02:04 2016 +0900
Commit: Michel Dänzer <michel at daenzer.net>
CommitDate: Thu Aug 25 17:40:24 2016 +0900
loader/dri3: Overhaul dri3_update_num_back
Always use 3 buffers when flipping. With only 2 buffers, we have to wait
for a flip to complete (which takes non-0 time even with asynchronous
flips) before we can start working on the next frame. We were previously
only using 2 buffers for flipping if the X server supports asynchronous
flips, even when we're not using asynchronous flips. This could result
in bad performance (the referenced bug report is an extreme case, where
the inter-frame stalls were preventing the GPU from reaching its maximum
clocks).
I couldn't measure any performance boost using 4 buffers with flipping.
Performance actually seemed to go down slightly, but that might have
been just noise.
Without flipping, a single back buffer is enough for swap interval 0,
but we need to use 2 back buffers when the swap interval is non-0,
otherwise we have to wait for the swap interval to pass before we can
start working on the next frame. This condition was previously reversed.
Cc: "12.0 11.2" <mesa-stable at lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97260
Reviewed-by: Frank Binns <frank.binns at imgtec.com>
Reviewed-by: Eric Anholt <eric at anholt.net>
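For reference, the buffer-count policy described in the commit message above can be sketched roughly as follows. This is only an illustration, not the actual Mesa code: the names `sketch_num_back` and `sketch_self_check` and the parameters are hypothetical, and the sketch assumes that the "3 buffers" in the flipping case counts back buffers, like the other two cases.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper (not the real Mesa function): returns the number
 * of back buffers the commit message says should be used.
 *
 *   flipping       - whether presentation is done by page flipping
 *   swap_interval  - the current swap interval (0 = no vsync wait)
 */
int sketch_num_back(bool flipping, int swap_interval)
{
    if (flipping)
        return 3;  /* avoid waiting for a flip to complete */
    if (swap_interval != 0)
        return 2;  /* avoid waiting for the swap interval to pass */
    return 1;      /* no flipping, swap interval 0 */
}

/* Checks the three cases named in the commit message. */
void sketch_self_check(void)
{
    assert(sketch_num_back(true, 0) == 3);
    assert(sketch_num_back(true, 1) == 3);
    assert(sketch_num_back(false, 1) == 2);
    assert(sketch_num_back(false, 0) == 1);
}
```

In this sketch the flipping case takes priority over the swap interval, which matches the "always use 3 buffers when flipping" wording.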
Reverting the patch restores the earlier performance (the bisect was done on
Broxton and the revert was tested on Sandybridge, so the same commit is the
problem on both).
The impact is larger for tests with higher FPS, and naturally only the onscreen
versions of the tests are affected. Both fullscreen and windowed+composited
tests were affected.
On Sandybridge the impact is up to 35% (SynMark Batch tests), 25% in the GpuTest
Triangle test, and less in other tests.
On Broxton the drop affects more tests (because of the faster GPU, heavier tests
run at higher FPS), even a few tests that are normally fully ALU bound:
* SynMark v6: up to 40% (Batch tests)
* GfxBench v4: 35% ALU, 25% Driver, 10% Tess tests
* Lightsmark 2008: 20%
* GpuTest 0.7: 15% Triangle, Julia32 & Plot3D tests
* GLB 2.7: 10% Egypt
The change doesn't seem to affect HSW, BDW or SKL. I don't know why.
The issue doesn't seem to be related to FPS (occasionally) being limited to some
multiple of 60 FPS, as in the earlier DRI3 perf bug. My assumption is that
the buffering change indirectly affects some memory setting, but I don't know
which, as SNB & BXT differ in that respect:
* SNB has LLC, but BXT doesn't
* AFAIK Intel DDX supports SNA for SNB, but not yet for BXT