<html>
<head>
<base href="https://bugs.freedesktop.org/" />
</head>
<body>
<p>
<div>
<b><a class="bz_bug_link
bz_status_RESOLVED bz_closed"
title="RESOLVED FIXED - [IVB/HSW/BYT-M Bisected]3D ( SynMark2_v5.3.0 /SynMark2_v6.0 /Lightsmarkv2008 &/warsow_v1.0 /GLBenchmarkv2.7.0/GpuTest GiMark) performance reduced 20%~90%"
href="https://bugs.freedesktop.org/show_bug.cgi?id=80792#c12">Comment # 12</a>
on <a class="bz_bug_link
bz_status_RESOLVED bz_closed"
title="RESOLVED FIXED - [IVB/HSW/BYT-M Bisected]3D ( SynMark2_v5.3.0 /SynMark2_v6.0 /Lightsmarkv2008 &/warsow_v1.0 /GLBenchmarkv2.7.0/GpuTest GiMark) performance reduced 20%~90%"
href="https://bugs.freedesktop.org/show_bug.cgi?id=80792">bug 80792</a>
from <span class="vcard"><a class="email" href="mailto:eero.t.tamminen@intel.com" title="Eero Tamminen <eero.t.tamminen@intel.com>"> <span class="fn">Eero Tamminen</span></a>
</span></b>
<pre>(In reply to <a href="show_bug.cgi?id=80792#c9">comment #9</a>)
<span class="quote">> But zhoujian's comment above suggested that the patches I uploaded here
> fixed the performance drop on the reported platforms... Eero are you testing
> this with master and the two attached patches on top or with a clean
> checkout of master? I think Kenneth has not pushed these patches to master
> yet.</span >
Kenneth commented that they had failures, so they're no-go.
But I had tested performance with them: Egypt, T-Rex and GiMark still have a
small (4-5%) regression, and the basic TriList test has a >30% regression (on HSW
GT3e). IMHO that's too much even if the patches worked.
FYI: the max number of primitives per frame is fairly high in these tests:
- T-Rex: 3.8M
- Egypt: 3.2M
- GiMark: 38.0M (without instancing 420)
- Valley: 6.4M (without instancing 4.2M)
- Heaven: 3.8M (without instancing 3.2M)
- TriList: 6.0M
(In reply to <a href="show_bug.cgi?id=80792#c11">comment #11</a>)
<span class="quote">> We'll have to come up with a proper implementation for
> GL_PRIMITIVES_GENERATED in the multiple streams case. It looks like none of
> the counters work in all cases. We may need to use atomics in the geometry
> shader to count things manually for streams 1-3, and hack stream 0 to use
> CL_INVOCATIONS_COUNT unless a GS program that uses streams is active, then
> use atomics, and add both sources together to get the final value. Ugly,
> but I can't think of anything better...</span >
Do you mean that the original patch didn't provide the correct
GL_PRIMITIVES_GENERATED counts, or that there are some other problems with just
doing:
  if geometry shader
    use SO_PRIM_STORAGE_NEEDED // for all streams
  else
    use CL_INVOCATIONS_COUNT
(Is there some way to use non-zero streams without geometry shader?)</pre>
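<pre>For reference, a minimal sketch of the two-way selection asked about above, written
as plain Python. It only models the decision and does not touch any hardware or
driver code. The counter names come from this thread; the function name and the
boolean parameter are hypothetical, and whether this simple rule gives correct
GL_PRIMITIVES_GENERATED values is exactly the open question.

```python
from enum import Enum


class Counter(Enum):
    # Counter names as mentioned in this bug; the enum itself is illustrative.
    SO_PRIM_STORAGE_NEEDED = "SO_PRIM_STORAGE_NEEDED"
    CL_INVOCATIONS_COUNT = "CL_INVOCATIONS_COUNT"


def pick_primitives_generated_counter(geometry_shader_active):
    """Hypothetical helper: choose which counter backs the query.

    Assumption: with a geometry shader bound, SO_PRIM_STORAGE_NEEDED is
    used for all streams; otherwise CL_INVOCATIONS_COUNT is used.
    """
    if geometry_shader_active:
        return Counter.SO_PRIM_STORAGE_NEEDED
    return Counter.CL_INVOCATIONS_COUNT
```

The sketch keys only on whether a geometry shader is active, so it deliberately
ignores the per-stream atomics approach proposed in comment #11.</pre>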
</div>
</p>
</body>
</html>