[Bug 80792] [IVB/HSW/BYT-M Bisected]3D ( SynMark2_v5.3.0 /SynMark2_v6.0 /Lightsmarkv2008 &/warsow_v1.0 /GLBenchmarkv2.7.0/GpuTest GiMark) performance reduced 20%~90%

bugzilla-daemon at freedesktop.org
Fri Jul 18 01:17:10 PDT 2014


https://bugs.freedesktop.org/show_bug.cgi?id=80792

--- Comment #13 from Iago Toral <itoral at igalia.com> ---
(In reply to comment #12)
> (In reply to comment #9)
> > But zhoujian's comment above suggested that the patches I uploaded here
> > fixed the performance drop on the reported platforms... Eero are you testing
> > this with master and the two attached patches on top or with a clean
> > checkout of master? I think Kenneth has not pushed these patches to master
> > yet.
> 
> Kenneth commented that they had failures, so they're no-go.
> 
> But I had tested performance with them: Egypt, T-Rex and GiMark still have
> a regression of a few percent (4-5%), and the basic TriList test has a >30%
> regression (on HSW GT3e).  IMHO it's too much even if the patch worked.

Yeah, I agree.

> FYI: the maximum number of primitives per frame is fairly high in these
> tests:
> - T-Rex:   3.8M
> - Egypt:   3.2M
> - GiMark: 38.0M  (without instancing 420)
> - Valley:  6.4M  (without instancing 4.2M)
> - Heaven:  3.8M  (without instancing 3.2M)
> - TriList: 6.0M 
> 
> 
> (In reply to comment #11)
> > We'll have to come up with a proper implementation for
> > GL_PRIMITIVES_GENERATED in the multiple streams case.  It looks like none of
> > the counters work in all cases.  We may need to use atomics in the geometry
> > shader to count things manually for streams 1-3, and hack stream 0 to use
> > CL_INVOCATIONS_COUNT unless a GS program that uses streams is active, then
> > use atomics, and add both sources together to get the final value.  Ugly,
> > but I can't think of anything better...
> 
> Do you mean that the original patch didn't provide the correct
> GL_PRIMITIVES_GENERATED counts, or that there are some other problems with
> just doing:
>   if geometry shader
>     use SO_PRIM_STORAGE_NEEDED // for all streams
>   else
>     use CL_INVOCATIONS_COUNT

I think there are two problems here:

One is performance, and I don't think we want that penalty even if we could
limit it to the case where a geometry shader is active. The cost comes from
having to enable the SOL unit even when we don't need to do transform
feedback, and that seems to come with a severe performance penalty, as you
discovered.

Besides that, I think there are some other problems specific to Haswell where
this patch is not producing correct counts. I think that multi-stream patches
have not been ported to Haswell yet though, so maybe it is related to that but
it could also be something else. Maybe Kenneth can confirm this.

> (Is there some way to use non-zero streams without geometry shader?)

No, streams are only available in geometry shaders.
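
To make the two counting schemes discussed above concrete, here is a rough
sketch in C. It is not driver code: the struct, the enum and both function
names are hypothetical, and it assumes that the clipper count is only
accumulated for draws where no stream-using geometry shader is bound, which is
just one reading of the proposal in comment #11.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_VERTEX_STREAMS 4

/* Hypothetical names for the two hardware counters mentioned in the thread. */
enum counter_source {
   SRC_CL_INVOCATIONS_COUNT,
   SRC_SO_PRIM_STORAGE_NEEDED,
};

/* The simple selection quoted from comment #12: use the SOL counter whenever
 * a geometry shader is active, otherwise the clipper counter. */
static enum counter_source
pick_counter_source(bool gs_active)
{
   return gs_active ? SRC_SO_PRIM_STORAGE_NEEDED : SRC_CL_INVOCATIONS_COUNT;
}

/* Hypothetical snapshot of what a query would have gathered under the scheme
 * from comment #11. */
struct prim_counters {
   /* Clipper count; assumed to cover only draws without a stream-using GS. */
   uint64_t cl_invocations;
   /* Per-stream counts maintained with atomics in the geometry shader. */
   uint64_t gs_atomic[MAX_VERTEX_STREAMS];
};

/* Final GL_PRIMITIVES_GENERATED value for one stream: streams 1-3 come from
 * the atomics only, stream 0 adds both sources together. */
static uint64_t
primitives_generated(const struct prim_counters *c, unsigned stream)
{
   assert(stream < MAX_VERTEX_STREAMS);

   uint64_t total = c->gs_atomic[stream];
   if (stream == 0)
      total += c->cl_invocations;
   return total;
}
```

The first function is cheap to implement but is the variant that forces the
SOL unit on; the second avoids that at the cost of extra atomics in the
geometry shader.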
