[Mesa-dev] [PATCH 00/26] RadeonSI: Primitive culling with async compute

Marek Olšák maraeo at gmail.com
Wed Feb 13 05:15:55 UTC 2019


Hi,

This patch series uses async compute to do primitive culling before
the vertex shader. It significantly improves performance for applications
that draw a lot of invisible geometry, e.g. primitives that don't
intersect any sample points, back-facing primitives, etc.
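
For the unfamiliar, here's a minimal sketch (written just for this cover
letter, not taken from the series; the names and coordinate conventions
are made up) of the two cheapest per-triangle tests such a culling shader
can do in screen space:

#include <math.h>
#include <stdbool.h>
#include <stdio.h>

/* Sketch only: assumes window-space positions with sample points at
 * pixel centers (x.5, y.5) and counter-clockwise front faces. */
static bool triangle_might_be_visible(const float x[3], const float y[3])
{
   /* Back-face test: the signed area of the projected triangle is
    * negative (or zero) for back-facing/degenerate triangles. */
   float area = (x[1] - x[0]) * (y[2] - y[0]) -
                (x[2] - x[0]) * (y[1] - y[0]);
   if (area <= 0.0f)
      return false;

   /* Small-primitive test: if the bounding box rounds to the same
    * sample grid line on an axis, no sample point can lie inside it,
    * so the triangle can't produce any fragments. */
   float min_x = fminf(fminf(x[0], x[1]), x[2]);
   float max_x = fmaxf(fmaxf(x[0], x[1]), x[2]);
   float min_y = fminf(fminf(y[0], y[1]), y[2]);
   float max_y = fmaxf(fmaxf(y[0], y[1]), y[2]);

   if (roundf(min_x) == roundf(max_x) || roundf(min_y) == roundf(max_y))
      return false;

   return true; /* keep the primitive */
}

int main(void)
{
   /* A sliver that sits between the sample columns at x=0.5 and x=1.5,
    * so the small-primitive test culls it. */
   const float x[3] = {0.6f, 0.9f, 0.7f};
   const float y[3] = {0.0f, 0.0f, 5.0f};
   printf("%s\n", triangle_might_be_visible(x, y) ? "kept" : "culled");
   return 0;
}

The actual series runs the culling in a compute shader ahead of the
vertex shader and compacts the index buffer so that only surviving
primitives are drawn; see the big code comment in the last patch for
the real details.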

It passes 99.9999% of all tests (GL CTS, dEQP, piglit) and is 100% stable.
It supports all chips from Sea Islands to Radeon VII.

As you can see in the results marked (ENABLED) in the picture below,
it destroys our competition (the GeForce results are from a 2017
Phoronix article, the latest ones I could find):

Benchmark: ParaView - Many Spheres - 2560x1440
https://people.freedesktop.org/~mareko/prim-discard-cs-results.png


The last patch describes the implementation and its functional
limitations in a big code comment, so I'm not going to repeat that here.

I decided to enable this optimization by default only on Pro graphics
cards, because I haven't had time to benchmark games. This decision may
change based on community feedback, etc.

People using Pro graphics cards can disable this by setting
AMD_DEBUG=nopd, and people using consumer graphics cards can enable it
by setting AMD_DEBUG=pd. So you always have a choice.
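
For example, to try it on a consumer card for a single run (glxgears is
just a stand-in for whatever application you want to test):

  AMD_DEBUG=pd glxgears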

Eventually we might also enable this on consumer graphics cards for those
games that benefit. It might decrease performance if there is not enough
invisible geometry.

Branch:
https://cgit.freedesktop.org/~mareko/mesa/log/?h=prim-discard-cs

Please review.

Thanks,
Marek

