[Mesa-dev] [PATCH 00/13] Threaded Gallium for RadeonSI

Rob Clark robdclark at gmail.com
Sun May 14 15:02:22 UTC 2017


On Wed, May 10, 2017 at 6:45 PM, Marek Olšák <maraeo at gmail.com> wrote:
> Hi,
>
> This series adds an optional module into gallium/util that wraps
> around pipe_context and moves execution of all pipe_context calls into
> a separate thread.
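
To illustrate the idea of moving context calls into a separate thread, here is a minimal sketch using POSIX threads. It is not the code from this series: the names `toy_queue`, `toy_enqueue`, `toy_sync` and `toy_draw` are hypothetical, and a real wrapper would record typed pipe_context calls rather than a function pointer plus an int. The application thread only enqueues calls, and a worker thread executes them against the wrapped context.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_QUEUE_SIZE 64

typedef void (*toy_call_fn)(void *ctx, int arg);

struct toy_call {
    toy_call_fn fn;
    int arg;
};

struct toy_queue {
    pthread_mutex_t lock;
    pthread_cond_t changed;   /* broadcast on every state change */
    struct toy_call calls[TOY_QUEUE_SIZE];
    unsigned head, tail;      /* tail - head = number of pending calls */
    bool busy;                /* worker is currently executing a call */
    bool stop;
    void *ctx;                /* stand-in for the wrapped driver context */
    pthread_t thread;
};

/* Worker thread: pops calls and runs them outside the lock. */
static void *toy_worker(void *data)
{
    struct toy_queue *q = data;

    pthread_mutex_lock(&q->lock);
    for (;;) {
        while (q->head == q->tail && !q->stop)
            pthread_cond_wait(&q->changed, &q->lock);
        if (q->head == q->tail)
            break;            /* stop requested and nothing left to run */

        struct toy_call call = q->calls[q->head % TOY_QUEUE_SIZE];
        q->head++;
        q->busy = true;
        pthread_cond_broadcast(&q->changed);   /* a slot was freed */
        pthread_mutex_unlock(&q->lock);

        call.fn(q->ctx, call.arg);

        pthread_mutex_lock(&q->lock);
        q->busy = false;
        pthread_cond_broadcast(&q->changed);
    }
    pthread_mutex_unlock(&q->lock);
    return NULL;
}

void toy_init(struct toy_queue *q, void *ctx)
{
    pthread_mutex_init(&q->lock, NULL);
    pthread_cond_init(&q->changed, NULL);
    q->head = q->tail = 0;
    q->busy = q->stop = false;
    q->ctx = ctx;
    int ret = pthread_create(&q->thread, NULL, toy_worker, q);
    assert(ret == 0);
    (void)ret;
}

/* Application thread: returns as soon as the call is recorded. */
void toy_enqueue(struct toy_queue *q, toy_call_fn fn, int arg)
{
    pthread_mutex_lock(&q->lock);
    while (q->tail - q->head == TOY_QUEUE_SIZE)
        pthread_cond_wait(&q->changed, &q->lock);   /* queue is full */
    q->calls[q->tail % TOY_QUEUE_SIZE] = (struct toy_call){ fn, arg };
    q->tail++;
    pthread_cond_broadcast(&q->changed);
    pthread_mutex_unlock(&q->lock);
}

/* Blocks until every enqueued call has finished executing. */
void toy_sync(struct toy_queue *q)
{
    pthread_mutex_lock(&q->lock);
    while (q->head != q->tail || q->busy)
        pthread_cond_wait(&q->changed, &q->lock);
    pthread_mutex_unlock(&q->lock);
}

void toy_destroy(struct toy_queue *q)
{
    pthread_mutex_lock(&q->lock);
    q->stop = true;
    pthread_cond_broadcast(&q->changed);
    pthread_mutex_unlock(&q->lock);
    pthread_join(q->thread, NULL);
    pthread_mutex_destroy(&q->lock);
    pthread_cond_destroy(&q->changed);
}

/* Hypothetical context entry point: accumulates into an int context. */
void toy_draw(void *ctx, int arg)
{
    *(int *)ctx += arg;
}
```

In this sketch, `toy_sync` is the expensive operation. The features listed in the cover letter are ways of avoiding the equivalent wait in the real module.
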
>
> It puts a lot of new requirements on the driver, especially on the
> thread safety of pipe_context functions, and even expects different
> behavior from pipe_context in some cases, so it may be non-trivial
> to enable. All of it is necessary for perfectly scalable threaded
> execution. (Any new drivers should be built around it from the
> beginning.)
>
> The performance improvement isn't very high (it only hides the
> overhead of pipe_context), but I have tested a lot of apps with this
> and can tell you that, for the majority of them, it really doesn't
> sync the thread except for SwapBuffers.
>
> It can do these:
> - unsynchronized buffer mappings don't sync
> - ordinary buffer mappings are promoted to unsynchronized when it's safe
> - full buffer invalidations are implemented as reallocations and don't sync
> - partial buffer invalidations are implemented as copy_buffer and don't sync

interesting.. maybe I can drop some of the resource shadowing tricks I
added in freedreno to avoid mid frame texture uploads or UBO updates
from triggering a flush / tile-pass..

BR,
-R

> - get_query_result doesn't sync when the threaded context has seen flush()
>   (i.e. get_query_result is contextless in that case)
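
The buffer-mapping rules in the list above can be sketched as a small decision function. This is an illustration under stated assumptions, not the logic from the series: `toy_map`, `toy_storage`, `pending_uses` and the `TOY_MAP_*` flags are hypothetical names, and the partial-invalidation path (done via copy_buffer) is not shown.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct toy_storage {
    unsigned char *data;
    size_t size;
    int pending_uses;   /* queued calls that still reference this memory */
};

struct toy_buffer {
    struct toy_storage *storage;
};

/* Hypothetical flags, loosely modelled on buffer-mapping usage flags. */
enum toy_map_flags {
    TOY_MAP_WRITE          = 1 << 0,
    TOY_MAP_DISCARD_WHOLE  = 1 << 1,
    TOY_MAP_UNSYNCHRONIZED = 1 << 2,
};

struct toy_storage *toy_storage_create(size_t size)
{
    struct toy_storage *s = malloc(sizeof(*s));
    assert(s);
    s->data = calloc(1, size);
    assert(s->data);
    s->size = size;
    s->pending_uses = 0;
    return s;
}

void toy_storage_destroy(struct toy_storage *s)
{
    free(s->data);
    free(s);
}

/*
 * Returns a CPU pointer to the buffer contents.
 * *needs_sync: whether the caller would first have to wait for the
 *              worker thread to drain its queue.
 * *retired:    storage that was replaced; it must stay alive until its
 *              pending uses have executed, then be destroyed.
 */
unsigned char *toy_map(struct toy_buffer *buf, unsigned flags,
                       bool *needs_sync, struct toy_storage **retired)
{
    *needs_sync = false;
    *retired = NULL;

    if (buf->storage->pending_uses == 0) {
        /* No queued call touches the buffer: safe to treat the
         * mapping as unsynchronized. */
    } else if (flags & TOY_MAP_DISCARD_WHOLE) {
        /* Full invalidation: swap in fresh storage instead of waiting. */
        *retired = buf->storage;
        buf->storage = toy_storage_create((*retired)->size);
    } else if (!(flags & TOY_MAP_UNSYNCHRONIZED)) {
        /* Queued work may still use this memory: a sync is required. */
        *needs_sync = true;
    }
    return buf->storage->data;
}
```

Only the last branch forces a wait. The other cases return a pointer immediately, which is why they do not sync.
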
>
> Missing:
> - deferred fences - mainly Bioshock Infinite might benefit
> - texture mappings (meaning CPU access) always sync; texture_subdata
>   avoids syncing only for small uploads, but we can make all texture
>   uploads asynchronous by simply copying what is done for buffers
>
> Note that it has a very low overhead when it's always synchronous
> (i.e. not multithreaded), because it's really fast to enqueue and
> execute calls. The worst case scenario might be -3% performance (just
> guessing here).
>
> All requirements on Gallium drivers and other information can be found
> in the header file:
> https://cgit.freedesktop.org/~mareko/mesa/tree/src/gallium/auxiliary/util/u_threaded_context.h?h=gallium-threaded2#n26
>
> RadeonSI enables threaded Gallium by default for OpenGL Core and
> Compatibility profiles and all OpenGL ES variants.
>
> There is a small performance concern for RadeonSI: If non-contiguous
> VRAM mappings are not supported (amdgpu - kernel 4.11 and older,
> radeon - all kernels), the performance difference might be negative,
> because buffer invalidations are done unconditionally, meaning that
> there can be more live and mapped VRAM buffers. It's difficult to tell
> whether any real apps are affected in a measurable way.
>
> Here are performance numbers:
>
> APPS: MORE IS BETTER
> Alien Isolation: +16%
> Bioshock Infinite: +13%
> Borderlands 2: +12%
> Civilization 5: +12%
> Civilization 6: +10%
> CS:GO: +8%
> ET Legacy: +12%
> Openarena: +27%
> Talos Principle (high details, 1680x1050 internal resolution): +17%
> glmark2: no change in the final score
>
> When games are GPU-bound: no change
>
> Because deferred fences are not used yet, Bioshock runs
> asynchronously 80% of the time and synchronously 20% of the time.
> All other games run asynchronously 100% of the time.
>
> x11perf: MORE IS BETTER
> x11perf: Test: 500px PutImage Square: -3%
> x11perf: Test: Scrolling 500 x 500 px: +16%
> x11perf: Test: Char in 80-char aa line: +13%
> x11perf: Test: PutImage XY 500x500 Square: +1%
> x11perf: Test: Fill 300 x 300px AA Trapezoid: NO CHANGE
> x11perf: Test: 500px Copy From Window To Window: +14%
> x11perf: Test: Copy 500x500 From Pixmap To Pixmap: -1%
> x11perf: Test: 500px Compositing From Pixmap To Window: +21%
> x11perf: Test: 500px Compositing From Window To Window: +18%
>
> gtkperf: LESS IS BETTER
> gtkperf: GTK Widget: Total Time: -2%
> gtkperf: GTK Widget: GtkComboBox: +7%
> gtkperf: GTK Widget: GtkCheckButton: -15%
> gtkperf: GTK Widget: GtkRadioButton: -13%
> gtkperf: GTK Widget: GtkToggleButton: -2%
> gtkperf: GTK Widget: GtkComboBoxEntry: -1%
> gtkperf: GTK Widget: GtkTextView - Scroll: NO CHANGE
> gtkperf: GTK Widget: GtkTextView - Add Text: NO CHANGE
> gtkperf: GTK Widget: GtkDrawingArea - Circles: -9%
> gtkperf: GTK Widget: GtkDrawingArea - Pixbufs: -3%
>
> Hence the decision to enable it by default.
>
> Please review.
>
> Marek

