[Mesa-dev] [PATCH 0/8] Gallium & RadeonSI optimization for Ryzen CPUs

Marek Olšák maraeo at gmail.com
Thu Sep 6 04:02:21 UTC 2018


Hi,

When the Ryzen CPUs were launched, they didn't perform very well in
games, and it took a while before games were patched. Guess what,
Mesa drivers have suffered from the same inefficiencies until now.

The AMD Zen architecture has multiple core complexes (CCX) where each
CCX has e.g. 4C/8T and always one L3 cache. If application and driver
threads don't run on the same CCX, communication between threads is
slow, because multiple L3 caches must maintain coherency between them.
Atomic operations seem to suffer the most, almost as if they were
uncached. (are they?)

This series pins the application thread and all driver execution
threads to 1 L3 cache (1 CCX). If the application thread is already
pinned to a hw thread or core(s), all driver threads are pinned to
the same L3 cache (CCX) as the application thread.
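
For illustration only, the pinning idea could be sketched roughly as
below. This is not the code from the series. The helper names
l3_mask_for_cpu and pin_thread_to_l3 are hypothetical, and the sketch
assumes that logical CPUs sharing an L3 are numbered contiguously,
which is not guaranteed. On Linux, the real sharing set can be read
from /sys/devices/system/cpu/cpuN/cache/index3/shared_cpu_list.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* for cpu_set_t macros and pthread_setaffinity_np */
#endif
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdbool.h>

/* Hypothetical helper: fill `mask` with the logical CPUs assumed to
 * share an L3 cache with `cpu`.
 *
 * ASSUMPTION: CPUs are numbered so that each group of `threads_per_l3`
 * consecutive logical CPUs shares one L3 (e.g. 8 for a 4C/8T CCX).
 */
static void
l3_mask_for_cpu(unsigned cpu, unsigned threads_per_l3, cpu_set_t *mask)
{
   assert(threads_per_l3 > 0);

   /* Round down to the first logical CPU of the group. */
   unsigned first = (cpu / threads_per_l3) * threads_per_l3;

   CPU_ZERO(mask);
   for (unsigned i = 0; i < threads_per_l3; i++)
      CPU_SET(first + i, mask);
}

/* Hypothetical helper: restrict `thread` to the L3 group containing
 * `cpu`. Returns true on success.
 */
static bool
pin_thread_to_l3(pthread_t thread, unsigned cpu, unsigned threads_per_l3)
{
   cpu_set_t mask;

   l3_mask_for_cpu(cpu, threads_per_l3, &mask);
   return pthread_setaffinity_np(thread, sizeof(mask), &mask) == 0;
}
```

A caller might look up the CPU the application thread is currently on
with sched_getcpu() and then call pin_thread_to_l3() for the
application thread and each driver thread with that same CPU number,
so that they all end up in one group.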

Shader compiler threads are unpinned, as they are not critical.

The piglit/drawoverhead microbenchmark shows that this increases
performance by 32% for DrawElements and 25% for DrawArrays on Ryzen
1st-Gen CPUs. It will probably be much less with real apps.

Please review.

Thanks,
Marek

