[Mesa-dev] [RFC] ACO: A New Compiler Backend for RADV
jason at jlekstrand.net
Wed Jul 3 17:37:17 UTC 2019
Congratulations on this monumental achievement! Bringing up a whole
back-end compiler is a huge amount of work and doing it in a year with only
a couple of people is pretty impressive. It's great to see this work
finally see the light of day and I look forward to seeing how it progresses.
On Wed, Jul 3, 2019 at 12:23 PM Daniel Schürmann <daniel at schuermann.dev>
> Hello everyone,
> as some of you already know, for a little over one year I have been
> working on
> an alternate compiler backend for the RADV driver. At the beginning, Bas
> helped out a lot, and since last December Rhys Perry has also helped,
> working full-time on ACO.
> In this RFC, I'd like to share with you our motivation for this work as
> well as some
> implementation details and the current state.
> The current development branch of ACO with full commit history can be
> found at
> while a slightly more stable branch is (until upstream) maintained at
> For initial results, I'd like to refer to this post:
> Feel free to ask questions or just add your thoughts.
> The RADV driver currently uses LLVM as backend for shader compilation.
> There are some
> shortcomings regarding LLVM's compilation of graphics shaders which need
> to be addressed.
> The idea and motivation of ACO is the expectation that it would be less
> work long-term to
> re-write the backend than to fix LLVM.
> Without going too much into detail here, the main shortcomings of LLVM
> are compile times and
> the handling of control flow, which has led to some serious bugs in the
> Additionally, we were able to implement a more aggressive divergence
> analysis and gain more
> precise control over register pressure, which can ultimately lead to more
> efficient binaries.
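To make "divergence analysis" concrete, here is a minimal sketch on a toy straight-line SSA program. Everything in it (DivInstr, analyze_divergence, the idea of a "divergent source") is a hypothetical illustration and not ACO's or NIR's actual pass; in particular it ignores divergence introduced by control flow, which a real analysis has to track. A value is treated as divergent if it reads something per-lane or if any of its operands is divergent.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical toy IR: instruction i defines SSA value i.
struct DivInstr {
    bool divergent_source = false;   // e.g. reads a per-lane id (assumption)
    std::vector<uint32_t> operands;  // indices of earlier instructions
};

// One forward pass is enough here because every SSA definition
// precedes its uses in straight-line code.
std::vector<bool> analyze_divergence(const std::vector<DivInstr>& prog) {
    std::vector<bool> divergent(prog.size(), false);
    for (std::size_t i = 0; i < prog.size(); ++i) {
        bool d = prog[i].divergent_source;
        for (uint32_t op : prog[i].operands) {
            assert(op < i);            // operands must be defined earlier
            d = d || divergent[op];    // divergence propagates to users
        }
        divergent[i] = d;
    }
    return divergent;
}
```

Values that come out as uniform are the ones a backend could keep in scalar registers; the more precise the analysis, the fewer values end up in vector registers unnecessarily.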
> A welcome side-effect is an integrated development process without
> having to deal with LLVM's
> release cycles.
> What started as a proof-of-concept and interesting experimental platform
> advanced quickly to
> a full-featured backend capable of replacing LLVM in the RADV driver in
> the near future.
> ACO is based on principles from recent compiler research results and
> tries to avoid the issues
> we are experiencing with LLVM. The IR is fully SSA-based and also does
> register allocation on
> SSA, which allows us to precisely pre-calculate the register demand of a
> shader.
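As an illustration of why SSA makes register demand easy to compute up front, here is a minimal sketch for a single straight-line block. The names (SsaInstr, register_demand) are hypothetical and the model is simplified: one register per value, no register classes, no control flow. It walks the block backwards, keeps the set of live values, and reports the largest live set it sees.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical toy IR: instruction i defines SSA value i.
struct SsaInstr {
    std::vector<uint32_t> operands;  // indices of earlier instructions
};

// Maximum number of simultaneously live values in the block.
std::size_t register_demand(const std::vector<SsaInstr>& block,
                            const std::vector<uint32_t>& live_out) {
    std::vector<bool> live(block.size(), false);
    std::size_t count = 0;
    auto set_live = [&](uint32_t v) {
        assert(v < block.size());
        if (!live[v]) { live[v] = true; ++count; }
    };
    for (uint32_t v : live_out) set_live(v);

    std::size_t demand = count;
    for (std::size_t i = block.size(); i-- > 0;) {
        // The result occupies a register right after the instruction,
        // even if nothing ever reads it.
        set_live(static_cast<uint32_t>(i));
        demand = std::max(demand, count);
        // A value is not live before its single definition.
        live[i] = false;
        --count;
        for (uint32_t op : block[i].operands) {
            assert(op < i);
            set_live(op);
        }
        demand = std::max(demand, count);
    }
    return demand;
}
```

For a block v0, v1, v2 = f(v0, v1), v3 = g(v2, v0) with v3 live out, this returns 2. Because each value has exactly one definition, the number is known before any register is assigned.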
> We implemented the notion of a logical and linear (or physical) control
> flow graph which let us
> quickly and easily add horizontal reductions (thx Connor Abbott) - a
> problem which took almost
> two years and various attempts to solve in LLVM, still being far slower
> than our solution.
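A rough way to picture the two control flow graphs is a block that carries two independent edge lists. The sketch below is a hypothetical illustration with invented names (Block, Cfg, predecessors, divergent_if_else) and a deliberately simplified layout; it is not ACO's actual data structure. For a divergent if/else, the logical graph is the usual diamond, while the linear graph models the wave walking through both sides one after the other.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

struct Block {
    std::vector<uint32_t> logical_succs;  // per-lane view of control flow
    std::vector<uint32_t> linear_succs;   // whole-wave view of control flow
};

enum class Cfg { Logical, Linear };

// Build predecessor lists for whichever graph is requested.
std::vector<std::vector<uint32_t>> predecessors(const std::vector<Block>& blocks,
                                                Cfg kind) {
    std::vector<std::vector<uint32_t>> preds(blocks.size());
    for (uint32_t b = 0; b < blocks.size(); ++b) {
        const std::vector<uint32_t>& succs =
            kind == Cfg::Logical ? blocks[b].logical_succs : blocks[b].linear_succs;
        for (uint32_t s : succs) {
            assert(s < blocks.size());
            preds[s].push_back(b);
        }
    }
    return preds;
}

// Simplified divergent if/else: 0 = entry, 1 = then, 2 = else, 3 = merge.
std::vector<Block> divergent_if_else() {
    return {
        {{1, 2}, {1}},  // entry: lanes go either way, the wave enters "then"
        {{3}, {2}},     // then: lanes go to merge, the wave continues into "else"
        {{3}, {3}},     // else: lanes and wave both reach merge
        {{}, {}},       // merge
    };
}
```

In this model the merge block has two logical predecessors but a single linear one, and the else block's linear predecessor is the then block. Per-lane values would follow the logical edges for liveness, while wave-wide values would follow the linear ones.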
> ACO is written with just-in-time compilation in mind and uses data
> structures which are quick to
> iterate over. Avoiding pointer-based data structures like linked lists and
> the common def-use chains
> leads to much faster compile times. ACO is fully written in C++.
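One way to read "quick to iterate" is sketched below, under my own assumptions: instructions sit in a contiguous std::vector, values are referred to by small integer ids instead of pointers, and use information is recomputed by a linear sweep when a pass needs it rather than being maintained as def-use chains. Inst and count_uses are hypothetical names for illustration only.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical instruction record: values are integer ids, not pointers.
struct Inst {
    uint32_t def = 0;                // id of the value this instruction defines
    std::vector<uint32_t> operands;  // ids of the values it reads
};

// Recompute use counts with one pass over contiguous storage.
std::vector<uint32_t> count_uses(const std::vector<Inst>& insts,
                                 uint32_t num_values) {
    std::vector<uint32_t> uses(num_values, 0);
    for (const Inst& inst : insts) {
        for (uint32_t v : inst.operands) {
            assert(v < num_values);
            ++uses[v];
        }
    }
    return uses;
}
```

The trade-off in this style is that a pass pays for a sweep whenever it needs use information, but it never has to keep linked structures consistent while instructions are added or removed.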
> Current State:
> Currently, ACO only supports FS and CS, only on VI+ and only on 32-bit
> and some 64-bit operations.
> It misses VGPR spilling (we didn't need it on any tested game so far)
> and has a theoretical
> issue (in case a divergent/uniform memory write is followed by a memory
> read of the other kind)
> which needs a proper alias analysis to resolve.
> Nevertheless, ACO is able to correctly compile the shaders of (almost?)
> all games including
> complex ones like Shadow of the Tomb Raider and Wolfenstein II.
> We'd like to upstream ACO as an experimental driver option to ease
> development synchronization,
> get more feedback, but ultimately also to give access to the performance
> enhancements we achieved.
> To ease the upstreaming efforts, we created MRs for all changes to the
> NIR infrastructure.
> After these MRs have gone through the reviewing process and landed, we
> are going to create a
> single MR for ACO. Meanwhile, we will refactor the coding style and
> squash the commits.
> Please review!
> nir: lowering shared memory derefs with nir_lower_io_explicit():
> radv/radeonsi: Use NIR barycentric interpolation intrinsics:
> WIP: nir: add divergence analysis pass:
> WIP: nir: lower int64 in a single pass:
> nir: A Couple of Comparison Optimizations:
> radv: disable lower_sub:
> nir: change nir_lower_io_to_vector() so that it can always vectorize FS
> nir/lower_idiv: add new urcp path:
> nir: add a memory load/store vectorization and combining pass:
> nir: replace nir_move_load_const() with nir_opt_sink():
> We welcome any testing feedback and bug reports at
> Daniel Schürmann
> Rhys Perry
> Bas Nieuwenhuizen