[Mesa-dev] [RFC] ACO: A New Compiler Backend for RADV
daniel at schuermann.dev
Wed Jul 3 17:23:22 UTC 2019
As some of you already know, for a little over one year I have been
working on an alternate compiler backend for the RADV driver. At the
beginning, Bas helped out a lot, and since last December Rhys Perry has
also helped, working full-time on ACO.
In this RFC, I'd like to share with you our motivation for this work as
well as some implementation details and the current state.
The current development branch of ACO with full commit history can be
while a slightly more stable branch is (until upstream) maintained at
For initial results, I'd like to refer to this post:
Feel free to ask questions or just add your thoughts.
The RADV driver currently uses LLVM as its backend for shader
compilation. There are some shortcomings regarding LLVM's compilation of
graphics shaders which need to be addressed.
The motivation for ACO is the expectation that it would be less work in
the long term to rewrite the backend than to fix LLVM.
Without going too much into detail here, the main shortcomings of LLVM
are compile times and the handling of control flow, which has led to
some serious bugs in the
Additionally, we were able to implement a more aggressive divergence
analysis and to gain more precise control over register pressure, which
can ultimately lead to more
A welcome side-effect is an integrated development process without
having to deal with LLVM's
What started as a proof-of-concept and interesting experimental platform
advanced quickly to
a full-featured backend capable of replacing LLVM in the RADV driver in
the near future.
ACO is based on principles from recent compiler research and tries to
avoid the issues we are experiencing with LLVM. The IR is fully
SSA-based, and register allocation is also done on SSA, which allows us
to precisely pre-calculate the register demand of a shader.
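To illustrate why the register demand can be computed up front on SSA,
here is a minimal sketch for a single straight-line block: the demand is
simply the largest number of values live at any point. The names
(Instr, register_demand) are hypothetical, one register per temporary is
assumed, and this is not how ACO's actual code looks.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical, simplified SSA instruction (not ACO's real IR).
struct Instr {
    int def;                // id of the temporary defined here, or -1 for none
    std::vector<int> uses;  // ids of the temporaries read here
};

// Maximum number of simultaneously live temporaries in one straight-line
// block, assuming every temporary needs exactly one register.
int register_demand(const std::vector<Instr>& block, int num_temps,
                    const std::vector<int>& live_out)
{
    std::vector<bool> live(num_temps, false);
    int current = 0;
    for (int t : live_out) {
        assert(t >= 0 && t < num_temps);
        if (!live[t]) {
            live[t] = true;
            ++current;
        }
    }
    int demand = current;

    // Walk the block backwards, updating the live set at each instruction.
    for (auto it = block.rbegin(); it != block.rend(); ++it) {
        assert(it->def < num_temps);
        // Just after the instruction its result occupies a register,
        // even if nothing reads it later.
        if (it->def >= 0 && !live[it->def])
            demand = std::max(demand, current + 1);
        // Above its (single) definition the result does not exist yet.
        if (it->def >= 0 && live[it->def]) {
            live[it->def] = false;
            --current;
        }
        // Operands must be live just before the instruction.
        for (int u : it->uses) {
            assert(u >= 0 && u < num_temps);
            if (!live[u]) {
                live[u] = true;
                ++current;
            }
        }
        demand = std::max(demand, current);
    }
    return demand;
}
```

Because every temporary has exactly one definition, a single backward
walk is enough here; a real implementation would additionally have to
handle control flow, phis and temporaries of different sizes.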
We implemented the notion of a logical and a linear (or physical)
control flow graph, which let us quickly and easily add horizontal
reductions (thx Connor Abbott) - a problem which took almost two years
and various attempts to solve in LLVM, and the LLVM solution is still
far slower than ours.
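As a rough illustration of the two control flow graphs, the sketch below
models a divergent if/else. In the logical CFG the branch block has two
successors which meet again at the merge block. In the linear CFG the
wave runs the "then" side and afterwards the "else" side, each with a
different set of active lanes. All names (Block, EdgeKind,
make_divergent_if, has_edge) are made up for this example, and the
branches that skip a side when no lane is active are left out for
brevity.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

enum class EdgeKind { Logical, Linear };

// Hypothetical block: successors are stored as indices, once per CFG.
struct Block {
    std::vector<uint32_t> logical_succs;  // what the shader source expresses
    std::vector<uint32_t> linear_succs;   // what the wave actually executes
};

// Blocks: 0 = branch, 1 = then, 2 = else, 3 = merge.
std::vector<Block> make_divergent_if()
{
    std::vector<Block> cfg(4);
    // Logical CFG: a classic diamond.
    cfg[0].logical_succs = {1, 2};
    cfg[1].logical_succs = {3};
    cfg[2].logical_succs = {3};
    // Linear CFG: both sides are executed one after the other.
    cfg[0].linear_succs = {1};
    cfg[1].linear_succs = {2};
    cfg[2].linear_succs = {3};
    return cfg;
}

// Returns true if the chosen CFG contains the edge from -> to.
bool has_edge(const std::vector<Block>& cfg, uint32_t from, uint32_t to,
              EdgeKind kind)
{
    assert(from < cfg.size());
    const std::vector<uint32_t>& succs = kind == EdgeKind::Logical
                                             ? cfg[from].logical_succs
                                             : cfg[from].linear_succs;
    return std::find(succs.begin(), succs.end(), to) != succs.end();
}
```

The edge from "then" to "else" exists only in the linear CFG. One way to
read the distinction is that per-lane values follow the logical edges,
while anything shared by the whole wave has to stay valid along the
linear ones.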
ACO is written with just-in-time compilation in mind and uses data
structures which are quick to
iterate. Avoiding pointer-based data structures like linked lists and
the common def-use chains
leads to much faster compile times. ACO is fully written in C++.
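The following sketch shows the general idea of such index-based data
structures. Instructions sit contiguously in a std::vector, operands
refer to temporaries by a 32-bit id, and use counts are recomputed with
one linear pass instead of being kept in def-use chains. FlatInstr and
count_uses are hypothetical names picked for this example and do not
describe ACO's real IR.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical flat instruction: no pointers, just ids and a small
// fixed-size operand array.
struct FlatInstr {
    uint32_t def;                      // id of the defined temporary
    std::array<uint32_t, 3> operands;  // ids of the temporaries read
    uint8_t num_operands;              // number of valid entries in operands
};

// Counts how often each temporary is read, in a single pass over the
// contiguous instruction array.
std::vector<uint32_t> count_uses(const std::vector<FlatInstr>& instrs,
                                 uint32_t num_temps)
{
    std::vector<uint32_t> uses(num_temps, 0);
    for (const FlatInstr& instr : instrs) {
        assert(instr.num_operands <= instr.operands.size());
        for (uint8_t i = 0; i < instr.num_operands; ++i) {
            assert(instr.operands[i] < num_temps);
            ++uses[instr.operands[i]];
        }
    }
    return uses;
}
```

A pass that needs use counts (dead code elimination, for example) can
call count_uses() when it starts and throw the result away afterwards,
so nothing has to be kept up to date while instructions are rewritten.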
Currently, ACO supports only FS and CS, only on VI+, and only 32-bit and
some 64-bit operations.
It lacks VGPR spilling (we didn't need it in any tested game so far) and
has a theoretical issue (in case a divergent/uniform memory write is
followed by a memory read of the other kind) which needs a proper alias
analysis to resolve.
Nevertheless, ACO is able to correctly compile the shaders of (almost?)
all games including
complex ones like Shadow of the Tomb Raider and Wolfenstein II.
We'd like to upstream ACO as an experimental driver option to ease
get more feedback, but ultimately also to give access to the performance
enhancements we achieved.
To ease the upstreaming efforts, we created MRs for all changes to the
After these MRs have gone through the review process and landed, we are
going to create a single MR for ACO. Meanwhile, we will refactor the
coding style and squash the commits.
nir: lowering shared memory derefs with nir_lower_io_explicit():
radv/radeonsi: Use NIR barycentric interpolation intrinsics:
WIP: nir: add divergence analysis pass:
WIP: nir: lower int64 in a single pass:
nir: A Couple of Comparison Optimizations:
radv: disable lower_sub:
nir: change nir_lower_io_to_vector() so that it can always vectorize FS
nir/lower_idiv: add new urcp path:
nir: add a memory load/store vectorization and combining pass:
nir: replace nir_move_load_const() with nir_opt_sink():
We welcome any testing feedback and bug reports at