[Mesa-dev] [RFC] ACO: A New Compiler Backend for RADV
Daniel Schürmann
daniel at schuermann.dev
Wed Jul 3 17:23:22 UTC 2019
Hello everyone,
as some of you already know, for a little over a year I have been working on
an alternate compiler backend for the RADV driver. At the beginning, Bas Nieuwenhuizen
helped out a lot, and since last December Rhys Perry has also helped tremendously,
working full-time on ACO.
In this RFC, I'd like to share with you our motivation for this work as well as some
implementation details and the current state.
The current development branch of ACO with full commit history can be found at
https://github.com/daniel-schuermann/mesa/tree/backend
while a slightly more stable branch is maintained (until it is upstreamed) at
https://github.com/daniel-schuermann/mesa/tree/master/
For initial results, I'd like to refer to this post:
https://steamcommunity.com/games/221410/announcements/detail/1602634609636894200
Feel free to ask questions or just add your thoughts.
Motivation:
The RADV driver currently uses LLVM as its backend for shader compilation. There are some
shortcomings in LLVM's compilation of graphics shaders which need to be addressed.
The idea behind ACO is the expectation that it would be less work in the long term to
rewrite the backend than to fix LLVM.
Without going too much into detail here, the main shortcomings of LLVM are compile times and
the handling of control flow, which has led to some serious bugs in the past.
Additionally, we were able to implement a more aggressive divergence analysis and gain more
precise control over register pressure, which can ultimately lead to more efficient binaries.
A welcome side effect is an integrated development process without having to deal with LLVM's
release cycles.
Implementation:
What started as a proof of concept and an interesting experimental platform advanced quickly to
a full-featured backend capable of replacing LLVM in the RADV driver in the near future.
ACO is based on principles from recent compiler research and tries to avoid the issues
we are experiencing with LLVM. The IR is fully SSA-based, and register allocation is also done on
SSA, which allows us to precisely pre-calculate the register demand of a shader.
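To make the register-demand point concrete, here is a minimal sketch (not ACO's actual code) of how the demand of a single basic block can be computed from liveness alone. It assumes a simplified, hypothetical IR in which every instruction defines at most one SSA value and all values need exactly one register; the names Instr and register_demand are made up for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical, simplified SSA instruction (not ACO's real IR).
struct Instr {
    int def;                // SSA id defined here, or -1 if none
    std::vector<int> uses;  // SSA ids read here
};

// Walk one basic block backwards while tracking the set of live SSA
// values. The largest live set seen is the block's register demand,
// assuming every value occupies exactly one register.
std::size_t register_demand(const std::vector<Instr>& block,
                            const std::set<int>& live_out)
{
    std::set<int> live = live_out;
    std::size_t demand = live.size();
    for (auto it = block.rbegin(); it != block.rend(); ++it) {
        if (it->def >= 0) {
            // The result needs a register here even if it is never read.
            live.insert(it->def);
            demand = std::max(demand, live.size());
            live.erase(it->def);
        }
        for (int use : it->uses) {
            assert(use >= 0);
            live.insert(use);
        }
        demand = std::max(demand, live.size());
    }
    return demand;
}
```

For the block v0 = ..., v1 = ..., v2 = add v0, v1, v3 = mul v2, v0 with v3 live at the end, this returns 2. A real backend would additionally have to handle multiple register classes, values wider than one register and liveness across blocks.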
We implemented the notion of a logical and a linear (or physical) control flow graph, which let us
quickly and easily add horizontal reductions (thanks, Connor Abbott), a problem which took almost
two years and various attempts to solve in LLVM, with the result still being far slower than our solution.
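One way to picture the two control flow graphs is a block that carries two sets of successor edges. The sketch below is an assumption-laden illustration, not ACO's data structure: the names Block, Cfg, reachable and divergent_if_else are hypothetical, and the linear edges for a divergent if/else are simplified to "the wave runs the then-side, then the else-side".

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical block with two edge sets (illustration only).
struct Block {
    std::vector<std::size_t> logical_succs;  // control flow as seen by a single lane
    std::vector<std::size_t> linear_succs;   // control flow as executed by the whole wave
};

enum class Cfg { Logical, Linear };

// Depth-first search over one of the two graphs.
bool reachable(const std::vector<Block>& blocks, std::size_t from,
               std::size_t to, Cfg kind)
{
    assert(from < blocks.size() && to < blocks.size());
    std::vector<bool> seen(blocks.size(), false);
    std::vector<std::size_t> worklist{from};
    seen[from] = true;
    while (!worklist.empty()) {
        const std::size_t b = worklist.back();
        worklist.pop_back();
        if (b == to)
            return true;
        const std::vector<std::size_t>& succs =
            kind == Cfg::Logical ? blocks[b].logical_succs
                                 : blocks[b].linear_succs;
        for (std::size_t s : succs) {
            if (!seen[s]) {
                seen[s] = true;
                worklist.push_back(s);
            }
        }
    }
    return false;
}

// Simplified divergent if/else: 0 = entry, 1 = then, 2 = else, 3 = merge.
std::vector<Block> divergent_if_else()
{
    std::vector<Block> blocks(4);
    blocks[0].logical_succs = {1, 2};
    blocks[1].logical_succs = {3};
    blocks[2].logical_succs = {3};
    // Assumed linearization: then-side first, else-side afterwards.
    blocks[0].linear_succs = {1};
    blocks[1].linear_succs = {2};
    blocks[2].linear_succs = {3};
    return blocks;
}
```

In this toy graph the else block is not reachable from the then block logically, but it is reachable linearly. That difference is what matters for values shared by all lanes, such as the operands of a horizontal reduction, which have to stay intact while the other side of the branch executes.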
ACO is written with just-in-time compilation in mind and uses data structures which are quick to
iterate over. Avoiding pointer-based data structures like linked lists and the common def-use chains
leads to much faster compile times. ACO is written entirely in C++.
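As a rough illustration of the difference, the following sketch stores instructions and operands in flat arrays and refers to SSA values by index instead of by pointer. It is not ACO's actual IR; Instr, Program and count_uses are hypothetical names, and the layout is only one of several possible pointer-free designs.

```cpp
#include <cassert>
#include <cstdint>
#include <initializer_list>
#include <vector>

// Hypothetical flat IR: instruction i defines SSA value i, and its
// operands live in one shared array (illustration only).
struct Instr {
    std::uint32_t first_operand;  // index into Program::operands
    std::uint32_t num_operands;
};

struct Program {
    std::vector<Instr> instrs;
    std::vector<std::uint32_t> operands;  // SSA ids, stored contiguously

    // Append an instruction and return the id of the value it defines.
    std::uint32_t add(std::initializer_list<std::uint32_t> ops)
    {
        const std::uint32_t id = static_cast<std::uint32_t>(instrs.size());
        for (std::uint32_t op : ops)
            assert(op < id);  // uses may only refer to earlier definitions
        instrs.push_back({static_cast<std::uint32_t>(operands.size()),
                          static_cast<std::uint32_t>(ops.size())});
        operands.insert(operands.end(), ops.begin(), ops.end());
        return id;
    }
};

// Instead of maintaining def-use chains, recompute use counts on demand
// with a single linear sweep over contiguous memory.
std::vector<std::uint32_t> count_uses(const Program& p)
{
    std::vector<std::uint32_t> uses(p.instrs.size(), 0);
    for (const Instr& instr : p.instrs)
        for (std::uint32_t i = 0; i < instr.num_operands; ++i)
            ++uses[p.operands[instr.first_operand + i]];
    return uses;
}
```

The trade-off in a design like this is that queries such as "who uses this value" need a sweep instead of a pointer walk, in exchange for cheap construction and cache-friendly iteration.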
Current State:
Currently, ACO supports only FS and CS, only on VI+, and only 32-bit and some 64-bit operations.
It lacks VGPR spilling (we haven't needed it in any game tested so far) and has a theoretical
issue (when a divergent/uniform memory write is followed by a memory read of the other kind)
which needs a proper alias analysis to resolve.
Nevertheless, ACO is able to correctly compile the shaders of (almost?) all games, including
complex ones like Shadow of the Tomb Raider and Wolfenstein II.
We'd like to upstream ACO as an experimental driver option to ease development synchronization,
get more feedback, and ultimately give access to the performance enhancements we achieved.
To ease the upstreaming effort, we created MRs for all changes to the NIR infrastructure.
After these MRs have gone through the review process and landed, we are going to create a
single MR for ACO. Meanwhile, we will refactor the coding style and squash the commits.
Please review!
nir: lowering shared memory derefs with nir_lower_io_explicit():
https://gitlab.freedesktop.org/mesa/mesa/merge_requests/622
radv/radeonsi: Use NIR barycentric interpolation intrinsics:
https://gitlab.freedesktop.org/mesa/mesa/merge_requests/906
WIP: nir: add divergence analysis pass:
https://gitlab.freedesktop.org/mesa/mesa/merge_requests/918
WIP: nir: lower int64 in a single pass:
https://gitlab.freedesktop.org/mesa/mesa/merge_requests/1224
nir: A Couple of Comparison Optimizations:
https://gitlab.freedesktop.org/mesa/mesa/merge_requests/1228
radv: disable lower_sub:
https://gitlab.freedesktop.org/mesa/mesa/merge_requests/1236
nir: change nir_lower_io_to_vector() so that it can always vectorize FS outputs:
https://gitlab.freedesktop.org/mesa/mesa/merge_requests/1238
nir/lower_idiv: add new urcp path:
https://gitlab.freedesktop.org/mesa/mesa/merge_requests/1239
nir: add a memory load/store vectorization and combining pass:
https://gitlab.freedesktop.org/mesa/mesa/merge_requests/1240
nir: replace nir_move_load_const() with nir_opt_sink():
https://gitlab.freedesktop.org/mesa/mesa/merge_requests/1241
We welcome any testing feedback and bug reports at
https://github.com/daniel-schuermann/mesa/issues
Thanks,
Daniel Schürmann
Rhys Perry
Bas Nieuwenhuizen