[Mesa-dev] [PATCH 00/15] radv: Support for VK_AMD_shader_ballot
Connor Abbott
connora at valvesoftware.com
Tue Aug 8 01:32:26 UTC 2017
From: Connor Abbott <cwabbott0 at gmail.com>
This series implements VK_AMD_shader_ballot for radv. This extension
builds on VK_EXT_shader_subgroup_ballot and VK_EXT_shader_subgroup_vote
by adding a number of reductions across a subgroup (or wavefront in AMD
terminology). Previously, shaders had to use shared memory to compute,
say, the average across all threads in a workgroup, or the minimum and
maximum values across a workgroup. But that requires a lot of accesses
to LDS memory, which is (relatively) slow. This extension allows the
shader to do part of the reduction directly in registers, as long as it
stays within a single wavefront, reducing the amount of traffic to the
LDS that has to happen. It also adds a few AMD-specific instructions,
like mbcnt. To get an idea of what exactly is in the extension, and what
inclusive scan, exclusive scan, etc. mean, you can look at the GL
extension which exposes mostly the same things [1].
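For anyone who doesn't want to read the GL spec, here is a rough CPU-side sketch in C of what reduce, inclusive scan, exclusive scan and an mbcnt-style count compute across one wavefront. This is only an illustration, not code from the series: the function names are made up, it assumes a 64-lane wavefront with every lane active, and it only shows integer add.

```c
#include <assert.h>
#include <stdint.h>

#define WAVE_SIZE 64 /* assumed: lanes per wavefront, all active */

/* Reduction: every lane would receive the sum over the whole wavefront. */
static uint32_t wave_reduce_add(const uint32_t in[WAVE_SIZE])
{
    uint32_t sum = 0;
    for (int lane = 0; lane < WAVE_SIZE; lane++)
        sum += in[lane];
    return sum;
}

/* Inclusive scan: lane i receives in[0] + ... + in[i]. */
static void wave_inclusive_scan_add(const uint32_t in[WAVE_SIZE],
                                    uint32_t out[WAVE_SIZE])
{
    uint32_t running = 0;
    for (int lane = 0; lane < WAVE_SIZE; lane++) {
        running += in[lane];
        out[lane] = running;
    }
}

/* Exclusive scan: lane i receives in[0] + ... + in[i - 1];
 * lane 0 receives the identity (0 for add). */
static void wave_exclusive_scan_add(const uint32_t in[WAVE_SIZE],
                                    uint32_t out[WAVE_SIZE])
{
    uint32_t running = 0;
    for (int lane = 0; lane < WAVE_SIZE; lane++) {
        out[lane] = running;
        running += in[lane];
    }
}

/* mbcnt-style count (hypothetical helper): number of bits set in a
 * 64-bit lane mask that belong to lanes below the given lane. */
static uint32_t mbcnt_sketch(uint64_t mask, int lane)
{
    assert(lane >= 0 && lane < WAVE_SIZE);
    uint32_t count = 0;
    for (int i = 0; i < lane; i++)
        count += (uint32_t)((mask >> i) & 1u);
    return count;
}
```

With lane i holding i + 1, the reduction gives 2080 in every lane, and lane 3 sees 10 from the inclusive scan and 6 from the exclusive scan. A workgroup-wide result would still need one LDS step to combine the per-wavefront values.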
Why should you care? It turns out that with this extension enabled, plus
a few other AMD-specific extensions that are mostly trivial, DOOM will
take a different path that uses shaders that were tuned specifically for
AMD hardware. I haven't actually tested DOOM yet, since a few more
things need to be wired up, but it's a lot less work than this extension
and I'm sure Dave or Bas will do it for me when they get around to it
:).
It uses a few new features of the AMDGPU LLVM backend that I just
landed, as well as one more small change that still needs review:
https://reviews.llvm.org/D34718, so it's going to require LLVM 6.0. It
also uses the DPP modifier that was only added on VI since that was
easier than using ds_swizzle (which is available on all GCN cards). It
should be possible to implement support for older cards using
ds_swizzle, but I haven't gotten to it yet. A note to those reviewing:
it might be helpful to look at the LLVM changes that this series uses,
in particular:
https://reviews.llvm.org/rL310087
https://reviews.llvm.org/rL310088
https://reviews.llvm.org/D34718
in order to get the complete picture.
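As background on why cross-lane shifts such as DPP or ds_swizzle matter here, the following C sketch shows the general idea of building an inclusive scan out of "read the value from the lane N below me" steps, doubling N each time. It is a generic log-step scan simulated on the CPU, not what the patches or the LLVM backend actually emit; the function name is hypothetical, and it ignores the restrictions real hardware places on which lanes can exchange data in a single instruction.

```c
#include <stdint.h>

#define WAVE_SIZE 64 /* assumed: lanes per wavefront, all active */

/* In-place inclusive add-scan using log2(WAVE_SIZE) shift-and-add steps.
 * Each step models every lane reading from the lane `offset` below it;
 * lanes with no such neighbour read the identity (0). */
static void wave_scan_add_by_shifts(uint32_t v[WAVE_SIZE])
{
    for (int offset = 1; offset < WAVE_SIZE; offset <<= 1) {
        uint32_t shifted[WAVE_SIZE];

        /* Cross-lane shift: all lanes read before any lane writes. */
        for (int lane = 0; lane < WAVE_SIZE; lane++)
            shifted[lane] = lane >= offset ? v[lane - offset] : 0;

        /* Per-lane ALU op, entirely in "registers". */
        for (int lane = 0; lane < WAVE_SIZE; lane++)
            v[lane] += shifted[lane];
    }
}
```

For 64 lanes this takes six shift-and-add steps and never touches shared memory, which is the property the extension is after.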
This series depends on my previous series [2] to implement
VK_EXT_shader_subgroup_vote and VK_EXT_shader_subgroup_ballot, if
nothing else in order to be able to test the implementation. I think
DOOM also uses the latter two extensions. I've also based this on my
series adding cross-thread semantics to NIR [3], which Jason needs to
review, since I was hoping that would land first. With a little effort
it should be possible to land this series first instead (it would
require changing PATCH 01 a little). The whole thing is available at:
git://people.freedesktop.org/~cwabbott0/mesa radv-amd-shader-ballot
and the LLVM branch that I've been using to test, with the one patch
added is at:
https://github.com/cwabbott0/llvm.git dpp-intrinsics-v4
I've got some Crucible tests for exercising the various different parts
of the implementation, although I didn't bother to test all the possible
combinations of reductions, since they didn't really require any special
code to implement anyways. I'll try and get that cleaned up and sent out
soon. Maybe I should just push the tests?
Finally, I'm leaving Valve soon (this week) to go back to school, and I
suspect that I won't have too much time to work on this afterwards, so
someone else will probably have to pick it up. I've been working on this
for most of the summer, since it turned out to be a way more complicated
beast to implement than I thought. It's required changes across the
entire stack, from spirv-to-nir all the way down to register allocation
in the LLVM backend. Thankfully, though, most of the tricky LLVM
changes have landed (thanks Nicolai for reviewing!) and what's left is a
lot more straightforward. I should still be around to answer questions,
though. Whew!
[1] https://www.khronos.org/registry/OpenGL/extensions/AMD/AMD_shader_ballot.txt
[2] https://lists.freedesktop.org/archives/mesa-dev/2017-August/164903.html
[3] https://lists.freedesktop.org/archives/mesa-dev/2017-August/164898.html
Connor Abbott (15):
nir: define intrinsics needed for AMD_shader_ballot
spirv: import AMD extensions header
spirv: add plumbing for SPV_AMD_shader_ballot and Groups
nir: rename and generalize nir_lower_read_invocation_to_scalar
nir: scalarize AMD_shader_ballot intrinsics
radv: call nir_lower_cross_thread_to_scalar()
nir: add a lowering pass for some cross-workgroup intrinsics
radv: use nir_lower_group_reduce()
ac: move ac_to_integer() and ac_to_float() to ac_llvm_build.c
ac: remove bitcast_to_float()
ac: fix ac_get_type_size() for doubles
ac: add support for SPV_AMD_shader_ballot
ac/nir: add support for SPV_AMD_shader_ballot
radv: enable VK_AMD_shader_ballot
ac/nir: fix saturate emission
src/amd/common/ac_llvm_build.c | 783 ++++++++++++++++++++-
src/amd/common/ac_llvm_build.h | 120 ++++
src/amd/common/ac_nir_to_llvm.c | 300 ++++----
src/amd/vulkan/radv_device.c | 15 +
src/amd/vulkan/radv_pipeline.c | 6 +
src/compiler/Makefile.sources | 4 +-
src/compiler/nir/nir.h | 11 +-
src/compiler/nir/nir_intrinsics.h | 124 +++-
...scalar.c => nir_lower_cross_thread_to_scalar.c} | 63 +-
src/compiler/nir/nir_lower_group_reduce.c | 179 +++++
src/compiler/nir/nir_print.c | 1 +
src/compiler/spirv/GLSL.ext.AMD.h | 93 +++
src/compiler/spirv/nir_spirv.h | 2 +
src/compiler/spirv/spirv_to_nir.c | 32 +-
src/compiler/spirv/vtn_amd.c | 281 ++++++++
src/compiler/spirv/vtn_private.h | 9 +
src/intel/compiler/brw_nir.c | 2 +-
17 files changed, 1846 insertions(+), 179 deletions(-)
rename src/compiler/nir/{nir_lower_read_invocation_to_scalar.c => nir_lower_cross_thread_to_scalar.c} (56%)
create mode 100644 src/compiler/nir/nir_lower_group_reduce.c
create mode 100644 src/compiler/spirv/GLSL.ext.AMD.h
create mode 100644 src/compiler/spirv/vtn_amd.c
--
2.9.4