[Mesa-dev] [PATCH 00/15] radv: Support for VK_AMD_shader_ballot

Connor Abbott connora at valvesoftware.com
Tue Aug 8 01:32:26 UTC 2017


From: Connor Abbott <cwabbott0 at gmail.com>

This series implements VK_AMD_shader_ballot for radv. This extension
builds on VK_EXT_shader_subgroup_ballot and VK_EXT_shader_subgroup_vote
by adding a number of reductions across a subgroup (or wavefront in AMD
terminology). Previously, shaders had to use shared memory to compute,
say, the average across all threads in a workgroup, or the minimum and
maximum values across a workgroup. But that requires a lot of accesses
to LDS memory, which is (relatively) slow. This extension allows the
shader to do part of the reduction directly in registers, as long as it
stays within a single wavefront, reducing the amount of traffic to the
LDS that has to happen. It also adds a few AMD-specific instructions,
like mbcnt. To get an idea of what exactly is in the extension, and what
inclusive scan, exclusive scan, etc. mean, you can look at the GL
extension which exposes mostly the same things [1].
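
As a rough illustration of those terms (not code from this series), here
is a small C sketch that simulates a single 64-lane wavefront on the CPU.
The function names and the exec-mask parameter are hypothetical, chosen
only for this example, and the mbcnt behaviour shown (count the set bits
of a mask below the calling lane) is an assumption based on the GL
extension [1]:

```c
#include <assert.h>
#include <stdint.h>

#define WAVE_SIZE 64 /* lanes per wavefront on GCN */

/* Reduction: the sum over all active lanes; on real hardware every
 * active lane would receive this same value. */
uint32_t wave_reduce_add(const uint32_t v[WAVE_SIZE], uint64_t exec)
{
    uint32_t sum = 0;
    for (unsigned lane = 0; lane < WAVE_SIZE; lane++)
        if (exec & (UINT64_C(1) << lane))
            sum += v[lane];
    return sum;
}

/* Inclusive scan: out[i] = sum of v[j] over active lanes j <= i.
 * Inactive lanes are left untouched. */
void wave_inclusive_scan_add(const uint32_t v[WAVE_SIZE], uint64_t exec,
                             uint32_t out[WAVE_SIZE])
{
    uint32_t running = 0;
    for (unsigned lane = 0; lane < WAVE_SIZE; lane++) {
        if (!(exec & (UINT64_C(1) << lane)))
            continue;
        running += v[lane];
        out[lane] = running;
    }
}

/* Exclusive scan: out[i] = sum of v[j] over active lanes j < i, so the
 * first active lane gets the identity (0 for addition). */
void wave_exclusive_scan_add(const uint32_t v[WAVE_SIZE], uint64_t exec,
                             uint32_t out[WAVE_SIZE])
{
    uint32_t running = 0;
    for (unsigned lane = 0; lane < WAVE_SIZE; lane++) {
        if (!(exec & (UINT64_C(1) << lane)))
            continue;
        out[lane] = running;
        running += v[lane];
    }
}

/* mbcnt-style helper (assumed semantics): number of bits set in mask
 * at positions strictly below the given lane index. */
unsigned mbcnt(uint64_t mask, unsigned lane)
{
    assert(lane < WAVE_SIZE);
    uint64_t below = mask & ((UINT64_C(1) << lane) - 1);
    unsigned count = 0;
    while (below) {
        below &= below - 1; /* clear the lowest set bit */
        count++;
    }
    return count;
}
```

With all 64 lanes active and v[i] = i + 1, the reduction gives 2080, the
inclusive scan gives 10 in lane 3, and the exclusive scan gives 6 in
lane 3. On the GPU none of this is a loop, of course; the point of the
extension is that the lanes exchange values in registers instead.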

Why should you care? It turns out that with this extension enabled, plus
a few other AMD-specific extensions that are mostly trivial, DOOM will
take a different path that uses shaders that were tuned specifically for
AMD hardware. I haven't actually tested DOOM yet, since a few more
things need to be wired up, but it's a lot less work than this extension
and I'm sure Dave or Bas will do it for me when they get around to it
:).

This series uses a few new features of the AMDGPU LLVM backend that I
just landed, as well as one more small change that still needs review:
https://reviews.llvm.org/D34718, so it's going to require LLVM 6.0. It
also uses the DPP modifier, which was only added on VI, since that was
easier than using ds_swizzle (which is available on all GCN cards). It
should be possible to implement support for older cards using
ds_swizzle, but I haven't gotten to it yet. A note to those reviewing:
it might be helpful to look at the LLVM changes that this series uses,
in particular:

https://reviews.llvm.org/rL310087
https://reviews.llvm.org/rL310088
https://reviews.llvm.org/D34718

in order to get the complete picture.

This series depends on my previous series [2] to implement
VK_EXT_shader_subgroup_vote and VK_EXT_shader_subgroup_ballot, if
nothing else in order to be able to test the implementation. I think
DOOM also uses the latter two extensions. I've also based this on my
series adding cross-thread semantics to NIR [3], which Jason needs to
review, since I was hoping that would land first. With a little effort
it should be possible to land this series first, though (it would
require changing PATCH 01 a little). The whole thing is available at:

git://people.freedesktop.org/~cwabbott0/mesa radv-amd-shader-ballot

and the LLVM branch that I've been using to test, with the one
outstanding patch added, is at:

https://github.com/cwabbott0/llvm.git dpp-intrinsics-v4

I've got some Crucible tests for exercising the various different parts
of the implementation, although I didn't bother to test all the possible
combinations of reductions, since they didn't really require any special
code to implement anyways. I'll try to get the tests cleaned up and sent
out soon. Maybe I should just push them?

Finally, I'm leaving Valve soon (this week) to go back to school, and I
suspect that I won't have too much time to work on this afterwards, so
someone else will probably have to pick it up. I've been working on this
for most of the summer, since it turned out to be a way more complicated
beast to implement than I thought. It's required changes across the
entire stack, from spirv-to-nir all the way down to register allocation
in the LLVM backend. Thankfully, though, most of the tricky LLVM
changes have landed (thanks Nicolai for reviewing!) and what's left is a
lot more straightforward. I should still be around to answer questions,
though. Whew!

[1] https://www.khronos.org/registry/OpenGL/extensions/AMD/AMD_shader_ballot.txt
[2] https://lists.freedesktop.org/archives/mesa-dev/2017-August/164903.html
[3] https://lists.freedesktop.org/archives/mesa-dev/2017-August/164898.html

Connor Abbott (15):
  nir: define intrinsics needed for AMD_shader_ballot
  spirv: import AMD extensions header
  spirv: add plumbing for SPV_AMD_shader_ballot and Groups
  nir: rename and generalize nir_lower_read_invocation_to_scalar
  nir: scalarize AMD_shader_ballot intrinsics
  radv: call nir_lower_cross_thread_to_scalar()
  nir: add a lowering pass for some cross-workgroup intrinsics
  radv: use nir_lower_group_reduce()
  ac: move ac_to_integer() and ac_to_float() to ac_llvm_build.c
  ac: remove bitcast_to_float()
  ac: fix ac_get_type_size() for doubles
  ac: add support for SPV_AMD_shader_ballot
  ac/nir: add support for SPV_AMD_shader_ballot
  radv: enable VK_AMD_shader_ballot
  ac/nir: fix saturate emission

 src/amd/common/ac_llvm_build.c                     | 783 ++++++++++++++++++++-
 src/amd/common/ac_llvm_build.h                     | 120 ++++
 src/amd/common/ac_nir_to_llvm.c                    | 300 ++++----
 src/amd/vulkan/radv_device.c                       |  15 +
 src/amd/vulkan/radv_pipeline.c                     |   6 +
 src/compiler/Makefile.sources                      |   4 +-
 src/compiler/nir/nir.h                             |  11 +-
 src/compiler/nir/nir_intrinsics.h                  | 124 +++-
 ...scalar.c => nir_lower_cross_thread_to_scalar.c} |  63 +-
 src/compiler/nir/nir_lower_group_reduce.c          | 179 +++++
 src/compiler/nir/nir_print.c                       |   1 +
 src/compiler/spirv/GLSL.ext.AMD.h                  |  93 +++
 src/compiler/spirv/nir_spirv.h                     |   2 +
 src/compiler/spirv/spirv_to_nir.c                  |  32 +-
 src/compiler/spirv/vtn_amd.c                       | 281 ++++++++
 src/compiler/spirv/vtn_private.h                   |   9 +
 src/intel/compiler/brw_nir.c                       |   2 +-
 17 files changed, 1846 insertions(+), 179 deletions(-)
 rename src/compiler/nir/{nir_lower_read_invocation_to_scalar.c => nir_lower_cross_thread_to_scalar.c} (56%)
 create mode 100644 src/compiler/nir/nir_lower_group_reduce.c
 create mode 100644 src/compiler/spirv/GLSL.ext.AMD.h
 create mode 100644 src/compiler/spirv/vtn_amd.c

-- 
2.9.4


