[Mesa-dev] [PATCH 00/38] radv, ac: 16-bit and 8-bit arithmetic and 8-bit storage

Tue Feb 12 17:02:11 UTC 2019

It currently requires review (and possibly rebasing). Marek Olšák sent
some feedback on a few of the patches, but other than that it hasn't
gotten much attention.

Also, patch 35 (running LLVM's SLP vectorizer) seems to vectorize
32-bit code as well, which can help or hurt individual shaders quite a
bit and seems to hurt shaders overall. I'm not yet sure how to solve
this without removing the patch or significantly changing the result
of LLVM's SLP vectorizer.
IIRC enabling the SLP vectorizer also uncovered a register allocation
(RA) bug with a shader.

I think I'll look into the issues with patch 35 again.

On Tue, 12 Feb 2019 at 16:30, Samuel Pitoiset <samuel.pitoiset at gmail.com> wrote:
>
> What's the status of this?
>
> On 12/7/18 6:21 PM, Rhys Perry wrote:
> > This series adds support for:
> > - VK_KHR_shader_float16_int8
> > - VK_AMD_gpu_shader_half_float
> > - VK_AMD_gpu_shader_int16
> > - VK_KHR_8bit_storage
> > on VI+. Half floats are currently disabled on LLVM 7 because of a bug
> > causing large memory usage and long (or unbounded) compilation times with
> > some tests.
> >
> > It depends on the following patch series:
> > - https://patchwork.freedesktop.org/series/53454/
> > - https://patchwork.freedesktop.org/series/53602/
> > - https://patchwork.freedesktop.org/series/53660/
> >
> > An older version was tested on my Polaris card, but due to hardware issues
> > I currently can't test the latest version of the series.
> >
> > deqp-vk has no regressions and none of the newly enabled tests fail.
> >
> > Rhys Perry (38):
> >    ac: add various helpers for float16/int16/int8
> >    ac/nir: implement 8-bit push constant, ssbo and ubo loads
> >    ac/nir: implement 8-bit ssbo stores
> >    ac/nir: fix 16-bit ssbo stores
> >    ac/nir: implement 8-bit nir_load_const_instr
> >    ac/nir: implement 8-bit conversions
> >    ac/nir: fix 64-bit nir_op_f2f16_rtz
> >    ac/nir: make ac_build_clamp work on all bit sizes
> >    ac/nir: make ac_build_fract work on all bit sizes
> >    ac/nir: make ac_build_isign work on all bit sizes
> >    ac/nir: make ac_build_fsign work on all bit sizes
> >    ac/nir: make ac_build_fdiv support 16-bit floats
> >    ac/nir: implement half-float nir_op_frcp
> >    ac/nir: implement half-float nir_op_frsq
> >    ac/nir: implement half-float nir_op_ldexp
> >    radv: lower 16-bit flrp
> >    ac/nir: support half floats in emit_b2f
> >    ac/nir: make emit_b2i work on all bit sizes
> >    ac/nir: implement 16-bit shifts
> >    compiler/nir: add lowering option for 16-bit ffma
> >    ac/nir: implement 16-bit ac_build_ddxy
> >    ac/nir: implement 8 and 16 bit ac_build_readlane
> >    nir: make bitfield_reverse and ifind_msb work with all integers
> >    ac/nir: make ac_find_lsb work on all bit sizes
> >    ac/nir: make ac_build_umsb work on all bit sizes
> >    ac/nir: implement 8 and 16 bit ac_build_imsb
> >    ac/nir: make ac_build_bit_count work on all bit sizes
> >    ac/nir: make ac_build_bitfield_reverse work on all bit sizes
> >    ac/nir: implement 16-bit pack/unpack opcodes
> >    ac/nir: add 8-bit and 16-bit types to glsl_base_to_llvm_type
> >    ac/nir,radv: create an array of varying output types
> >    ac/nir: store all outputs as f32
> >    radv: store all fragment shader inputs as f32
> >    radv: handle all fragment output types
> >    ac,radv: run LLVM's SLP vectorizer
> >    ac/nir: generate better code for nir_op_f2f16_rtz
> >    ac/nir: have nir_op_f2f16 round to zero
> >    radv: expose float16, int16 and int8 features and extensions
> >
> >   src/amd/common/ac_llvm_build.c        | 355 ++++++++++++++------------
> >   src/amd/common/ac_llvm_build.h        |  22 +-
> >   src/amd/common/ac_llvm_util.c         |   9 +-
> >   src/amd/common/ac_llvm_util.h         |   1 +
> >   src/amd/common/ac_nir_to_llvm.c       | 258 +++++++++++++++----
> >   src/amd/common/ac_shader_abi.h        |   1 +
> >   src/amd/vulkan/radv_device.c          |  17 ++
> >   src/amd/vulkan/radv_extensions.py     |   4 +
> >   src/amd/vulkan/radv_nir_to_llvm.c     |  92 ++++---
> >   src/amd/vulkan/radv_shader.c          |   7 +
> >   src/broadcom/compiler/nir_to_vir.c    |   1 +
> >   src/compiler/nir/nir.h                |   1 +
> >   src/compiler/nir/nir_opcodes.py       |   4 +-
> >   src/compiler/nir/nir_opt_algebraic.py |   4 +-
> >   src/gallium/drivers/radeonsi/si_get.c |   1 +
> >   src/gallium/drivers/vc4/vc4_program.c |   1 +
> >   16 files changed, 516 insertions(+), 262 deletions(-)
> >