[Mesa-dev] [PATCH 00/38] radv, ac: 16-bit and 8-bit arithmetic and 8-bit storage

Rhys Perry pendingchaos02 at gmail.com
Wed Feb 13 20:20:43 UTC 2019


Quite a few of the patches aren't specific to a single extension, as
many make code size-generic and some of the extensions overlap in
functionality.
It might still be possible to roughly order the patches by
functionality, but I'm not sure it would be very useful (possible
order in attachment). I didn't look at the actual content of the
patches when creating the attachment; it is based on memory and on
the patch descriptions.
Would you like me to send out a v2 of this series ordered like that?
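
To illustrate what "size-generic" means for the helpers mentioned above,
here is a minimal sketch in plain C. It is not the Mesa code: the names
generic_bit_count, generic_find_lsb and width_mask are hypothetical, and
the real helpers build LLVM IR rather than computing values directly. The
only point is that one routine takes the operand width as a parameter
instead of assuming 32 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: mask covering the low `bit_size` bits. */
static uint64_t width_mask(unsigned bit_size)
{
    assert(bit_size == 8 || bit_size == 16 ||
           bit_size == 32 || bit_size == 64);
    /* Shifting a 64-bit value by 64 is undefined, so handle it apart. */
    return bit_size == 64 ? UINT64_MAX : (UINT64_C(1) << bit_size) - 1;
}

/* Hypothetical helper: number of set bits within the operand width. */
static unsigned generic_bit_count(uint64_t value, unsigned bit_size)
{
    unsigned count = 0;

    value &= width_mask(bit_size); /* ignore bits above the width */
    while (value) {
        value &= value - 1;        /* clear the lowest set bit */
        count++;
    }
    return count;
}

/* Hypothetical helper: index of the lowest set bit within the operand
 * width, or -1 if no bit is set there. */
static int generic_find_lsb(uint64_t value, unsigned bit_size)
{
    int index = 0;

    value &= width_mask(bit_size);
    if (!value)
        return -1;
    while (!(value & 1)) {
        value >>= 1;
        index++;
    }
    return index;
}
```

With this shape, the same call serves 8-bit, 16-bit, 32-bit and 64-bit
operands, e.g. generic_bit_count(x, 16) for a 16-bit value, which is
why such patches are hard to assign to a single extension.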

On Tue, 12 Feb 2019 at 17:08, Samuel Pitoiset <samuel.pitoiset at gmail.com> wrote:
>
> How about splitting this series into four different parts, one for every
> extension? Is this doable without too much trouble?
>
> On 2/12/19 6:02 PM, Rhys Perry wrote:
> > It currently requires review (and possibly rebasing). Marek Olšák sent
> > some feedback for a few of the patches but other than that, it hasn't
> > gotten much attention.
> >
> > Also, patch 35 seems to vectorize 32-bit code, which can help or hurt
> > shaders quite a bit and seems to hurt shaders overall. I'm not yet
> > sure how to solve this without removing it or changing the result of
> > LLVM's SLP vectorizer significantly.
> > IIRC enabling the SLP vectorizer also uncovered an RA bug with a shader.
> >
> > I think I'll look into the issues with patch 35 again.
> >
> > On Tue, 12 Feb 2019 at 16:30, Samuel Pitoiset <samuel.pitoiset at gmail.com> wrote:
> >> What's the status of this?
> >>
> >> On 12/7/18 6:21 PM, Rhys Perry wrote:
> >>> This series adds support for:
> >>> - VK_KHR_shader_float16_int8
> >>> - VK_AMD_gpu_shader_half_float
> >>> - VK_AMD_gpu_shader_int16
> >>> - VK_KHR_8bit_storage
> >>> on VI+. Half floats are currently disabled on LLVM 7 because of a bug
> >>> causing large memory usage and long (or unbounded) compilation times with
> >>> some tests.
> >>>
> >>> It depends on the following patch series:
> >>> - https://patchwork.freedesktop.org/series/53454/
> >>> - https://patchwork.freedesktop.org/series/53602/
> >>> - https://patchwork.freedesktop.org/series/53660/
> >>>
> >>> An older version was tested on my Polaris card, but due to hardware issues
> >>> I currently can't test the latest version of the series.
> >>>
> >>> deqp-vk has no regressions and none of the newly enabled tests fail.
> >>>
> >>> Rhys Perry (38):
> >>>     ac: add various helpers for float16/int16/int8
> >>>     ac/nir: implement 8-bit push constant, ssbo and ubo loads
> >>>     ac/nir: implement 8-bit ssbo stores
> >>>     ac/nir: fix 16-bit ssbo stores
> >>>     ac/nir: implement 8-bit nir_load_const_instr
> >>>     ac/nir: implement 8-bit conversions
> >>>     ac/nir: fix 64-bit nir_op_f2f16_rtz
> >>>     ac/nir: make ac_build_clamp work on all bit sizes
> >>>     ac/nir: make ac_build_fract work on all bit sizes
> >>>     ac/nir: make ac_build_isign work on all bit sizes
> >>>     ac/nir: make ac_build_fsign work on all bit sizes
> >>>     ac/nir: make ac_build_fdiv support 16-bit floats
> >>>     ac/nir: implement half-float nir_op_frcp
> >>>     ac/nir: implement half-float nir_op_frsq
> >>>     ac/nir: implement half-float nir_op_ldexp
> >>>     radv: lower 16-bit flrp
> >>>     ac/nir: support half floats in emit_b2f
> >>>     ac/nir: make emit_b2i work on all bit sizes
> >>>     ac/nir: implement 16-bit shifts
> >>>     compiler/nir: add lowering option for 16-bit ffma
> >>>     ac/nir: implement 16-bit ac_build_ddxy
> >>>     ac/nir: implement 8 and 16 bit ac_build_readlane
> >>>     nir: make bitfield_reverse and ifind_msb work with all integers
> >>>     ac/nir: make ac_find_lsb work on all bit sizes
> >>>     ac/nir: make ac_build_umsb work on all bit sizes
> >>>     ac/nir: implement 8 and 16 bit ac_build_imsb
> >>>     ac/nir: make ac_build_bit_count work on all bit sizes
> >>>     ac/nir: make ac_build_bitfield_reverse work on all bit sizes
> >>>     ac/nir: implement 16-bit pack/unpack opcodes
> >>>     ac/nir: add 8-bit and 16-bit types to glsl_base_to_llvm_type
> >>>     ac/nir,radv: create an array of varying output types
> >>>     ac/nir: store all outputs as f32
> >>>     radv: store all fragment shader inputs as f32
> >>>     radv: handle all fragment output types
> >>>     ac,radv: run LLVM's SLP vectorizer
> >>>     ac/nir: generate better code for nir_op_f2f16_rtz
> >>>     ac/nir: have nir_op_f2f16 round to zero
> >>>     radv: expose float16, int16 and int8 features and extensions
> >>>
> >>>    src/amd/common/ac_llvm_build.c        | 355 ++++++++++++++------------
> >>>    src/amd/common/ac_llvm_build.h        |  22 +-
> >>>    src/amd/common/ac_llvm_util.c         |   9 +-
> >>>    src/amd/common/ac_llvm_util.h         |   1 +
> >>>    src/amd/common/ac_nir_to_llvm.c       | 258 +++++++++++++++----
> >>>    src/amd/common/ac_shader_abi.h        |   1 +
> >>>    src/amd/vulkan/radv_device.c          |  17 ++
> >>>    src/amd/vulkan/radv_extensions.py     |   4 +
> >>>    src/amd/vulkan/radv_nir_to_llvm.c     |  92 ++++---
> >>>    src/amd/vulkan/radv_shader.c          |   7 +
> >>>    src/broadcom/compiler/nir_to_vir.c    |   1 +
> >>>    src/compiler/nir/nir.h                |   1 +
> >>>    src/compiler/nir/nir_opcodes.py       |   4 +-
> >>>    src/compiler/nir/nir_opt_algebraic.py |   4 +-
> >>>    src/gallium/drivers/radeonsi/si_get.c |   1 +
> >>>    src/gallium/drivers/vc4/vc4_program.c |   1 +
> >>>    16 files changed, 516 insertions(+), 262 deletions(-)
> >>>
-------------- next part --------------
common:
  ac: add various helpers for float16/int16/int8
  ac/nir: make ac_build_clamp work on all bit sizes
  ac/nir: make ac_build_fract work on all bit sizes
  ac/nir: make ac_build_isign work on all bit sizes
  ac/nir: make ac_build_fsign work on all bit sizes
  ac/nir: make emit_b2i work on all bit sizes
  ac/nir: make ac_find_lsb work on all bit sizes
  ac/nir: make ac_build_umsb work on all bit sizes
  ac/nir: make ac_build_bit_count work on all bit sizes
  ac/nir: make ac_build_bitfield_reverse work on all bit sizes
  nir: make bitfield_reverse and ifind_msb work with all integers
  ac/nir: implement 16-bit ac_build_ddxy
  ac/nir: implement 8 and 16 bit ac_build_readlane
  ac/nir: fix 16-bit ssbo stores
  ac/nir,radv: create an array of varying output types
  ac/nir: store all outputs as f32
  radv: store all fragment shader inputs as f32
  radv: handle all fragment output types
VK_KHR_8bit_storage / maybe 8-bit portions of VK_KHR_shader_float16_int8:
  ac/nir: implement 8-bit push constant, ssbo and ubo loads
  ac/nir: implement 8-bit ssbo stores
  ac/nir: implement 8-bit nir_load_const_instr
  ac/nir: implement 8-bit conversions
VK_AMD_gpu_shader_half_float / fp16 portions of VK_KHR_shader_float16_int8:
  ac/nir: fix 64-bit nir_op_f2f16_rtz
  ac/nir: make ac_build_fdiv support 16-bit floats
  ac/nir: implement half-float nir_op_frcp
  ac/nir: implement half-float nir_op_frsq
  ac/nir: implement half-float nir_op_ldexp
  radv: lower 16-bit flrp
  ac/nir: support half floats in emit_b2f
  compiler/nir: add lowering option for 16-bit ffma
  ac/nir: generate better code for nir_op_f2f16_rtz
  ac/nir: have nir_op_f2f16 round to zero
  ac,radv: run LLVM's SLP vectorizer
VK_AMD_gpu_shader_int16:
  ac/nir: implement 16-bit pack/unpack opcodes
  ac/nir: implement 16-bit shifts
  ac/nir: implement 8 and 16 bit ac_build_imsb
  ac/nir: add 8-bit and 16-bit types to glsl_base_to_llvm_type
common:
  radv: expose float16, int16 and int8 features and extensions

