[Mesa-dev] [PATCH 00/45] anv: SPV_KHR_16bit_storage/VK_KHR_16bit_storage for gen8+

Thu Jul 13 14:35:04 UTC 2017

Hello,

the following series implements the SPV_KHR_16bit_storage and
VK_KHR_16bit_storage extensions for the anv Vulkan driver, together
with the GLSL and NIR support they need. The GLSL and NIR changes can
also serve as the base for future OpenGL extensions providing support
for 16-bit types.

The specifications for those extensions are available at [1][2][3].

An overview of the series:

Patch 1 updates the Vulkan headers and vk.xml so that they include
those extensions. Some extensions were renamed from KHX to KHR, and
that update will probably land before this series. In that case, this
patch will not be needed.

Patches 2-3 add 16-bit float, int and uint types to GLSL. This is
needed because NIR uses GLSL types internally. We use the enums
already defined by the AMD_gpu_shader_half_float and NV_gpu_shader
extensions. Patch 4 updates mesa/st to avoid warnings about types not
handled in a switch.

Patches 4-7 add NIR support for those new GLSL 16-bit types,
conversion opcodes, and rounding modes for float to half-float
conversions.
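
As a rough illustration of why the rounding mode matters for float to
half-float conversions, here is a minimal, self-contained sketch of
such a conversion in plain C. It is not code from the series: the
enum and function names are hypothetical, and it only models the RTE
(round to nearest even) and RTZ (round toward zero) values of the
SPIR-V FPRoundingMode decoration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical names for this sketch; not taken from the patches. */
enum f16_rounding {
   F16_ROUND_RTE, /* round to nearest, ties to even */
   F16_ROUND_RTZ, /* round toward zero (truncate) */
};

/* Convert a 32-bit float to IEEE 754 binary16 bits. */
static uint16_t
float_to_half(float f, enum f16_rounding mode)
{
   uint32_t bits;
   assert(sizeof(f) == sizeof(bits));
   memcpy(&bits, &f, sizeof(bits));

   const uint32_t sign = (bits >> 16) & 0x8000u;
   const uint32_t exp = (bits >> 23) & 0xffu;
   uint32_t man = bits & 0x7fffffu;

   /* Inf stays Inf, NaN stays a quiet NaN. */
   if (exp == 0xffu)
      return (uint16_t)(sign | 0x7c00u | (man ? 0x200u : 0u));

   /* Rebias the exponent from 127 (float) to 15 (half). */
   int32_t e = (int32_t)exp - 127 + 15;

   /* Too large for a half: RTE overflows to Inf, RTZ clamps to the
    * largest finite value. */
   if (e >= 31)
      return (uint16_t)(sign | (mode == F16_ROUND_RTE ? 0x7c00u : 0x7bffu));

   /* Too small to reach even the smallest subnormal in either mode. */
   if (e < -10)
      return (uint16_t)sign;

   uint32_t shift = 13; /* 23 float mantissa bits -> 10 half bits */
   if (e <= 0) {
      /* Subnormal result: make the implicit one explicit and shift
       * it down into the mantissa. */
      man |= 0x800000u;
      shift = (uint32_t)(14 - e);
      e = 0;
   }

   /* Truncated result; this is already the RTZ answer. */
   uint32_t result = ((uint32_t)e << 10) + (man >> shift);

   if (mode == F16_ROUND_RTE) {
      const uint32_t rem = man & ((1u << shift) - 1u);
      const uint32_t halfway = 1u << (shift - 1u);
      /* A carry out of the mantissa correctly bumps the exponent. */
      if (rem > halfway || (rem == halfway && (result & 1u)))
         result++;
   }

   return (uint16_t)(sign | result);
}
```

With this sketch, 65520.0f converts to 0x7c00 (+Inf) under RTE but to
0x7bff (65504, the largest finite half) under RTZ, so the two modes
can give visibly different results for the same input.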

Patches 8-10 add SPIR-V to NIR support for SPV_KHR_16bit_storage.

Patches 11-19 add general 16-bit support for i965. This includes
handling the new types in several general-purpose methods, updating
or removing some asserts, setting the stride to 2 in most cases
(details in each patch), and support for copy propagation.

Patches 20-24 add support for 32 to 16-bit conversions for i965,
including rounding mode opcodes (needed for float to half-float
conversions), and an optimization that removes superfluous rounding
mode sets.
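
To give an idea of what removing superfluous rounding mode sets
means, here is a simplified sketch in plain C. It is not the pass
from the series: the instruction model and all names are
hypothetical, and it assumes a single basic block in which only an
explicit "set rounding mode" instruction can change the mode.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily simplified instruction model. */
enum sketch_opcode {
   SKETCH_OP_SET_RND_MODE,
   SKETCH_OP_CONVERT,
   SKETCH_OP_OTHER,
};

struct sketch_inst {
   enum sketch_opcode op;
   int mode; /* only meaningful for SKETCH_OP_SET_RND_MODE */
};

/* Drop every rounding mode set that selects the mode that is already
 * active. The array is compacted in place and the new instruction
 * count is returned. */
static size_t
remove_redundant_rounding_modes(struct sketch_inst *insts, size_t count)
{
   assert(insts != NULL || count == 0);

   int current = -1; /* mode is unknown at the start of the block */
   size_t out = 0;

   for (size_t i = 0; i < count; i++) {
      if (insts[i].op == SKETCH_OP_SET_RND_MODE) {
         if (insts[i].mode == current)
            continue; /* superfluous: the mode would not change */
         current = insts[i].mode;
      }
      insts[out++] = insts[i];
   }

   return out;
}
```

A real pass also has to reset its idea of the current mode at block
boundaries, which this sketch leaves out.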

Patch 25 adds 16-bit support for constant location.

Patches 26-30 add and use two new messages: byte scattered read and
byte scattered write. They are needed because the untyped surface
message has a fixed 32-bit write size. The new messages are used for
the 16-bit support of store SSBO, load SSBO, load UBO and load shared.
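
The problem with a fixed 32-bit write size can be shown with a small
CPU-side model in plain C. This is only an illustration, not driver
code: both helper names are hypothetical, and the buffer is a plain
byte array laid out in little-endian order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Model of a byte-granular store: writes only the low nbytes bytes of
 * value, in little-endian order. */
static void
store_byte_scattered(uint8_t *buf, size_t offset, uint32_t value,
                     unsigned nbytes)
{
   assert(nbytes == 1 || nbytes == 2 || nbytes == 4);

   for (unsigned i = 0; i < nbytes; i++)
      buf[offset + i] = (uint8_t)(value >> (8 * i));
}

/* Model of a store with a fixed 32-bit write size: always writes four
 * bytes, whatever the size of the value being stored. */
static void
store_untyped_dword(uint8_t *buf, size_t offset, uint32_t value)
{
   store_byte_scattered(buf, offset, value, 4);
}
```

For a buffer holding the two 16-bit values 0x1111 and 0x2222, storing
0xabcd into the first one with store_untyped_dword() also overwrites
the second one with zero, while store_byte_scattered() with nbytes set
to 2 leaves it untouched.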

Patches 31-35 implement support for 16-bit vertex attribute inputs on
i965, including some changes to anv. This is needed because 16-bit
surface formats do an implicit conversion to 32-bit. To work around
this, we override the 16-bit surface formats and use 32-bit ones
instead.
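
Once the data is fetched through a 32-bit format, each 32-bit
component carries two 16-bit values that the shader has to unpack.
The following plain C sketch shows the idea. It is not the code from
the series: the function name is hypothetical, and it assumes that
the first 16-bit value sits in the low half of each 32-bit component.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Unpack count16 16-bit values from an array of 32-bit components.
 * Assumption of this sketch: even-indexed values come from the low
 * 16 bits of a component, odd-indexed values from the high 16 bits. */
static void
unpack_16bit_from_32bit(const uint32_t *packed, size_t count16,
                        uint16_t *out)
{
   assert(packed != NULL && out != NULL);

   for (size_t i = 0; i < count16; i++) {
      const uint32_t component = packed[i / 2];
      out[i] = (i & 1) ? (uint16_t)(component >> 16)
                       : (uint16_t)(component & 0xffffu);
   }
}
```

For example, a three-component 16-bit attribute needs two 32-bit
components, and the high half of the second one is padding.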

Patches 36-42 implement 16-bit store output support for fragment
shaders on i965.

Patch 43 adds a custom optimization that helps reduce the pressure on
the register allocator (we found some CTS tests that needed several
minutes to compile).

Patches 44-45 enable both extensions on the anv Vulkan driver.

[1] https://github.com/KhronosGroup/Vulkan-Docs/blob/1.0/doc/specs/vulkan/appendices/VK_KHR_16bit_storage.txt
[2] https://www.khronos.org/registry/vulkan/specs/1.0-wsi_extensions/html/vkspec.html#VK_KHR_16bit_storage
[3] https://www.khronos.org/registry/spir-v/extensions/KHR/SPV_KHR_16bit_storage.html

Alejandro Piñeiro (18):
  vulkan: Update registry and headers to 1.0.54
  i965/vec4: Handle 16-bit types at type_size_xvec4
  i965/fs: Add brw_reg_type_from_bit_size utility method
  i965/fs: Remove BRW_REGISTER_TYPE_HF assert at get_exec_type
  i965/fs: Retype 16-bit/stride2 movs to UD on nir_op_vecX
  i965/fs: Need to allocate as minimum 32-bit register
  i965/fs: Update assertion on copy propagation
  i965/fs: Handle 32-bit to 16-bit conversions
  i965/fs: Define new shader opcodes to set rounding modes
  i965/fs: Enable rounding mode on f2f16 ops
  i965/fs: Add remove_extra_rounding_modes optimization
  i965/fs: Adjust type_size/type_slots on store_ssbo
  i965/fs: Use byte_scattered_write on 16-bit store_ssbo
  anv/pipeline: Use 32-bit surface formats for 16-bit formats
  anv/cmd_buffer: Add a padding to the vertex buffer
  i965/fs: Use half_precision data_format on 16-bit fb writes
  i965/fs: Add reuse_16bit_conversions_register optimization
  anv: Enable VK_KHR_16bit_storage

Eduardo Lima Mitev (8):
  glsl: Add 16-bit types
  mesa/st: Handle 16-bit types at st_glsl_attrib_type_size
  nir: Add support for 16-bit types (half float, int16 and uint16)
  nir: Populate conversion opcodes to/from 16-bit types
  spirv/nir: Add support for SPV_KHR_16bit_storage
  spirv/nir: Handle 16-bit types
  i965/fs: Handle 16-bit base types in helper functions
  anv: Enable SPV_KHR_16bit_storage on gen 8+

Jose Maria Casanova Crespo (19):
  nir: Add rounding mode enum
  nir: Handle fp16 rounding modes at nir_type_conversion_op
  spirv: Enable FPRoundingMode decorator to nir operations
  i965/nir: Support for 16-bit types
  i965/fs: Set stride 2 when dealing with 16-bit floats/ints
  i965: Add support for control register
  i965/fs: Support push constants of 16-bit types
  i965/fs: Add byte scattered write message and fs support
  i965/fs: Add byte scattered read message and fs support
  i965/fs: Use byte scattered read
  compiler: Mark when input/ouput attribute at VS uses 16-bit
  i965/compiler: includes 16-bit vertex input
  i965/fs: Unpack 16-bit from 32-bit components in VS load_input
  i965/fs: Enable Render Target Write for 16-bit outputs
  i965/fs: Include support for SEND data_format bit for Render Targets
  i965/disasm: Show half-precision data_format on rt_writes
  i965/fs: Mark 16-bit outputs on FS store_output
  i965/fs: 16-bit source payloads always use 1 register
  i965/fs: Enable 16-bit render target write on SKL and CHV

 include/vulkan/vulkan.h                         | 1284 ++++++++++++++++-------
 src/compiler/builtin_type_macros.h              |   26 +
 src/compiler/glsl/ast_to_hir.cpp                |    3 +
 src/compiler/glsl/builtin_types.cpp             |    1 +
 src/compiler/glsl/glsl_to_nir.cpp               |    3 +-
 src/compiler/glsl/ir_clone.cpp                  |    3 +
 src/compiler/glsl/link_uniform_initializers.cpp |    3 +
 src/compiler/glsl/lower_buffer_access.cpp       |   16 +-
 src/compiler/glsl_types.cpp                     |   93 +-
 src/compiler/glsl_types.h                       |   34 +-
 src/compiler/nir/nir.c                          |    6 +
 src/compiler/nir/nir.h                          |   20 +-
 src/compiler/nir/nir_gather_info.c              |   23 +-
 src/compiler/nir/nir_opcodes.py                 |   10 +-
 src/compiler/nir/nir_opcodes_c.py               |   17 +-
 src/compiler/nir/nir_split_var_copies.c         |    6 +
 src/compiler/nir_types.cpp                      |   24 +
 src/compiler/nir_types.h                        |    9 +
 src/compiler/shader_info.h                      |    2 +
 src/compiler/spirv/nir_spirv.h                  |    1 +
 src/compiler/spirv/spirv_to_nir.c               |   37 +-
 src/compiler/spirv/vtn_alu.c                    |   35 +-
 src/compiler/spirv/vtn_variables.c              |   21 +
 src/intel/compiler/brw_compiler.h               |    1 +
 src/intel/compiler/brw_disasm.c                 |    4 +
 src/intel/compiler/brw_eu.h                     |   22 +-
 src/intel/compiler/brw_eu_defines.h             |   28 +
 src/intel/compiler/brw_eu_emit.c                |  174 ++-
 src/intel/compiler/brw_fs.cpp                   |  178 +++-
 src/intel/compiler/brw_fs.h                     |    2 +
 src/intel/compiler/brw_fs_builder.h             |    2 +-
 src/intel/compiler/brw_fs_copy_propagation.cpp  |    6 +-
 src/intel/compiler/brw_fs_generator.cpp         |   21 +-
 src/intel/compiler/brw_fs_nir.cpp               |  329 +++++-
 src/intel/compiler/brw_fs_surface_builder.cpp   |   32 +-
 src/intel/compiler/brw_fs_surface_builder.h     |   14 +
 src/intel/compiler/brw_fs_visitor.cpp           |    6 +
 src/intel/compiler/brw_inst.h                   |    1 +
 src/intel/compiler/brw_ir_fs.h                  |    3 -
 src/intel/compiler/brw_nir.c                    |   16 +
 src/intel/compiler/brw_reg.h                    |    6 +
 src/intel/compiler/brw_shader.cpp               |   24 +
 src/intel/compiler/brw_shader.h                 |    7 +
 src/intel/compiler/brw_vec4.cpp                 |    1 +
 src/intel/compiler/brw_vec4_generator.cpp       |    3 +-
 src/intel/compiler/brw_vec4_visitor.cpp         |    3 +
 src/intel/vulkan/anv_device.c                   |   17 +
 src/intel/vulkan/anv_pipeline.c                 |    1 +
 src/intel/vulkan/genX_cmd_buffer.c              |   20 +-
 src/intel/vulkan/genX_pipeline.c                |   47 +
 src/mesa/program/ir_to_mesa.cpp                 |    6 +
 src/mesa/state_tracker/st_glsl_types.cpp        |    3 +
 src/vulkan/registry/vk.xml                      | 1208 ++++++++++++++-------
 53 files changed, 2995 insertions(+), 867 deletions(-)

-- 
2.9.3