[Mesa-dev] [PATCH v3 00/43] anv: SPV_KHR_16bit_storage/VK_KHR_16bit_storage for gen8+

Thu Nov 2 17:40:40 UTC 2017

On 02/11/17 at 01:43, Jason Ekstrand wrote:
> I'm done reading for the day.  As you're working on incorporating
> feedback, I'd  like you to re-arrange things a bit so that we do
> everything required to enable VK_KHR_16bit_storage (including
> advertising the Vulkan extension string) for SSBOs and UBOs first and
> then enable it for push constants and enable it for inputs/outputs
> last.  This way we can land the most important part (UBOs and SSBOs)
> soon and the more annoying parts can get the review time that they need.

I think that is a good approach. I'll reorder the series so we can land
and enable the UBO/SSBO support first, without the other capabilities.

Chema

>
> On Mon, Oct 30, 2017 at 5:20 PM, Jason Ekstrand <jason at jlekstrand.net>
> wrote:
>
>     Patches 1-5, 8-11, and 13-18 are
>
>     Reviewed-by: Jason Ekstrand <jason at jlekstrand.net>
>
>     On Mon, Oct 16, 2017 at 8:23 AM, Pohjolainen, Topi
>     <topi.pohjolainen at gmail.com> wrote:
>
>         On Mon, Oct 16, 2017 at 08:03:41AM -0700, Jason Ekstrand wrote:
>         > FYI: I'm planning to review this some time this week. Probably
>         > not today though.
>
>         Great, I was hoping you would. I'm just reading out of curiosity
>         and asking random questions. Mostly trying to remind myself how
>         the compiler works :) It has been a while since I had anything to
>         do with it.
>
>         >
>         > On Thu, Oct 12, 2017 at 11:37 AM, Jose Maria Casanova Crespo <
>         > jmcasanova at igalia.com> wrote:
>         >
>         > > Hello,
>         > >
>         > > this is the V3 series for the implementation of the
>         > > SPV_KHR_16bit_storage and VK_KHR_16bit_storage extensions on the anv
>         > > vulkan driver, in addition to the GLSL and NIR support needed.
>         > >
>         > > The original series can be found here [1], and the V2 is available
>         > > here [2].
>         > >
>         > > In short V3 includes the following:
>         > >
>         > >  * Updates on several patches after the review of the V2 series.
>         > >    This includes some squashes, and especially changes so 16-bit
>         > >    types are always packed, not using stride 2 by default.
>         > >    This implied a re-implementation of all load_input/store_output
>         > >    intrinsics for 16-bit. The new solution shuffles and unshuffles
>         > >    16-bit components in 32-bit URB write and read operations. This
>         > >    saves space in the URB writes and reduces the register pressure
>         > >    by using just half of the space.
>         > >
>         > >  * 5 patches have been removed from the v2 series because we no
>         > >    longer assume stride 2 for 16-bit registers. We also removed the
>         > >    patch of reuse_16bit_conversion_register. The problems related
>         > >    to spilling that motivated that patch were better addressed by
>         > >    Curro's liveness patch.
>         > >
>         > >    i965/fs: Set stride 2 when dealing with 16-bit floats/ints
>         > >    i965/fs: Retype 16-bit/stride2 movs to UD on nir_op_vecX
>         > >    i965/fs: Need to allocate as minimum 32-bit register
>         > >    i965/fs: Update assertion on copy propagation
>         > >    i965/fs: Add reuse_16bit_conversions_register optimization
>         > >
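As a rough illustration of the packed layout described above, here is a minimal C sketch of shuffling 16-bit components into 32-bit dwords and unshuffling them again. This is not the code from the series: the helper names, the low-half-first ordering and the zero padding of an odd trailing component are all assumptions made for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from the series): pack `count` 16-bit
 * components into 32-bit dwords, two per dword, low half first.
 * An odd trailing component is padded with zero in the high half.
 * Returns the number of dwords written. */
static size_t
pack_16bit_components(const uint16_t *src, size_t count, uint32_t *dst)
{
   size_t dwords = (count + 1) / 2;

   for (size_t i = 0; i < dwords; i++) {
      uint32_t lo = src[2 * i];
      uint32_t hi = (2 * i + 1 < count) ? src[2 * i + 1] : 0;
      dst[i] = lo | (hi << 16);
   }

   return dwords;
}

/* Hypothetical inverse: extract `count` 16-bit components from the
 * packed dwords produced by pack_16bit_components(). */
static void
unpack_16bit_components(const uint32_t *src, size_t count, uint16_t *dst)
{
   for (size_t i = 0; i < count; i++) {
      uint32_t dword = src[i / 2];
      dst[i] = (uint16_t)((i & 1) ? (dword >> 16) : (dword & 0xffff));
   }
}
```

With this layout a vec3 of 16-bit values takes two dwords instead of three, which is where the space saving mentioned above comes from.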
>         > > Finally an updated overview of the patches:
>         > >
>         > > Patches 1-2 add 16-bit float, int and uint types to GLSL. This is
>         > > needed because NIR uses GLSL types internally. We use the enums
>         > > already defined at AMD_gpu_shader_half_float and NV_gpu_shader
>         > > extensions. Patch 4 updates mesa/st, in order to avoid warnings for
>         > > types not handled on a switch.
>         > >
>         > > Patches 3-6 add NIR support for those new GLSL 16-bit types,
>         > > conversion opcodes, and rounding modes for float to half-float
>         > > conversions.
>         > >
>         > > Patches 7-9 add the SPIR-V (SPV_KHR_16bit_storage) to NIR support.
>         > >
>         > > Patches 10-13 add general 16-bit support for i965. This includes
>         > > handling of new types on several general purpose methods, and
>         > > updating or removing some asserts.
>         > >
>         > > Patches 14-18 add support for 32 to 16-bit conversions for i965,
>         > > including rounding mode opcodes (needed for float to half-float
>         > > conversions), and an optimization that removes superfluous rounding
>         > > mode sets.
>         > >
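To make the idea of removing superfluous rounding mode sets concrete, here is a minimal C sketch of such a pass over a single straight-line block. It is only an illustration under assumed data structures: the `struct inst` layout, the opcode and mode enums, and the function signature are hypothetical and are not the i965 IR used by the actual patch.

```c
#include <stddef.h>

/* Hypothetical rounding modes and opcodes, for illustration only. */
enum rnd_mode { RND_MODE_UNSET = -1, RND_MODE_RTE, RND_MODE_RTZ };
enum opcode { OP_SET_RND_MODE, OP_F2F16, OP_OTHER };

struct inst {
   enum opcode op;
   enum rnd_mode mode; /* only meaningful for OP_SET_RND_MODE */
};

/* Compact `insts` in place, dropping every OP_SET_RND_MODE that selects
 * the mode that is already active. The mode is treated as unknown at the
 * start of the block, so the first set is always kept.
 * Returns the new instruction count. */
static size_t
remove_extra_rounding_modes(struct inst *insts, size_t count)
{
   enum rnd_mode current = RND_MODE_UNSET;
   size_t kept = 0;

   for (size_t i = 0; i < count; i++) {
      if (insts[i].op == OP_SET_RND_MODE) {
         if (insts[i].mode == current)
            continue; /* redundant: this mode is already selected */
         current = insts[i].mode;
      }
      insts[kept++] = insts[i];
   }

   return kept;
}
```

A sequence of several f2f16 conversions that all use the same rounding mode then needs only one mode set in front of the first conversion.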
>         > > Patch 19 adds 16-bit support for constant location.
>         > >
>         > > Patches 20-24 add and use two new messages: byte scattered read and
>         > > write. Those were needed because the untyped surface message has a
>         > > fixed 32-bit write size. Those messages are used on the 16-bit
>         > > support of store SSBO, load SSBO, load UBO and load shared.
>         > >
>         > > Patches 25-29 implement 16-bit vertex attribute inputs support on
>         > > i965. These include changes on anv. This was needed because 16-bit
>         > > surface formats do implicit conversion to 32-bit. To work around
>         > > this, we override the 16-bit surface format, and use 32-bit ones.
>         > >
>         > > Patch 30 implements load input and load store for all intra stage.
>         > > This patch substitutes the previous simple patch i965/fs: Set
>         > > stride 2 when dealing with 16-bit floats/ints.
>         > >
>         > > Patches 31-37 implement 16-bit store output support for fragment
>         > > shaders on i965.
>         > >
>         > > Patches 38-41 are the new patches included in V2. Three of them are
>         > > improvements over V1 that don't fix any execution problem, but they
>         > > improve performance by reducing the use of multiple scattered
>         > > messages for untyped read/write operations. The 16-bit CTS tests
>         > > pass without them. The other one would fix a real problem (patch
>         > > 41), but unfortunately there is no CTS test catching it yet.
>         > >
>         > > Patches 42-43 enable both extensions on anv vulkan driver.
>         > >
>         > > [1] https://lists.freedesktop.org/archives/mesa-dev/2017-July/162791.html
>         > > [2] https://lists.freedesktop.org/archives/mesa-dev/2017-August/167455.html
>         > >
>         > > Alejandro Piñeiro (14):
>         > >   i965/vec4: Handle 16-bit types at type_size_xvec4
>         > >   i965/fs: Add brw_reg_type_from_bit_size utility method
>         > >   i965/fs: Remove BRW_REGISTER_TYPE_HF assert at get_exec_type
>         > >   i965/fs: Handle 32-bit to 16-bit conversions
>         > >   i965/fs: Define new shader opcode to set rounding modes
>         > >   i965/fs: Enable rounding mode on f2f16 ops
>         > >   i965/fs: Add remove_extra_rounding_modes optimization
>         > >   i965/fs: Adjust type_size/type_slots on store_ssbo
>         > >   i965/fs: Use byte_scattered_write on 16-bit store_ssbo
>         > >   anv/pipeline: Use 32-bit surface formats for 16-bit formats
>         > >   anv/cmd_buffer: Add a padding to the vertex buffer
>         > >   i965/fs: Use half_precision data_format on 16-bit fb writes
>         > >   i965/fs: Predicate byte scattered writes if needed
>         > >   anv: Enable VK_KHR_16bit_storage
>         > >
>         > > Eduardo Lima Mitev (8):
>         > >   glsl: Add 16-bit types
>         > >   mesa/st: Handle 16-bit types at st_glsl_storage_type_size()
>         > >   nir: Add support for 16-bit types (half float, int16 and uint16)
>         > >   nir: Populate conversion opcodes to/from 16-bit types
>         > >   spirv/nir: Handle 16-bit types
>         > >   spirv/nir: Add support for SPV_KHR_16bit_storage
>         > >   i965/fs: Optimize 16-bit SSBO stores by packing two into a 32-bit reg
>         > >   anv: Enable SPV_KHR_16bit_storage on gen 8+
>         > >
>         > > Jose Maria Casanova Crespo (21):
>         > >   nir: Add rounding modes enum
>         > >   nir: Handle fp16 rounding modes at nir_type_conversion_op
>         > >   spirv: Enable FPRoundingMode decorator to nir operations
>         > >   i965: Support for 16-bit base types in helper functions
>         > >   i965: Add support for control register
>         > >   i965/fs: Support push constants of 16-bit types
>         > >   i965/fs: Add byte scattered write message and fs support
>         > >   i965/fs: Add byte scattered read message and fs support
>         > >   i965/fs: Use byte scattered read
>         > >   compiler: Mark when input/ouput attribute at VS uses 16-bit
>         > >   i965/compiler: includes 16-bit vertex input
>         > >   i965/fs: Unpack 16-bit from 32-bit components in VS load_input
>         > >   i965/fs: Support 16-bit types at load_input and store_output
>         > >   i965/fs: Enable Render Target Write for 16-bit outputs
>         > >   i965/fs: Include support for SEND data_format bit for Render Targets
>         > >   i965/disasm: Show half-precision data_format on rt_writes
>         > >   i965/fs: Mark 16-bit outputs on FS store_output
>         > >   i965/fs: 16-bit source payloads always use 1 register
>         > >   i965/fs: Enable 16-bit render target write on SKL and CHV
>         > >   i965/fs: Enables 16-bit load_ubo with sampler
>         > >   i965/fs: Use untyped_surface_read for 16-bit load_ssbo
>         > >
>         > >  src/compiler/builtin_type_macros.h              |  26 ++
>         > >  src/compiler/glsl/ast_to_hir.cpp                |   3 +
>         > >  src/compiler/glsl/glsl_to_nir.cpp               |   9 +-
>         > >  src/compiler/glsl/ir_clone.cpp                  |   3 +
>         > >  src/compiler/glsl/link_uniform_initializers.cpp |   3 +
>         > >  src/compiler/glsl/lower_buffer_access.cpp       |   3 +-
>         > >  src/compiler/glsl_types.cpp                     |  93 ++++-
>         > >  src/compiler/glsl_types.h                       |  34 +-
>         > >  src/compiler/nir/nir.c                          |   6 +
>         > >  src/compiler/nir/nir.h                          |  22 +-
>         > >  src/compiler/nir/nir_gather_info.c              |  23 +-
>         > >  src/compiler/nir/nir_opcodes.py                 |  10 +-
>         > >  src/compiler/nir/nir_opcodes_c.py               |  17 +-
>         > >  src/compiler/nir/nir_split_var_copies.c         |   6 +
>         > >  src/compiler/nir_types.cpp                      |  24 ++
>         > >  src/compiler/nir_types.h                        |   9 +
>         > >  src/compiler/shader_info.h                      |   2 +
>         > >  src/compiler/spirv/nir_spirv.h                  |   1 +
>         > >  src/compiler/spirv/spirv_to_nir.c               |  53 ++-
>         > >  src/compiler/spirv/vtn_alu.c                    |  34 +-
>         > >  src/compiler/spirv/vtn_variables.c              |  21 ++
>         > >  src/intel/compiler/brw_compiler.h               |   1 +
>         > >  src/intel/compiler/brw_disasm.c                 |   4 +
>         > >  src/intel/compiler/brw_eu.h                     |  23 +-
>         > >  src/intel/compiler/brw_eu_defines.h             |  36 ++
>         > >  src/intel/compiler/brw_eu_emit.c                | 188 +++++++++-
>         > >  src/intel/compiler/brw_fs.cpp                   | 128 ++++++-
>         > >  src/intel/compiler/brw_fs.h                     |  12 +
>         > >  src/intel/compiler/brw_fs_copy_propagation.cpp  |   8 +-
>         > >  src/intel/compiler/brw_fs_generator.cpp         |  28 +-
>         > >  src/intel/compiler/brw_fs_nir.cpp               | 458 ++++++++++++++++++++++--
>         > >  src/intel/compiler/brw_fs_surface_builder.cpp   |  32 +-
>         > >  src/intel/compiler/brw_fs_surface_builder.h     |  14 +
>         > >  src/intel/compiler/brw_fs_visitor.cpp           |   6 +
>         > >  src/intel/compiler/brw_inst.h                   |   1 +
>         > >  src/intel/compiler/brw_ir_fs.h                  |   3 -
>         > >  src/intel/compiler/brw_nir.c                    |  16 +
>         > >  src/intel/compiler/brw_reg.h                    |   6 +
>         > >  src/intel/compiler/brw_shader.cpp               |  23 ++
>         > >  src/intel/compiler/brw_shader.h                 |   7 +
>         > >  src/intel/compiler/brw_vec4.cpp                 |   1 +
>         > >  src/intel/compiler/brw_vec4_generator.cpp       |   3 +-
>         > >  src/intel/compiler/brw_vec4_visitor.cpp         |   3 +
>         > >  src/intel/vulkan/anv_device.c                   |  13 +
>         > >  src/intel/vulkan/anv_extensions.py              |   1 +
>         > >  src/intel/vulkan/anv_pipeline.c                 |   1 +
>         > >  src/intel/vulkan/genX_cmd_buffer.c              |  20 +-
>         > >  src/intel/vulkan/genX_pipeline.c                |  47 +++
>         > >  src/mesa/program/ir_to_mesa.cpp                 |   6 +
>         > >  src/mesa/state_tracker/st_glsl_types.cpp        |   3 +
>         > >  50 files changed, 1403 insertions(+), 91 deletions(-)
>         > >
>         > > --
>         > > 2.13.6
>         > >
>         > > _______________________________________________
>         > > mesa-dev mailing list
>         > > mesa-dev at lists.freedesktop.org
>         > > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>         > >