[Mesa-dev] [PATCH v4 00/44] anv: SPV_KHR_16bit_storage/VK_KHR_16bit_storage for gen8+
Chema Casanova
jmcasanova at igalia.com
Wed Dec 6 00:19:16 UTC 2017
On 05/12/17 18:31, Chema Casanova wrote:
> El 05/12/17 a las 06:16, Jason Ekstrand escribió:
>> A couple of notes:
>>
>> 1) I *think* I gave you enough reviews to land the UBO/SSBO part and
>> the optimizations in 26-28. If reviews are still missing anywhere,
>> please let me know. If not, let's try and get that part landed.
>
> The series is almost ready to land, I have only pending to address your
> feedback about use untyped_read for reading vec3 ssbos.
>
> The only missing explicit R-b is that " [PATCH v4 28/44] i965/fs: Use
> untyped_surface_read for 16-bit load_ssbo" and "[PATCH v4 23/44]
> i965/fs: Enables 16-bit load_ubo with sampler" i've just answered your
> review to confirm the R-b.
>
> I expect to finish today vec3 ssbo and send the series to Jenkins before
> landing, confirm your "pending" R-b, do a last rebase to master and ask
> for a push.
I've just prepared a rebased branch with the reviewed commits ready to
land to enable VK_KHR_16bit_storage support for SSBO/UBO.
https://github.com/Igalia/mesa/tree/wip/VK_KHR_16bit_storage-v4-ubo-ssbo-to-land
As I don't have still commit access to mesa, maybe Eduardo or Alejandro
can land it for me tomorrow. But, Jason feel free to push it if you want.
Chema
>> 2) I send out a patch to rewrite assign_constant_locations which I
>> think should make it automatically handle 8 and 16-bit values as
>> well. I'd rather do that than more special casing if everything works
>> out ok.
>
> I'm testing this patch with 16-bits and make sure whatever is needed to
> have 16-bit working.
>
>>
>> 3) I sent out a series of patches to enable pushing of UBOs in
>> Vulkan. If we're not careful, these will clash with 16bit storage as
>> UBO support suddenly has to imply push constant support. That said,
>> I"m willing to wait at least a little while before landing them to let
>> us get 16bit push constant support sorted out. The UBO pushing
>> patches give us a nice little performance boost but we're nowhere near
>> a release and I don't want it blocking you.
>
> That would be my next priority, so we would only have pending to land
> the 16-bit input/output support to finish this extension.
>
> Chema
>
>> On Wed, Nov 29, 2017 at 6:07 PM, Jose Maria Casanova Crespo
>> <jmcasanova at igalia.com <mailto:jmcasanova at igalia.com>> wrote:
>>
>> Hello,
>>
>> this is the V4 series for the implementation of the
>> SPV_KHR_16bit_storage
>> and VK_KHR_16bit_storage extensions on the anv vulkan driver, in
>> addition
>> to the GLSL and NIR support needed.
>>
>> The original series can be found here [1], the following v2 [2]
>> and v3 [3].
>>
>> In short V4 includes the following:
>>
>> * Reorder the series to enable features as they are implemented,
>> the series
>> now enables first UBO and SSBO support, and then inputs/outputs and
>> finally push constants.
>> * Support the byte scattered read/write messages with different
>> bit sizes
>> byte/word/dword.
>> * Refactor of the store_ssbo code and also fix stores when
>> writemask was .yz
>> * Uses the sampler for load_ubo avoiding the initial
>> implementation of
>> the series using byte_scattered_read.
>> * Addressed all the feedback provided by Jason and Topi on v3 review.
>>
>> This series is also available at:
>>
>> https://github.com/Igalia/mesa/tree/wip/VK_KHR_16bit_storage-rc4
>> <https://github.com/Igalia/mesa/tree/wip/VK_KHR_16bit_storage-rc4>
>>
>> The objective is to start landing part of this series, all
>> feedback has been
>> addressed for SSBO and UBO. But for input/outputs features it will
>> probably
>> need another iteration as was not completely reviewed. It is also
>> needed
>> to define the approach for push constants issues before of after
>> landing
>> the support with this implementation.
>>
>> Patches 1-5 and 8-17 have already been reviewed. Patch 7 was already
>> reviewed but as it has changed too much i would appreciate another
>> review. When patches until 25 or 28 are reviewed we could land
>> UBOs and
>> SSBOs support.
>>
>> Finally an updated overview of the patches:
>>
>> Patches 1-2 add 16-bit float, int and uint types to GLSL. This is
>> needed because NIR uses GLSL types internally. We use the enums
>> already defined at AMD_gpu_shader_half_float and NV_gpu_shader
>> extensions. Patch 2 updates mesa/st, in order to avoid warnings for
>> types not handled on a switch.
>>
>> Patches 3-6 add NIR support for those new GLSL 16-bit types,
>> conversion opcodes, and rounding modes for float to half-float
>> conversions.
>>
>> Patches 7-9 add the SPIR-V (SPV_KHR_16bit_storage) to NIR support.
>>
>> Patches 10-12 add general 16-bit support for i965. This includes
>> handling of new types on several general purpose methods,
>> update/remove some asserts.
>>
>> Patches 14-17 add support for 32 to 16-bit conversions for i965,
>> including rounding mode opcodes (needed for float to half-float
>> conversions), and an optimization that removes superfluous rounding
>> mode sets.
>>
>> Patches 18-21 add and use two new messages: byte scattered read and
>> write. Those were needed because untyped surface message has a fixed
>> 32-bit write size. Those messages are used on the 16-bit support of
>> store SSBO, load SSBO and load shared.
>>
>> Patch 22 adds helpers to allow un/shuffle 16-bit components in 32-bit
>> ones. This would be needed for following optimizations on load/store
>> ssbo. This huck was originally in the input/outputs support, but
>> needed
>> a relocation because of the new order of the series.
>>
>> Patch 23 Enables load_ubo support for 16-bit using the sampler
>> un-shuffling
>> pairs 16-bit components from 32-bit.
>>
>> Patches 24-25 enable SPV_KHR_16bit_storage and
>> VK_KHR_16bit_storage but only
>> the support for SSBO and UBO on anv vulkan driver.
>>
>> Patches 26-28 were new patches included in V2 to improve performance
>> reducing the use of multiple scattered messages for untyped read/write
>> opreations. 16bit CTS tests passes without them. The other one would
>> fix a real problem using predication (patch 27), but
>> unfourtunately no CTS
>> test yet catching it.
>>
>> Patches 29-33 implement 16-bit vertex attribute inputs support on
>> i965. These include changes on anv. This was needed because 16-bit
>> surface formats do implicit conversion to 32-bit. To workaround this,
>> we override the 16-bit surface format, and use 32-bit ones. Issues
>> related
>> to robust buffer access have been addressed.
>>
>> Patch 34 implements load input and load store for all intra stage.
>> This
>> patch could have problems pointed by Jason related to how TCS
>> outputs are
>> implmemented that need more work.
>>
>> Patch 35-42 implements 16-bit store output support for fragment
>> shaders on i965. Last patch enables VK_KHR_16bit for input/outputs.
>>
>> Patch 43-44 adds 16-bit support for push constant and enables the
>> feature.
>> There is still pending to work on a general solution for push
>> constants and
>> the mixture of different bit_sizes.
>>
>> [1]
>> https://lists.freedesktop.org/archives/mesa-dev/2017-July/162791.html
>> <https://lists.freedesktop.org/archives/mesa-dev/2017-July/162791.html>
>> [2]
>> https://lists.freedesktop.org/archives/mesa-dev/2017-August/167455.html
>> <https://lists.freedesktop.org/archives/mesa-dev/2017-August/167455.html>
>> [3]
>> https://lists.freedesktop.org/archives/mesa-dev/2017-October/172557.html
>> <https://lists.freedesktop.org/archives/mesa-dev/2017-October/172557.html>
>>
>> CC: Jason Ekstrand <jason at jlekstrand.net
>> <mailto:jason at jlekstrand.net>>
>> CC: Topi Pohjolainen <topi.pohjolainen at gmail.com
>> <mailto:topi.pohjolainen at gmail.com>>
>> CC: Matt Turner <mattst88 at gmail.com <mailto:mattst88 at gmail.com>>
>>
>> Alejandro Piñeiro (12):
>> i965/vec4: Handle 16-bit types at type_size_xvec4
>> i965/fs: Remove BRW_REGISTER_TYPE_HF assert at get_exec_type
>> i965/fs: Handle 32-bit to 16-bit conversions
>> i965/fs: Define new shader opcode to set rounding modes
>> i965/fs: Enable rounding mode on f2f16 ops
>> i965/fs: Add remove_extra_rounding_modes optimization
>> i965/fs: Use byte_scattered_write on 16-bit store_ssbo
>> anv: Enable VK_KHR_16bit_storage for SSBO and UBO
>> i965/fs: Predicate byte scattered writes if needed
>> anv/pipeline: Use 32-bit surface formats for 16-bit formats
>> anv/cmd_buffer: Add a padding to the vertex buffer
>> i965/fs: Use half_precision data_format on 16-bit fb writes
>>
>> Eduardo Lima Mitev (8):
>> glsl: Add 16-bit types
>> mesa/st: Handle 16-bit types at st_glsl_storage_type_size()
>> nir: Add support for 16-bit types (half float, int16 and uint16)
>> nir: Populate conversion opcodes to 16-bit types
>> spirv/nir: Handle 16-bit types
>> spirv/nir: Add support for SPV_KHR_16bit_storage
>> anv: Enable SPV_KHR_16bit_storage on gen 8+
>> i965/fs: Optimize 16-bit SSBO stores by packing two into a
>> 32-bit reg
>>
>> Jose Maria Casanova Crespo (24):
>> nir: Add rounding modes enum
>> nir: Handle fp16 rounding modes at nir_type_conversion_op
>> spirv: Enable FPRoundingMode decorator to nir operations
>> i965: Support for 16-bit base types in helper functions
>> i965: Add support for control register
>> i965/fs: Add byte scattered write message and fs support
>> i965/fs: Add byte scattered read message and fs support
>> i965/fs: Use byte scattered read for 16-bit load_ssbo
>> i965/fs: Helpers for un/shuffle 16-bit pairs in 32-bit components
>> i965/fs: Enables 16-bit load_ubo with sampler
>> i965/fs: Use untyped_surface_read for 16-bit load_ssbo
>> compiler: Mark when input/ouput attribute at VS uses 16-bit
>> i965/compiler: includes 16-bit vertex input
>> i965/fs: Unpack 16-bit from 32-bit components in VS load_input
>> i965/fs: Support 16-bit types at load_input and store_output
>> i965/fs: Enable Render Target Write for 16-bit outputs
>> anv: Enable VK_KHR_16bit_storage for input/output
>> i965/fs: Include support for SEND data_format bit for Render Targets
>> i965/disasm: Show half-precision data_format on rt_writes
>> i965/fs: Mark 16-bit outputs on FS store_output
>> i965/fs: 16-bit source payloads always use 1 register
>> i965/fs: Enable 16-bit render target write on SKL and CHV
>> i965/fs: Support push constants of 16-bit types
>> anv: Enable VK_KHR_16bit_storage for push_constant
>>
>> src/compiler/builtin_type_macros.h | 26 ++
>> src/compiler/glsl/ast_to_hir.cpp | 3 +
>> src/compiler/glsl/glsl_to_nir.cpp | 9 +-
>> src/compiler/glsl/ir_clone.cpp | 3 +
>> src/compiler/glsl/link_uniform_initializers.cpp | 3 +
>> src/compiler/glsl/lower_buffer_access.cpp | 3 +-
>> src/compiler/glsl_types.cpp | 93 +++++-
>> src/compiler/glsl_types.h | 25 +-
>> src/compiler/nir/nir.c | 6 +
>> src/compiler/nir/nir.h | 22 +-
>> src/compiler/nir/nir_gather_info.c | 17 +-
>> src/compiler/nir/nir_opcodes.py | 11 +-
>> src/compiler/nir/nir_opcodes_c.py | 17 +-
>> src/compiler/nir/nir_split_var_copies.c | 6 +
>> src/compiler/nir_types.cpp | 18 ++
>> src/compiler/nir_types.h | 8 +
>> src/compiler/shader_info.h | 2 +
>> src/compiler/spirv/nir_spirv.h | 1 +
>> src/compiler/spirv/spirv_to_nir.c | 118 ++++++-
>> src/compiler/spirv/vtn_alu.c | 35 +-
>> src/compiler/spirv/vtn_variables.c | 21 ++
>> src/intel/compiler/brw_compiler.h | 1 +
>> src/intel/compiler/brw_disasm.c | 4 +
>> src/intel/compiler/brw_eu.h | 25 +-
>> src/intel/compiler/brw_eu_defines.h | 36 +++
>> src/intel/compiler/brw_eu_emit.c | 130 +++++++-
>> src/intel/compiler/brw_fs.cpp | 142 +++++++-
>> src/intel/compiler/brw_fs.h | 12 +
>> src/intel/compiler/brw_fs_copy_propagation.cpp | 8 +-
>> src/intel/compiler/brw_fs_generator.cpp | 30 +-
>> src/intel/compiler/brw_fs_nir.cpp | 409
>> ++++++++++++++++++++++--
>> src/intel/compiler/brw_fs_surface_builder.cpp | 23 +-
>> src/intel/compiler/brw_fs_surface_builder.h | 14 +
>> src/intel/compiler/brw_fs_visitor.cpp | 6 +
>> src/intel/compiler/brw_inst.h | 1 +
>> src/intel/compiler/brw_ir_fs.h | 3 -
>> src/intel/compiler/brw_nir.c | 16 +
>> src/intel/compiler/brw_reg.h | 6 +
>> src/intel/compiler/brw_shader.cpp | 23 ++
>> src/intel/compiler/brw_shader.h | 7 +
>> src/intel/compiler/brw_vec4.cpp | 1 +
>> src/intel/compiler/brw_vec4_generator.cpp | 3 +-
>> src/intel/compiler/brw_vec4_visitor.cpp | 3 +
>> src/intel/vulkan/anv_device.c | 23 ++
>> src/intel/vulkan/anv_extensions.py | 1 +
>> src/intel/vulkan/anv_pipeline.c | 1 +
>> src/intel/vulkan/genX_cmd_buffer.c | 20 +-
>> src/intel/vulkan/genX_pipeline.c | 34 ++
>> src/mesa/program/ir_to_mesa.cpp | 6 +
>> src/mesa/state_tracker/st_glsl_types.cpp | 3 +
>> 50 files changed, 1337 insertions(+), 101 deletions(-)
>>
>> --
>> 2.14.3
>>
>>
>>
>>
>> _______________________________________________
>> mesa-dev mailing list
>> mesa-dev at lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
> _______________________________________________
> mesa-dev mailing list
> mesa-dev at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
More information about the mesa-dev
mailing list