[Mesa-dev] [PATCH 00/31] nir: Use a 1-bit data type for booleans
Jason Ekstrand
jason at jlekstrand.net
Fri Nov 2 13:42:29 UTC 2018
Bump
On Mon, Oct 22, 2018 at 5:14 PM Jason Ekstrand <jason at jlekstrand.net> wrote:
> This is something that Connor and I have talked about quite a bit over the
> last couple of months. The core idea is to replace NIR's current 32-bit
> 0/-1 D3D10-style booleans with a 1-bit data type. All in all, I think it
> worked out pretty well though I really don't like the proliferation of
> 32-bit comparison opcodes we now have kicking around for i965.
>
> Why? No hardware really has a 1-bit type, right? Well, sort of... AMD
> actually uses 64-bit scalars for booleans with one bit per invocation.
> However, most hardware such as Intel uses some other larger value for
> booleans. The real benefit of 1-bit booleans and requiring a lowering pass
> is that you can do somewhat custom lowering (like AMD wants) and your
> lowering pass can always tell in an instant if a value is a boolean based
> on the bit size. As can be seen in the last patch, this makes it really
> easy to implement a bool -> float lowering pass for hardware that doesn't
> have real integers where NIR's current booleans are actually rather
> painful.
>
> On Intel, the situation is a bit more complicated. It's tempting to say
> that we have 32-bit D3D10 booleans. However, they're not really D3D10
> booleans on gen4-5 because the top 31 bits are undefined garbage and, while
> iand, ior, ixor, and inot operations work, you have to iand with 1 at the
> last minute to clear off all that garbage. Also, on all generations, a
> comparison of two N-bit values results in an N-bit boolean, not a 32-bit
> bool. This has caused the Igalia folks no end of trouble as they've been
> working on native 8 and 16-bit support. If, instead, we have a 1-bit bool
> with a lowering pass and we can lower to whatever we want, then we could
> lower to a set of comparison opcodes that return the same bit-size as they
> compare and it would match GEN hardware much better.
>
> But what about performance? Aren't there all sorts of neat tricks we can
> do with D3D10 booleans like b & 1.0f for b2f? As it turns out, not really;
> that's about the only one. There is some small advantage when optimizing
> shaders that come from D3D if your native representation of booleans
> matches that of D3D. However, penultimate patch in this series adds a few
> small optimizations that get us to actually better than we were before.
> With the entire series, shader-db on Kaby Lak looks like this:
>
> total instructions in shared programs: 15084098 -> 14988578 (-0.63%)
> instructions in affected programs: 1321114 -> 1225594 (-7.23%)
> helped: 2340
> HURT: 23
>
> total cycles in shared programs: 369790134 -> 359798399 (-2.70%)
> cycles in affected programs: 134085452 -> 124093717 (-7.45%)
> helped: 2149
> HURT: 720
>
> total loops in shared programs: 4393 -> 4393 (0.00%)
> loops in affected programs: 0 -> 0
> helped: 0
> HURT: 0
>
> total spills in shared programs: 10158 -> 10051 (-1.05%)
> spills in affected programs: 1429 -> 1322 (-7.49%)
> helped: 8
> HURT: 15
>
> total fills in shared programs: 22105 -> 21720 (-1.74%)
> fills in affected programs: 2853 -> 2468 (-13.49%)
> helped: 12
> HURT: 15
>
> How about ease of use? Are they a pain to deal with? Yes, adding support
> for 1-bit types was a bit awkward in a few places but most of it was
> dealing with all the places where we have 32-bit booleans baked into
> assumptions. Getting rid of that baking in solves the problem and also
> makes the whole IR more future-proof.
>
> All in all, I'd say I'm pretty happy with it. However, I'd like other
> people (particularly the AMD folks) to play with it a bit and verify that
> it solves their problems as well. Also, I added a lowering pass and tried
> to turn it on in everyone's driver but may not have put it in the right
> spot. Please double-check my work. For those wishing to take a look, you
> can also find the entire series on my gitlab here:
>
> https://gitlab.freedesktop.org/jekstrand/mesa/commits/review/nir-1-bit-bool
>
> Please review!
>
> --Jason
>
> Cc: Connor Abbott <cwabbott0 at gmail.com>
> Cc: Timothy Arceri <tarceri at itsqueeze.com>
> Cc: Eric Anholt <eric at anholt.net>
> Cc: Rob Clark <robdclark at gmail.com>
> Cc: Karol Herbst <karolherbst at gmail.com>
> Cc: Bas Nieuwenhuizen <bas at basnieuwenhuizen.nl>
> Cc: Alyssa Rosenzweig <alyssa at rosenzweig.io>
>
>
> Jason Ekstrand (31):
> nir/validate: Print when the validation failed
> nir/constant_folding: Add an unreachable to a switch
> nir/constant_folding: Use nir_src_as_bool for discard_if
> nir/builder: Add a nir_imm_true/false helpers
> nir/builder: Handle 16-bit floats in nir_imm_floatN_t
> nir/search: Use nir_builder
> nir/opt_if: Rework condition propagation
> nir/system_values: Use the bit size from the load_deref
> glsl/nir: Use i2b instead of ine for fixing UBO/SSBO booleans
> nir/prog: Use nir_bany in kill handling
> spirv: Use the right bit-size for spec constant ops
> spirv: Initialize subgroup destinations with the destination type
> intel/nir: Use the OPT macro for more passes
> nir/algebraic: Disable b2f lowering and two optimizations
> nir: Rename boolean-related opcodes to include 32 in the name
> FIXUP: nir/builder: Generate 32-bit bool opcodes transparently
> FIXUP: nir/algebraic: Remap boolean opcodes to the 32-bit variant
> FIXUP: Use 32-bit opcodes in the NIR producers and optimizations
> FIXUP: Use 32-bit opcodes in the NIR back-ends
> nir/constant_expressions: Rework boolean handling
> nir: Add support for 1-bit data types
> nir: Add 1-bit boolean opcodes
> nir: Add a bool to int32 lowering pass
> glsl,spirv: Generate 1-bit booleans
> FIXUP: Revert "Use 32-bit opcodes in the NIR producers and
> optimizations"
> FIXUP: Revert "nir/builder: Generate 32-bit bool opcodes
> transparently"
> FIXUP: nir/builder: Generate 1-bit booleans in nir_build_imm_bool
> nir/algebraic: Optimize 1-bit booleans
> nir/algebraic: Add back lower_b2f and a b2f optimization
> nir/algebraic: Add some optimizations for D3D-style booleans
> nir: Add a bool to float32 lowering pass
>
> src/amd/common/ac_nir_to_llvm.c | 30 +--
> src/amd/vulkan/radv_shader.c | 6 +-
> src/broadcom/compiler/nir_to_vir.c | 44 ++---
> src/broadcom/compiler/vir.c | 2 +
> src/compiler/Makefile.sources | 2 +
> src/compiler/glsl/glsl_to_nir.cpp | 28 +--
> src/compiler/nir/meson.build | 2 +
> src/compiler/nir/nir.c | 15 +-
> src/compiler/nir/nir.h | 36 ++--
> src/compiler/nir/nir_algebraic.py | 22 ++-
> src/compiler/nir/nir_builder.h | 51 ++++-
> src/compiler/nir/nir_constant_expressions.py | 53 ++---
> src/compiler/nir/nir_instr_set.c | 23 ++-
> src/compiler/nir/nir_lower_alu_to_scalar.c | 4 +
> src/compiler/nir/nir_lower_bool_to_float.c | 181 ++++++++++++++++++
> src/compiler/nir/nir_lower_bool_to_int32.c | 162 ++++++++++++++++
> src/compiler/nir/nir_lower_int64.c | 2 +-
> .../nir/nir_lower_load_const_to_scalar.c | 3 +
> src/compiler/nir/nir_lower_returns.c | 4 +-
> src/compiler/nir/nir_lower_subgroups.c | 2 +-
> src/compiler/nir/nir_lower_system_values.c | 1 +
> src/compiler/nir/nir_opcodes.py | 31 ++-
> src/compiler/nir/nir_opt_algebraic.py | 19 +-
> src/compiler/nir/nir_opt_constant_folding.c | 18 +-
> src/compiler/nir/nir_opt_if.c | 91 +++------
> src/compiler/nir/nir_opt_intrinsics.c | 2 +-
> src/compiler/nir/nir_opt_large_constants.c | 5 +
> src/compiler/nir/nir_print.c | 3 +
> src/compiler/nir/nir_search.c | 112 +++--------
> src/compiler/nir/nir_search.h | 9 +-
> src/compiler/nir/nir_validate.c | 16 +-
> src/compiler/nir/tests/vars_tests.cpp | 38 ++--
> src/compiler/nir_types.cpp | 2 +-
> src/compiler/nir_types.h | 4 +-
> src/compiler/spirv/spirv_to_nir.c | 23 ++-
> src/compiler/spirv/vtn_alu.c | 4 +-
> src/compiler/spirv/vtn_cfg.c | 10 +-
> src/compiler/spirv/vtn_subgroup.c | 12 +-
> src/gallium/auxiliary/nir/tgsi_to_nir.c | 20 +-
> .../drivers/freedreno/ir3/ir3_compiler_nir.c | 31 +--
> src/gallium/drivers/radeonsi/si_shader_nir.c | 2 +
> src/gallium/drivers/vc4/vc4_program.c | 50 ++---
> src/intel/compiler/brw_fs_nir.cpp | 80 ++++----
> src/intel/compiler/brw_nir.c | 16 +-
> .../brw_nir_analyze_boolean_resolves.c | 24 +--
> .../compiler/brw_nir_lower_image_load_store.c | 2 +-
> src/intel/compiler/brw_vec4_nir.cpp | 122 ++++++------
> src/intel/vulkan/anv_pipeline.c | 2 +-
> src/mesa/drivers/dri/i965/brw_program.c | 4 +-
> src/mesa/main/glspirv.c | 2 +-
> src/mesa/program/prog_to_nir.c | 2 +-
> src/mesa/state_tracker/st_glsl_to_nir.cpp | 2 +-
> 52 files changed, 934 insertions(+), 497 deletions(-)
> create mode 100644 src/compiler/nir/nir_lower_bool_to_float.c
> create mode 100644 src/compiler/nir/nir_lower_bool_to_int32.c
>
> --
> 2.19.1
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.freedesktop.org/archives/mesa-dev/attachments/20181102/eb919aad/attachment-0001.html>
More information about the mesa-dev
mailing list