[Mesa-dev] intel: WIP: Support for using 16-bits for mediump
Topi Pohjolainen
topi.pohjolainen at gmail.com
Tue Nov 6 06:30:08 UTC 2018
Here is version 2 of the series adding support for 16-bit float
instructions in the shader compiler. Unlike the first version, which did
all the analysis at the GLSL level, this one adds the notion of precision
to NIR variables and does the analysis and precision lowering at the NIR
level.
This lives in gitlab.freedesktop.org:tpohjola/mesa, branch fp16.
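As a rough illustration of the idea only (this is not the code in the
branch), the propagation of precision from variables to the instructions
that consume them can be sketched on a toy IR. Everything here is
hypothetical: the toy_value struct, the toy_op enum and
toy_lower_precision() are made up for this sketch and are not NIR types or
passes. The sketch also ignores the harder cases the series deals with
(booleans, phis, texturing, conversions on mixed-precision sources).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy IR, not NIR: each value has an opcode, up to two
 * sources (indices of earlier values, -1 if unused) and a bit size. */
enum toy_op { TOY_LOAD_VAR, TOY_ALU };

struct toy_value {
   enum toy_op op;
   int src[2];
   bool mediump;      /* only meaningful for TOY_LOAD_VAR */
   unsigned bit_size; /* 32 on entry, 16 or 32 after lowering */
};

/* Walk values in definition order. A load of a mediump variable becomes
 * 16-bit; an ALU op becomes 16-bit only if it has at least one source and
 * every source it uses is already 16-bit. Returns the number of values
 * that were lowered. */
static unsigned
toy_lower_precision(struct toy_value *vals, size_t count)
{
   unsigned lowered = 0;

   for (size_t i = 0; i < count; i++) {
      bool can_lower;

      if (vals[i].op == TOY_LOAD_VAR) {
         can_lower = vals[i].mediump;
      } else {
         bool has_src = false;
         bool all_16 = true;

         for (int s = 0; s < 2; s++) {
            const int src = vals[i].src[s];
            if (src < 0)
               continue;
            assert((size_t)src < i); /* sources must be defined earlier */
            has_src = true;
            if (vals[src].bit_size != 16)
               all_16 = false;
         }
         can_lower = has_src && all_16;
      }

      if (can_lower) {
         vals[i].bit_size = 16;
         lowered++;
      }
   }

   return lowered;
}
```

In this sketch an ALU op fed by one mediump load and one full-precision
load stays at 32 bits; a real pass would have to decide whether to insert
a conversion on the wider source instead.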
This is now mature enough to use 16-bit precision for all instructions,
apart from a few special cases, in gfxbench trex and alu2.
(Unfortunately I'm not seeing any performance benefit. This is not
that surprising: I got to the same point with the GLSL-based solution
and was already able to measure the performance back then.)
Hence I thought it was time to share it.
As this is still work in progress, I didn't want to flood the list
with the full set of patches. Instead I included only the very last one,
where I try to outline the logic and its current shortcomings. There is
also a short list of TODO items.
In addition to those, I need to examine a couple of Intel-specific
misrenderings. I haven't dug that deep yet, but it looks like I'm missing
something with 16-bit inot and mad/mac lowered interpolation.
Unfortunately I get corrupted rendering only on hardware, while the
simulator is happy.
My main concern is how to test all of this properly. I haven't written
any unit tests yet, although that is high on my list; I held off mostly
because I've been uncertain about my design choices. So far I've used
shader runner tests that I've written for specific cases. These are
useful for development but don't bring much value for regression testing.
Alejandro Piñeiro (1):
intel/compiler/fs: Use half_precision data_format on 16-bit fb writes
Jose Maria Casanova Crespo (2):
intel/compiler/fs: Include support for RT data_format bit
intel/compiler/disasm: Show half-precision data_format on rt_writes
Topi Pohjolainen (58):
intel/compiler/fs: Set 16-bit sampler return format
intel/compiler/disasm: Show half-precision for sampler messages
intel/compiler/fs: Skip tex-inst early in conversion lowering
intel/compiler/fs: Support for dumping 16-bit IMM values
intel/compiler: Allow 16-bit math
intel/compiler/fs: Add helpers for 16-bit null regs
intel/compiler/fs: Use two SIMD8 instructions for 16-bit math
intel/compiler/fs: Use 16-bit null dest with 16-bit math
intel/compiler/fs: Use 16-bit null dest with 16-bit compare
intel/compiler/fs: Add 16-bit type support for nir_if
intel/compiler/eu: Prepare 3-src-op for 16-bit sources
intel/compiler/eu: Prepare 3-src-op for 16-bit dst
intel/compiler/eu: Allow 3-src-op with mixed precision (HF/F) sources
intel/compiler/disasm: Print mixed precision 3-src types correctly
intel/compiler/disasm: Print 16-bit IMM values
intel/compiler/fs: Support for combining 16-bit immediates
intel/compiler/fs: Set tex type for generator to flag fp16
intel/compiler/fs: Use component_size() instead of open coded
intel/compiler/fs: Add register padding support
intel/compiler/fs: Pad 16-bit texture return payloads
intel/compiler/fs: Pad 16-bit output (store/fb write) payloads
intel/compiler/fs: Pad 16-bit nir vec* components into full reg
intel/compiler/fs: Pad 16-bit nir intrinsic dest into full reg
intel/compiler/fs: Pad 16-bit const loads into full regs
intel/compiler/fs: Pad 16-bit load payload lowering
nir: Lower also 16-bit lrp() if needed
intel/compiler: Lower 16-bit lrp()
nir: Recognize f232(f216(x)) as x
nir: Recognize f216(f232(x)) as x
nir: Store variable precision when translating from glsl
glsl: Set default precision for builtin variables
i965: Prepare uniform mapping for 16-bit values
i965: Support for uploading 16-bit uniforms from 32-bit store
intel/compiler/fs: WIP: Use 32-bit slots for 16-bit uniforms
intel/compiler: Tell compiler if lower precision is supported
nir: Add lowering pass for variables marked mediump
nir: Add pass for deref precision lowering
nir: Add pass for alu precision lowering
nir: Add precision conversion for load/store_deref
nir: Add precision conversion for sources of texturing ops
nir: Don't set destination size 16 for booleans
nir: Add precision lowering for texture samples
nir: Add support for non-fixed precision
nir: Don't try to alter precision of boolean sources
nir: Add support for variable sized booleans
nir: Add support for lowering phi precision
intel/compiler/fs: Prepare alu dest type for 16-bit booleans
nir: Add lowering pass setting 16-bit boolean destinations
nir: Add lowering pass turning b2f(i2i32(x)) into b2f(x)
nir: Adjust integer precision for alus operating with 16-bit srcs
nir: Replace b2f(x) with b2f(i2i32(x)) for 16-bit x
nir: Adjust precision for discard_if
nir: Allow input varyings to be converted to lower precision
nir: Replace 16-bit src[0] for bcsel i2i32(src[0])
nir: Replace 16-bit nir_if condition with i2i32(condition)
Revert "intel/compiler: fix 16-bit comparisons"
intel/compiler: Hook in precision lowering pass
nir: Document precision lowering pass
src/compiler/Makefile.sources | 2 +
src/compiler/glsl/glsl_symbol_table.cpp | 20 +
src/compiler/glsl/glsl_symbol_table.h | 7 +
src/compiler/glsl/glsl_to_nir.cpp | 1 +
src/compiler/nir/meson.build | 2 +
src/compiler/nir/nir.h | 18 +
src/compiler/nir/nir_lower_bool_size.c | 120 +++
src/compiler/nir/nir_lower_precision.cpp | 820 ++++++++++++++++++
src/compiler/nir/nir_opt_algebraic.py | 5 +
src/intel/blorp/blorp.c | 4 +-
src/intel/compiler/brw_compiler.c | 1 +
src/intel/compiler/brw_disasm.c | 28 +-
src/intel/compiler/brw_eu.h | 3 +-
src/intel/compiler/brw_eu_emit.c | 83 +-
src/intel/compiler/brw_fs.cpp | 68 +-
src/intel/compiler/brw_fs.h | 4 +-
src/intel/compiler/brw_fs_builder.h | 37 +-
.../compiler/brw_fs_combine_constants.cpp | 84 +-
.../compiler/brw_fs_copy_propagation.cpp | 7 +-
src/intel/compiler/brw_fs_generator.cpp | 13 +-
.../compiler/brw_fs_lower_conversions.cpp | 42 +
src/intel/compiler/brw_fs_nir.cpp | 197 +++--
src/intel/compiler/brw_fs_surface_builder.cpp | 3 +-
src/intel/compiler/brw_fs_visitor.cpp | 6 +
src/intel/compiler/brw_inst.h | 5 +
src/intel/compiler/brw_ir_fs.h | 16 +
src/intel/compiler/brw_nir.c | 22 +-
src/intel/compiler/brw_nir.h | 4 +-
src/intel/compiler/brw_reg_type.c | 2 +
src/intel/compiler/brw_shader.h | 7 +
src/intel/vulkan/anv_pipeline.c | 2 +-
.../drivers/dri/i965/brw_nir_uniforms.cpp | 8 +-
src/mesa/drivers/dri/i965/brw_program.c | 10 +-
src/mesa/drivers/dri/i965/brw_program.h | 6 +-
src/mesa/drivers/dri/i965/brw_tcs.c | 2 +-
.../drivers/dri/i965/gen6_constant_state.c | 14 +-
36 files changed, 1548 insertions(+), 125 deletions(-)
create mode 100644 src/compiler/nir/nir_lower_bool_size.c
create mode 100644 src/compiler/nir/nir_lower_precision.cpp
--
2.17.1