[Mesa-dev] [PATCH 00/59] Initial arb_gpu_shader_fp64 support to the i965 scalar backend
Samuel Iglesias Gonsálvez
siglesias at igalia.com
Fri Apr 29 11:28:57 UTC 2016
Hello,
This patch series continues adding arb_gpu_shader_fp64 support to the
Intel driver. Specifically, this targets the i965 scalar backend for
BDW+ hardware (vec4 is still under research and gen7 has its own
issues, which we intend to tackle after gen8).
This adds most of the fp64 scalar implementation. It starts by enabling
the various lowering passes in NIR for doubles and then adds all the
infrastructure required in the backend to operate with 64-bit floating
point data.
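As a rough illustration of what the pack/unpack double lowering in the
patch list deals with, here is a minimal CPU-side sketch in C. It is not
code from this series: the names uvec2, unpack_double_2x32 and
pack_double_2x32 are made up for the example, and it only mirrors the
component order of GLSL's unpackDouble2x32()/packDouble2x32(), where the
first component holds the 32 least significant bits.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a GLSL uvec2: x is the low 32 bits of the
 * double, y is the high 32 bits. */
struct uvec2 {
    uint32_t x;
    uint32_t y;
};

/* Split a 64-bit double into two 32-bit words. memcpy is used to
 * reinterpret the bits without violating strict aliasing. */
static struct uvec2 unpack_double_2x32(double d)
{
    uint64_t bits;
    struct uvec2 v;

    memcpy(&bits, &d, sizeof bits);
    v.x = (uint32_t)bits;          /* least significant half */
    v.y = (uint32_t)(bits >> 32);  /* most significant half */
    return v;
}

/* Rebuild the double from its two 32-bit words. */
static double pack_double_2x32(struct uvec2 v)
{
    uint64_t bits = ((uint64_t)v.y << 32) | v.x;
    double d;

    memcpy(&d, &bits, sizeof d);
    return d;
}
```

Packing the result of an unpack gives back the original bit pattern,
which is why these operations can be treated as plain data movement
between 32-bit and 64-bit views of the same value.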
For reference, this series fixes 1009 fp64 piglit tests on BDW. Fp64
totals look like this:
pass: 2523
fail: 46
crash: 447
skip: 16
total: 3032
There are a few missing things in this series to achieve a perfect fp64
pass rate:
1. Fixes to copy propagation. The fp64 code creates new code patterns
that copy propagation isn't really ready to handle yet, leading to
incorrect results in some cases. We have 9 patches to fix copy
propagation for fp64 that we intend to send separately after the
main fp64 infrastructure has been reviewed.
2. ubo/ssbo/shared-variables. We will also send the patches for this in
a separate series after this one.
3. A fix for the SIMD lowering pass to properly handle execmasking when
transposing the results of split instructions back together. We have
a local fix for this, but Curro hit the same problem while working
on SIMD32 and has a better solution for it so we intend to use his
solution when it is ready.
4. Spilling. We don't support spilling of DF registers yet and some
piglit tests need this to compile. Jason had plans to work on the
spilling code and address the needs of fp64 along the way.
The series does not introduce any regressions in piglit on ILK, SNB,
HSW, BDW and SKL.
A branch with this series is available for testing here:
$ git clone -b i965-fp64-scalar-backend-part-1 https://github.com/Igalia/mesa.git
You will have to enable the extension with:
$ export MESA_EXTENSION_OVERRIDE=GL_ARB_gpu_shader_fp64
The full scalar fp64 implementation, which also contains the fixes to
copy propagation, the ubo/ssbo support and our local fix for the SIMD
lowering pass, is available here:
$ git clone -b i965-fp64 https://github.com/Igalia/mesa.git
And for the adventurous, there is also a work-in-progress branch that
adds scalar support for HSW here:
$ git clone -b i965-fp64-gen7 https://github.com/Igalia/mesa.git
Thanks,
Sam
Connor Abbott (33):
i965: use double lowering pass
i965: use pack/unpackDouble lowering
i965/disasm: fix disasm of 3-src doubles
i965/eu: allow doubles in math instructions
i965: add brw_imm_df
i965: add support for getting/setting DF immediates
i965: add support for disassembling DF immediates
i965/eu: add support for DF immediates
i965: fix brw_negate_immediate() for doubles
i965: fix is_zero(), is_one() and is_negative_one() for doubles
i965: fixup uniform setup for doubles
i965/fs: print writemask_all when it's enabled
i965/fs: use the NIR bit size when creating registers
i965/fs: don't propagate 64-bit immediates
i965/fs: add support for printing double immediates
i965/fs: always pass the bitsize to brw_type_for_nir_type()
i965/fs: add a stride helper
i965/fs: add PACK opcode
i965/fs: add a pass for lowering PACK opcodes
i965/fs/nir: translate double pack/unpack
i965/fs: fix type_size() for doubles
i965/fs: handle uniforms in byte_offset()
i965/fs: use byte_offset() in offset() for uniforms
i965/fs: fix assign_constant_locations() for doubles
i965/fs: generalize SIMD16 interference workaround
i965/fs: extend exec_size halving in the generator
i965/fs: fix compares for doubles
i965/fs: fix regs_read() for uniforms
i965/fs: fix is_copy_payload() for doubles
i965/fs: fix regs_written in LOAD_PAYLOAD for doubles
i965/fs: fix dst width calculation in CSE
i965/fs: add a pass for legalizing d2f
i965/fs: add support for f2d and d2f
Iago Toral Quiroga (15):
i965: fix brw_saturate_immediate() for doubles
i965: fix brw_abs_immediate() for doubles
i965: two-argument instructions can only use 32-bit immediates
i965/fs: optimize pack double
i965/fs: optimize unpack double
i965/fs: handle fp64 opcodes in brw_do_channel_expressions
i965/fs: We only support 32-bit integer ALU operations for now
i965/fs: add null_reg_df
i965/fs: implement fsign() for doubles
i965/fs: implement d2b
i965/fs: implement d2i and d2u
i965/fs: implement i2d and u2d
i965/fs: rename our lower_d2f pass to lower_d2x
i965/fs/lower_simd_width: Fix registers written for split instructions
i965/fs: recognize writes with a subreg_offset > 0 as partial
Samuel Iglesias Gonsálvez (7):
i965: enable lrp lowering for doubles
vc4: lower lrp when operating with double operands
freedreno/ir3: lower lrp when operating with double operands
i965/fs: align access to double-based uniforms in push constant buffer
i965/fs: demote_pull_constants() did not take into account double types
i965/fs: take into account doubles when calculating read_size for MOV_INDIRECT
i965/fs: fix MOV_INDIRECT exec_size for doubles
Topi Pohjolainen (4):
i965: Lower DFRACEXP/DLDEXP
i965: Determine size of double precision float register
i965: Tell backend register about double precision type
i965/eu: Allow 3-src float ops with doubles
src/gallium/drivers/freedreno/ir3/ir3_nir.c | 1 +
src/gallium/drivers/vc4/vc4_program.c | 1 +
src/mesa/drivers/dri/i965/Makefile.sources | 2 +
src/mesa/drivers/dri/i965/brw_compiler.c | 2 +
src/mesa/drivers/dri/i965/brw_compiler.h | 8 +
src/mesa/drivers/dri/i965/brw_defines.h | 9 +
src/mesa/drivers/dri/i965/brw_disasm.c | 3 +-
src/mesa/drivers/dri/i965/brw_eu_emit.c | 60 +++--
src/mesa/drivers/dri/i965/brw_fs.cpp | 106 ++++++--
src/mesa/drivers/dri/i965/brw_fs.h | 6 +-
src/mesa/drivers/dri/i965/brw_fs_builder.h | 15 +-
.../dri/i965/brw_fs_channel_expressions.cpp | 23 +-
.../drivers/dri/i965/brw_fs_copy_propagation.cpp | 3 +
src/mesa/drivers/dri/i965/brw_fs_cse.cpp | 3 +-
src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 16 +-
src/mesa/drivers/dri/i965/brw_fs_lower_d2x.cpp | 75 ++++++
src/mesa/drivers/dri/i965/brw_fs_lower_pack.cpp | 59 +++++
src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 287 ++++++++++++++++++---
src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp | 67 +++--
src/mesa/drivers/dri/i965/brw_inst.h | 25 ++
src/mesa/drivers/dri/i965/brw_ir_fs.h | 14 +-
src/mesa/drivers/dri/i965/brw_link.cpp | 1 +
src/mesa/drivers/dri/i965/brw_nir.c | 10 +
src/mesa/drivers/dri/i965/brw_nir_uniforms.cpp | 7 +-
src/mesa/drivers/dri/i965/brw_program.c | 1 +
src/mesa/drivers/dri/i965/brw_reg.h | 10 +
src/mesa/drivers/dri/i965/brw_shader.cpp | 73 ++++--
src/mesa/drivers/dri/i965/brw_shader.h | 1 +
src/mesa/drivers/dri/i965/brw_wm.c | 2 +
src/mesa/drivers/dri/i965/gen6_constant_state.c | 12 +-
30 files changed, 773 insertions(+), 129 deletions(-)
create mode 100644 src/mesa/drivers/dri/i965/brw_fs_lower_d2x.cpp
create mode 100644 src/mesa/drivers/dri/i965/brw_fs_lower_pack.cpp
--
2.5.0