[Mesa-dev] [PATCH 00/59] Initial arb_gpu_shader_fp64 support to the i965 scalar backend
Francisco Jerez
currojerez at riseup.net
Mon May 16 17:28:28 UTC 2016
Samuel Iglesias Gonsálvez <siglesias at igalia.com> writes:
> On 30/04/16 09:52, Francisco Jerez wrote:
>> Samuel Iglesias Gonsálvez <siglesias at igalia.com> writes:
>>
>>> Hello,
>>>
>>> This patch series continues adding arb_gpu_shader_fp64 support to the
>>> Intel driver. Specifically, this targets the i965 scalar backend for
>>> BDW+ hardware (vec4 is still under research and gen7 has its own
>>> issues which we intend to tackle after gen8).
>>>
>>> This adds most of the fp64 scalar implementation. It starts by enabling
>>> the various lowering passes in NIR for doubles and then adds all the
>>> infrastructure required in the backend to operate with 64-bit floating
>>> point data.
>>>
>>> For reference, this series fixes 1009 fp64 piglit tests in BDW. Fp64
>>> totals look like this:
>>>
>>> pass: 2523
>>> fail: 46
>>> crash: 447
>>> skip: 16
>>> total: 3032
>>>
>>> There are a few missing things in this series to achieve a perfect fp64
>>> pass rate:
>>>
>>> 1. Fixes to copy propagation. The fp64 code creates new code patterns
>>> that copy propagation isn't ready to handle yet, leading to
>>> incorrect results in some cases. We have 9 patches to fix copy
>>> propagation for fp64 that we intend to send separately after the
>>> main fp64 infrastructure has been reviewed.
>>>
>>> 2. ubo/ssbo/shared-variables. We will also send the patches for this in
>>> a separate series after this one.
>>>
>>> 3. A fix for the SIMD lowering pass to properly handle execmasking when
>>> transposing the results of split instructions back together. We have
>>> a local fix for this, but Curro hit the same problem while working
>>> on SIMD32 and has a better solution for it so we intend to use his
>>> solution when it is ready.
>>>
>>> 4. Spilling. We don't support spilling of DF registers yet and some
>>> piglit tests need this to compile. Jason had plans to work on the
>>> spilling code and address the needs of fp64 along the way.
>>>
>> I wonder if this branch here helps? It does with SIMD32 at least but
>> it's only lightly tested on master, YMMV.
>>
>> https://cgit.freedesktop.org/~currojerez/mesa/log/?h=i965-spilling-fixes
>>
>
> I have tested your branch. It fixes all the register spilling errors we
> hit in fp64 tests \o/
>
Cool, I'll send a largely cleaned up version of that series today
assuming nothing unexpected comes up.
> Sam
>
>>> The series does not introduce any regressions in piglit on ILK, SNB,
>>> HSW, BDW and SKL.
>>>
>>> A branch with this series is available for testing here:
>>>
>>> $ git clone -b i965-fp64-scalar-backend-part-1 https://github.com/Igalia/mesa.git
>>>
>>> You will have to enable the extension with:
>>>
>>> $ export MESA_EXTENSION_OVERRIDE=GL_ARB_gpu_shader_fp64
>>>
>>> The full scalar fp64 implementation, which also contains the fixes to
>>> copy propagation, the ubo/ssbo support, and our local fix for the SIMD
>>> lowering pass, is available here:
>>>
>>> $ git clone -b i965-fp64 https://github.com/Igalia/mesa.git
>>>
>>> And for the adventurous, there is also a work-in-progress branch that
>>> adds scalar support for HSW here:
>>>
>>> $ git clone -b i965-fp64-gen7 https://github.com/Igalia/mesa.git
>>>
>>> Thanks,
>>>
>>> Sam
>>>
>>>
>>> Connor Abbott (33):
>>> i965: use double lowering pass
>>> i965: use pack/unpackDouble lowering
>>> i965/disasm: fix disasm of 3-src doubles
>>> i965/eu: allow doubles in math instructions
>>> i965: add brw_imm_df
>>> i965: add support for getting/setting DF immediates
>>> i965: add support for disassembling DF immediates
>>> i965/eu: add support for DF immediates
>>> i965: fix brw_negate_immediate() for doubles
>>> i965: fix is_zero(), is_one() and is_negative_one() for doubles
>>> i965: fixup uniform setup for doubles
>>> i965/fs: print writemask_all when it's enabled
>>> i965/fs: use the NIR bit size when creating registers
>>> i965/fs: don't propagate 64-bit immediates
>>> i965/fs: add support for printing double immediates
>>> i965/fs: always pass the bitsize to brw_type_for_nir_type()
>>> i965/fs: add a stride helper
>>> i965/fs: add PACK opcode
>>> i965/fs: add a pass for lowering PACK opcodes
>>> i965/fs/nir: translate double pack/unpack
>>> i965/fs: fix type_size() for doubles
>>> i965/fs: handle uniforms in byte_offset()
>>> i965/fs: use byte_offset() in offset() for uniforms
>>> i965/fs: fix assign_constant_locations() for doubles
>>> i965/fs: generalize SIMD16 interference workaround
>>> i965/fs: extend exec_size halving in the generator
>>> i965/fs: fix compares for doubles
>>> i965/fs: fix regs_read() for uniforms
>>> i965/fs: fix is_copy_payload() for doubles
>>> i965/fs: fix regs_written in LOAD_PAYLOAD for doubles
>>> i965/fs: fix dst width calculation in CSE
>>> i965/fs: add a pass for legalizing d2f
>>> i965/fs: add support for f2d and d2f
>>>
>>> Iago Toral Quiroga (15):
>>> i965: fix brw_saturate_immediate() for doubles
>>> i965: fix brw_abs_immediate() for doubles
>>> i965: two-argument instructions can only use 32-bit immediates
>>> i965/fs: optimize pack double
>>> i965/fs: optimize unpack double
>>> i965/fs: handle fp64 opcodes in brw_do_channel_expressions
>>> i965/fs: We only support 32-bit integer ALU operations for now
>>> i965/fs: add null_reg_df
>>> i965/fs: implement fsign() for doubles
>>> i965/fs: implement d2b
>>> i965/fs: implement d2i and d2u
>>> i965/fs: implement i2d and u2d
>>> i965/fs: rename our lower_d2f pass to lower_d2x
>>> i965/fs/lower_simd_width: Fix registers written for split instructions
>>> i965/fs: recognize writes with a subreg_offset > 0 as partial
>>>
>>> Samuel Iglesias Gonsálvez (7):
>>> i965: enable lrp lowering for doubles
>>> vc4: lower lrp when operating with double operands
>>> freedreno/ir3: lower lrp when operating with double operands
>>> i965/fs: align access to double-based uniforms in push constant buffer
>>> i965/fs: demote_pull_constants() did not take into account double types
>>> i965/fs: take into account doubles when calculating read_size for MOV_INDIRECT
>>> i965/fs: fix MOV_INDIRECT exec_size for doubles
>>>
>>> Topi Pohjolainen (4):
>>> i965: Lower DFRACEXP/DLDEXP
>>> i965: Determine size of double precision float register
>>> i965: Tell backend register about double precision type
>>> i965/eu: Allow 3-src float ops with doubles
>>>
>>> src/gallium/drivers/freedreno/ir3/ir3_nir.c | 1 +
>>> src/gallium/drivers/vc4/vc4_program.c | 1 +
>>> src/mesa/drivers/dri/i965/Makefile.sources | 2 +
>>> src/mesa/drivers/dri/i965/brw_compiler.c | 2 +
>>> src/mesa/drivers/dri/i965/brw_compiler.h | 8 +
>>> src/mesa/drivers/dri/i965/brw_defines.h | 9 +
>>> src/mesa/drivers/dri/i965/brw_disasm.c | 3 +-
>>> src/mesa/drivers/dri/i965/brw_eu_emit.c | 60 +++--
>>> src/mesa/drivers/dri/i965/brw_fs.cpp | 106 ++++++--
>>> src/mesa/drivers/dri/i965/brw_fs.h | 6 +-
>>> src/mesa/drivers/dri/i965/brw_fs_builder.h | 15 +-
>>> .../dri/i965/brw_fs_channel_expressions.cpp | 23 +-
>>> .../drivers/dri/i965/brw_fs_copy_propagation.cpp | 3 +
>>> src/mesa/drivers/dri/i965/brw_fs_cse.cpp | 3 +-
>>> src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 16 +-
>>> src/mesa/drivers/dri/i965/brw_fs_lower_d2x.cpp | 75 ++++++
>>> src/mesa/drivers/dri/i965/brw_fs_lower_pack.cpp | 59 +++++
>>> src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 287 ++++++++++++++++++---
>>> src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp | 67 +++--
>>> src/mesa/drivers/dri/i965/brw_inst.h | 25 ++
>>> src/mesa/drivers/dri/i965/brw_ir_fs.h | 14 +-
>>> src/mesa/drivers/dri/i965/brw_link.cpp | 1 +
>>> src/mesa/drivers/dri/i965/brw_nir.c | 10 +
>>> src/mesa/drivers/dri/i965/brw_nir_uniforms.cpp | 7 +-
>>> src/mesa/drivers/dri/i965/brw_program.c | 1 +
>>> src/mesa/drivers/dri/i965/brw_reg.h | 10 +
>>> src/mesa/drivers/dri/i965/brw_shader.cpp | 73 ++++--
>>> src/mesa/drivers/dri/i965/brw_shader.h | 1 +
>>> src/mesa/drivers/dri/i965/brw_wm.c | 2 +
>>> src/mesa/drivers/dri/i965/gen6_constant_state.c | 12 +-
>>> 30 files changed, 773 insertions(+), 129 deletions(-)
>>> create mode 100644 src/mesa/drivers/dri/i965/brw_fs_lower_d2x.cpp
>>> create mode 100644 src/mesa/drivers/dri/i965/brw_fs_lower_pack.cpp
>>>
>>> --
>>> 2.5.0
>>>
>>> _______________________________________________
>>> mesa-dev mailing list
>>> mesa-dev at lists.freedesktop.org
>>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
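
For anyone trying the branches quoted above, here is a minimal sketch for checking that the MESA_EXTENSION_OVERRIDE setting took effect. It assumes glxinfo (shipped with mesa-utils or mesa-demos, depending on the distribution) is installed. has_fp64 is a hypothetical helper name chosen for illustration, not something provided by Mesa or by this series.

```shell
#!/bin/sh
# Hypothetical helper (not part of Mesa or the patch series).
# Reads an extension listing on stdin, splits it into one token per line,
# and succeeds only if GL_ARB_gpu_shader_fp64 appears as a whole token.
has_fp64() {
    tr -s ', \t' '\n\n\n' | grep -qx 'GL_ARB_gpu_shader_fp64'
}

# Intended usage, assuming glxinfo is available and the driver under test
# is the one built from the branch:
#
#   export MESA_EXTENSION_OVERRIDE=GL_ARB_gpu_shader_fp64
#   if glxinfo | has_fp64; then
#       echo "fp64 advertised"
#   else
#       echo "fp64 not advertised"
#   fi
```

Matching whole tokens avoids false positives from longer extension names that merely start with the same string. Depending on the driver, the extension may only show up in one of the context profiles that glxinfo reports.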