[Mesa-dev] [PATCH 00/59] Initial arb_gpu_shader_fp64 support to the i965 scalar backend

Tue May 3 10:09:33 UTC 2016

On 02/05/16 23:50, Mark Janes wrote:
> Samuel Iglesias Gonsálvez <siglesias at igalia.com> writes:
> 
>> Hello,
>> 
>> This patch series continues adding arb_gpu_shader_fp64 support to
>> the Intel driver. Specifically, this targets the i965 scalar
>> backend for BDW+ hardware (vec4 is still under research and gen7
>> has its own issues, which we intend to tackle after gen8).
>> 
>> This adds most of the fp64 scalar implementation. It starts by
>> enabling the various lowering passes in NIR for doubles and then
>> adds all the infrastructure required in the backend to operate
>> with 64-bit floating point data.
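
As a rough illustration of what operating on 64-bit data through 32-bit pieces means, here is a minimal CPU-side sketch of splitting a double into two 32-bit words and joining them again. It follows the GLSL packDouble2x32/unpackDouble2x32 convention (first word holds the 32 least significant bits). It is not code from the series: the type and function names are hypothetical, and it assumes IEEE 754 binary64 doubles.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Two 32-bit halves of one 64-bit double (hypothetical helper type).
struct Halves {
    uint32_t lo;  // 32 least significant bits
    uint32_t hi;  // 32 most significant bits
};

// Split a double into its low and high 32-bit words.
inline Halves unpack_double_2x32(double d) {
    uint64_t bits;
    std::memcpy(&bits, &d, sizeof bits);  // well-defined way to read the bits
    return Halves{static_cast<uint32_t>(bits & 0xffffffffu),
                  static_cast<uint32_t>(bits >> 32)};
}

// Rebuild the double from the two 32-bit words.
inline double pack_double_2x32(Halves h) {
    uint64_t bits = (static_cast<uint64_t>(h.hi) << 32) | h.lo;
    double d;
    std::memcpy(&d, &bits, sizeof d);
    return d;
}
```

Packing the result of an unpack gives back the original value bit for bit, which is the property a pack/unpack lowering has to preserve.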
>> 
>> For reference, this series fixes 1009 fp64 piglit tests in BDW.
>> Fp64 totals look like this:
>> 
>> pass:   2523
>> fail:     46
>> crash:   447
>> skip:     16
>> total:  3032
>> 
>> There are a few missing things in this series to achieve a
>> perfect fp64 pass rate:
>> 
>> 1. Fixes to copy propagation. The fp64 code creates new code
>> patterns that copy-propagation isn't really ready to handle yet,
>> leading to incorrect results in some cases. We have 9 patches to
>> fix copy propagation for fp64 that we intend to send separately
>> after the main fp64 infrastructure has been reviewed.
>> 
>> 2. ubo/ssbo/shared-variables. We will also send the patches for
>> this in a separate series after this one.
>> 
>> 3. A fix for the SIMD lowering pass to properly handle
>> execmasking when transposing the results of split instructions
>> back together. We have a local fix for this, but Curro hit the
>> same problem while working on SIMD32 and has a better solution
>> for it, so we intend to use his solution when it is ready.
>> 
>> 4. Spilling. We don't support spilling of DF registers yet and
>> some piglit tests need this to compile. Jason had plans to work
>> on the spilling code and address the needs of fp64 along the
>> way.
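
To give a feel for why register pressure (and hence spilling) matters more for doubles, here is a small sketch that counts how many registers a value needs at a given SIMD width. It assumes a 32-byte general register, which is not stated in this mail, and the helper name is hypothetical. It is not the driver's type_size() code.

```cpp
#include <cassert>
#include <cstddef>

// Assumption: one general register holds 32 bytes (256 bits).
constexpr std::size_t kRegBytes = 32;

// Registers needed to hold `channels` values of `type_bytes` bytes each,
// rounded up to whole registers.
constexpr std::size_t regs_for(std::size_t channels, std::size_t type_bytes) {
    return (channels * type_bytes + kRegBytes - 1) / kRegBytes;
}
```

Under that assumption, regs_for(8, 4) is 1 while regs_for(8, 8) is 2, so a double takes twice the register space of a 32-bit float at the same width.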
>> 
>> The series does not introduce any regressions in piglit on ILK,
>> SNB, HSW, BDW and SKL.
> 
> In addition to the fp64 failures and assertions described above, I
> see the following regressions when I run piglit:
> 
> piglit.spec.arb_tessellation_shader.execution (5 tests, SKL, IVB,
> HSW, BYT, BSW, BDW)
> 

This is weird for gen < 8... Were you running them with
MESA_EXTENSION_OVERRIDE set? I ask because this is not supported on
gen < 8.

About gen8+, see below.

> These tests give the same assertion as most of the fp64 tests: 
> src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp:626: int 
> brw::type_size_vec4(const glsl_type*): Assertion `!"not reached"' 
> failed.
> 

This is expected because there is no fp64 support for vec4 backend in
this patch series.
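
For readers unfamiliar with the `!"not reached"' form in that message, here is a minimal sketch of the idiom. The enum and function are hypothetical and only stand in for a type-size helper that has no case for doubles. They are not Mesa's glsl_type or type_size_vec4().

```cpp
#include <cassert>

// Hypothetical base types; not Mesa's glsl_type.
enum class BaseType { Float, Int, Double };

// Size in vec4 slots for the types this sketch supports.
inline int type_size_sketch(BaseType t) {
    switch (t) {
    case BaseType::Float:
    case BaseType::Int:
        return 1;
    default:
        // A string literal is a non-null pointer, so !"..." is always
        // false: the assert fires and its message shows the text.
        assert(!"not reached");
        return 0;
    }
}
```

Calling type_size_sketch(BaseType::Double) in a build with assertions enabled aborts with a message of the same shape as the one quoted above. With NDEBUG defined the assert is compiled out and the function returns 0.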

> piglit.shaders.shadersource-no-compile (all platforms)
> 
> Fails with "Failed to link: error: linking with uncompiled shader"
> 

Compared with the master HEAD used for part 1 (c750029), this is not a
regression because it fails there too. It seems this was fixed
recently.

> Sam, are you able to reproduce my results?
> 

About this:

piglit.spec.glsl-1_10.compiler.vector-dereference-in-dereference.frag

  asserts with "glslparsertest:
  src/mesa/drivers/dri/i965/brw_fs_channel_expressions.cpp:422: virtual
  ir_visitor_status
  ir_channel_expressions_visitor::visit_leave(ir_assignment*): Assertion
  `!"should have been lowered"' failed."

This is not a regression relative to the reference we used (c750029),
because it fails there too.

Can you compare piglit results with c750029 as reference?

Thanks a lot!

Sam

>> A branch with this series is available for testing here:
>> 
>> $ git clone -b i965-fp64-scalar-backend-part-1 https://github.com/Igalia/mesa.git
>> 
>> You will have to enable the extension with:
>> 
>> $ export MESA_EXTENSION_OVERRIDE=GL_ARB_gpu_shader_fp64
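
When comparing results between machines, it can help to confirm that the override was actually set in the environment the tests ran in. Below is a minimal sketch of such a check. The function is hypothetical and is not Mesa's parser. It assumes the variable holds whitespace-separated extension names, where a leading '+' (or no prefix) enables and a leading '-' disables, and it assumes that the last mention of a name wins.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Returns true if `override_value` (the text of MESA_EXTENSION_OVERRIDE)
// lists `ext` as enabled. Assumed format: whitespace-separated names, an
// optional leading '+' enables and a leading '-' disables.
inline bool override_enables(const std::string& override_value,
                             const std::string& ext) {
    std::istringstream tokens(override_value);
    std::string tok;
    bool enabled = false;
    while (tokens >> tok) {  // operator>> never yields an empty token
        bool enable = true;
        if (tok[0] == '+' || tok[0] == '-') {
            enable = (tok[0] == '+');
            tok.erase(0, 1);  // drop the prefix before comparing names
        }
        if (tok == ext) enabled = enable;  // assumption: last mention wins
    }
    return enabled;
}
```

A caller would pass the result of std::getenv("MESA_EXTENSION_OVERRIDE") after checking it for a null pointer, together with "GL_ARB_gpu_shader_fp64".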
>> 
>> The full scalar fp64 implementation, containing also the fixes
>> to copy-propagation as well as ubo/ssbo and our local fix for the
>> SIMD lowering pass is available here:
>> 
>> git clone -b i965-fp64 https://github.com/Igalia/mesa.git
>> 
>> And for the adventurous, there is also a work-in-progress branch
>> that adds scalar support for HSW here:
>> 
>> git clone -b i965-fp64-gen7 https://github.com/Igalia/mesa.git
>> 
>> Thanks,
>> 
>> Sam
>> 
>> 
>> Connor Abbott (33):
>>   i965: use double lowering pass
>>   i965: use pack/unpackDouble lowering
>>   i965/disasm: fix disasm of 3-src doubles
>>   i965/eu: allow doubles in math instructions
>>   i965: add brw_imm_df
>>   i965: add support for getting/setting DF immediates
>>   i965: add support for disassembling DF immediates
>>   i965/eu: add support for DF immediates
>>   i965: fix brw_negate_immediate() for doubles
>>   i965: fix is_zero(), is_one() and is_negative_one() for doubles
>>   i965: fixup uniform setup for doubles
>>   i965/fs: print writemask_all when it's enabled
>>   i965/fs: use the NIR bit size when creating registers
>>   i965/fs: don't propagate 64-bit immediates
>>   i965/fs: add support for printing double immediates
>>   i965/fs: always pass the bitsize to brw_type_for_nir_type()
>>   i965/fs: add a stride helper
>>   i965/fs: add PACK opcode
>>   i965/fs: add a pass for lowering PACK opcodes
>>   i965/fs/nir: translate double pack/unpack
>>   i965/fs: fix type_size() for doubles
>>   i965/fs: handle uniforms in byte_offset()
>>   i965/fs: use byte_offset() in offset() for uniforms
>>   i965/fs: fix assign_constant_locations() for doubles
>>   i965/fs: generalize SIMD16 interference workaround
>>   i965/fs: extend exec_size halving in the generator
>>   i965/fs: fix compares for doubles
>>   i965/fs: fix regs_read() for uniforms
>>   i965/fs: fix is_copy_payload() for doubles
>>   i965/fs: fix regs_written in LOAD_PAYLOAD for doubles
>>   i965/fs: fix dst width calculation in CSE
>>   i965/fs: add a pass for legalizing d2f
>>   i965/fs: add support for f2d and d2f
>> 
>> Iago Toral Quiroga (15):
>>   i965: fix brw_saturate_immediate() for doubles
>>   i965: fix brw_abs_immediate() for doubles
>>   i965: two-argument instructions can only use 32-bit immediates
>>   i965/fs: optimize pack double
>>   i965/fs: optimize unpack double
>>   i965/fs: handle fp64 opcodes in brw_do_channel_expressions
>>   i965/fs: We only support 32-bit integer ALU operations for now
>>   i965/fs: add null_reg_df
>>   i965/fs: implement fsign() for doubles
>>   i965/fs: implement d2b
>>   i965/fs: implement d2i and d2u
>>   i965/fs: implement i2d and u2d
>>   i965/fs: rename our lower_d2f pass to lower_d2x
>>   i965/fs/lower_simd_width: Fix registers written for split instructions
>>   i965/fs: recognize writes with a subreg_offset > 0 as partial
>> 
>> Samuel Iglesias Gonsálvez (7):
>>   i965: enable lrp lowering for doubles
>>   vc4: lower lrp when operating with double operands
>>   freedreno/ir3: lower lrp when operating with double operands
>>   i965/fs: align access to double-based uniforms in push constant buffer
>>   i965/fs: demote_pull_constants() did not take into account double types
>>   i965/fs: take into account doubles when calculating read_size for MOV_INDIRECT
>>   i965/fs: fix MOV_INDIRECT exec_size for doubles
>> 
>> Topi Pohjolainen (4):
>>   i965: Lower DFRACEXP/DLDEXP
>>   i965: Determine size of double precision float register
>>   i965: Tell backend register about double precision type
>>   i965/eu: Allow 3-src float ops with doubles
>> 
>> src/gallium/drivers/freedreno/ir3/ir3_nir.c        |   1 + 
>> src/gallium/drivers/vc4/vc4_program.c              |   1 + 
>> src/mesa/drivers/dri/i965/Makefile.sources         |   2 + 
>> src/mesa/drivers/dri/i965/brw_compiler.c           |   2 + 
>> src/mesa/drivers/dri/i965/brw_compiler.h           |   8 + 
>> src/mesa/drivers/dri/i965/brw_defines.h            |   9 + 
>> src/mesa/drivers/dri/i965/brw_disasm.c             |   3 +- 
>> src/mesa/drivers/dri/i965/brw_eu_emit.c            |  60 +++-- 
>> src/mesa/drivers/dri/i965/brw_fs.cpp               | 106 ++++++--
>> src/mesa/drivers/dri/i965/brw_fs.h                 |   6 +-
>> src/mesa/drivers/dri/i965/brw_fs_builder.h         |  15 +-
>> .../dri/i965/brw_fs_channel_expressions.cpp        |  23 +- 
>> .../drivers/dri/i965/brw_fs_copy_propagation.cpp   |   3 + 
>> src/mesa/drivers/dri/i965/brw_fs_cse.cpp           |   3 +- 
>> src/mesa/drivers/dri/i965/brw_fs_generator.cpp     |  16 +- 
>> src/mesa/drivers/dri/i965/brw_fs_lower_d2x.cpp     |  75 ++++++ 
>> src/mesa/drivers/dri/i965/brw_fs_lower_pack.cpp    |  59 +++++ 
>> src/mesa/drivers/dri/i965/brw_fs_nir.cpp           | 287 ++++++++++++++++++---
>> src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp  |  67 +++-- 
>> src/mesa/drivers/dri/i965/brw_inst.h               |  25 ++ 
>> src/mesa/drivers/dri/i965/brw_ir_fs.h              |  14 +- 
>> src/mesa/drivers/dri/i965/brw_link.cpp             |   1 + 
>> src/mesa/drivers/dri/i965/brw_nir.c                |  10 + 
>> src/mesa/drivers/dri/i965/brw_nir_uniforms.cpp     |   7 +- 
>> src/mesa/drivers/dri/i965/brw_program.c            |   1 + 
>> src/mesa/drivers/dri/i965/brw_reg.h                |  10 + 
>> src/mesa/drivers/dri/i965/brw_shader.cpp           |  73 ++++-- 
>> src/mesa/drivers/dri/i965/brw_shader.h             |   1 + 
>> src/mesa/drivers/dri/i965/brw_wm.c                 |   2 + 
>> src/mesa/drivers/dri/i965/gen6_constant_state.c    |  12 +-
>> 30 files changed, 773 insertions(+), 129 deletions(-)
>> create mode 100644 src/mesa/drivers/dri/i965/brw_fs_lower_d2x.cpp
>> create mode 100644 src/mesa/drivers/dri/i965/brw_fs_lower_pack.cpp
>> 
>> -- 2.5.0
>> 
> 