[Mesa-dev] [PATCH 00/31] i965: Scalar back-end support for SIMD32, part 1.

Francisco Jerez currojerez at riseup.net
Sat May 21 05:47:35 UTC 2016

The purpose of this series is to improve the back-end infrastructure
so that lowering of most IR instructions that are too wide to execute
natively (which is far more common than usual in SIMD32 dispatch mode)
happens semi-automatically at the IR level.

Patches 1-6 address some issues in a few optimization and lowering
passes that would otherwise lead to regressions in the following
changes of the series.  Patches 7-12 move the construction of several
messages into lower_logical_sends() so the SIMD lowering pass can deal
with them.  Patches 13-22 teach the SIMD lowering pass about a number
of additional ISA restrictions that can be enforced easily by
splitting SIMD instructions into smaller chunks.  Patches 23-31 are
mainly about removing generator code that wouldn't have worked on
SIMD32 but is no longer necessary given the infrastructure introduced
in the first part of the series.
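
To make the "splitting SIMD instructions into smaller chunks" idea
concrete, here is a minimal sketch in Python.  It is not the actual
i965 pass: the Inst record, its fields and lower_simd_width() are
hypothetical names chosen for illustration, and the sketch ignores
the copying of sources and destinations that a real lowering pass
has to take care of.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class Inst:
    """Hypothetical stand-in for an IR instruction (illustration only)."""
    opcode: str
    exec_size: int  # number of SIMD channels the instruction covers
    group: int = 0  # index of the first channel it applies to


def lower_simd_width(inst: Inst, max_width: int) -> List[Inst]:
    """Split inst into copies that are at most max_width channels wide."""
    if inst.exec_size <= max_width:
        # Already narrow enough to execute natively, leave it alone.
        return [inst]
    # Assumption: widths are powers of two, so the split is always even.
    assert inst.exec_size % max_width == 0
    return [
        replace(inst, exec_size=max_width, group=inst.group + first)
        for first in range(0, inst.exec_size, max_width)
    ]
```

Under these assumptions a 32-wide instruction limited to 8 channels
becomes four copies covering channel groups 0, 8, 16 and 24, which is
roughly what one would expect for a SIMD32 DDY lowered to SIMD8.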

Some of the changes from this series that remove SIMD workarounds
currently implemented in the generator could potentially be left out,
at least in the initial merge, at the cost of losing ARB_compute_shader
support on VLV and low-end IVB, which, like Gen8+, don't have enough
threads per subslice to reach the workgroup size requirement specified
by the extension in SIMD16 mode.  Some other changes, like the removal
of DDY unrolling from the generator, are completely optional right now,
although they will eventually be required for SIMD32 fragment shader
support and they seemed like a nice clean-up.
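
The arithmetic behind the workgroup size requirement can be sketched
as follows.  This assumes the 1024-invocation minimum that
ARB_compute_shader sets for MAX_COMPUTE_WORK_GROUP_INVOCATIONS and
that a whole workgroup has to be resident on a single subslice;
threads_needed() is a hypothetical helper, not something from the
driver.

```python
def threads_needed(workgroup_invocations: int, simd_width: int) -> int:
    """Hardware threads required to run one workgroup at a given width."""
    # Each thread handles simd_width invocations, so round up.
    return -(-workgroup_invocations // simd_width)


# Assumed minimum workgroup size from ARB_compute_shader.
MIN_WORKGROUP_INVOCATIONS = 1024

for width in (8, 16, 32):
    print(f"SIMD{width}: {threads_needed(MIN_WORKGROUP_INVOCATIONS, width)} threads")
```

So a part with fewer than 64 threads per subslice cannot fit such a
workgroup in SIMD16 mode, while SIMD32 halves the requirement to 32.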

Expect two more series of roughly the same size coming up soon-ish.
The second one will get the generator code in good shape for SIMD32,
and the third one will address some of the remaining issues of the
compiler back-end so we can start plumbing 32-wide compute shaders
through it and flip the GL 4.3 switch.

[PATCH 01/31] i965/fs: Fix byte_offset() for MRF/ARF/FIXED_GRF regs.
[PATCH 02/31] i965/fs: Generalize is_uniform() to is_periodic().
[PATCH 03/31] i965/fs: No need to unzip SIMD-periodic sources during SIMD lowering.
[PATCH 04/31] i965/fs: Handle instruction predication in SIMD lowering pass.
[PATCH 05/31] i965/fs: Fix CSE temporary copy for some LOAD_PAYLOAD corner cases.
[PATCH 06/31] i965/fs: Avoid constant propagation when the type sizes don't match.
[PATCH 07/31] i965/fs: Hide varying pull constant load message setup behind logical opcode.
[PATCH 08/31] i965/fs: Implement promotion of varying pull loads on Gen4 during SIMD lowering.
[PATCH 09/31] i965/fs: Rename Gen4 physical varying pull constant load opcode.
[PATCH 10/31] i965/fs: Add missing get_latency_gen7() cases for the Gen7 pull constant opcodes.
[PATCH 11/31] i965/fs: Lower math into Gen4-5 send-like instructions in lower_logical_sends.
[PATCH 12/31] i965/fs: Handle SAMPLEINFO consistently like other texturing instructions.
[PATCH 13/31] i965/fs: Enforce extended math exec size limits during SIMD lowering.
[PATCH 14/31] i965/fs: Enforce common regioning restrictions by SIMD splitting.
[PATCH 15/31] i965/fs: Implement workaround for IVB CMP dependency race in the SIMD lowering pass.
[PATCH 16/31] i965/fs: Implement HSW BFI exec size workarounds in the SIMD lowering pass.
[PATCH 17/31] i965/fs: Assert that IF instruction with embedded compare has legal exec_size.
[PATCH 18/31] i965/fs: Calculate maximum execution size of MOV_INDIRECT correctly.
[PATCH 19/31] i965/fs: Apply usual FPU-like execution size restrictions to MULH.
[PATCH 20/31] i965/fs: Lower DDY instructions to SIMD8 during SIMD lowering time
[PATCH 21/31] i965/fs: Lower LOAD_PAYLOAD instructions of unsupported width.
[PATCH 22/31] i965/fs: Limit SIMD width of various virtual opcodes to the maximum supported value.
[PATCH 23/31] i965/fs: Remove handcrafted math SIMD lowering from the generator.
[PATCH 24/31] i965/fs: Set default access mode to Align1 for all instructions in the generator.
[PATCH 25/31] i965/fs: Drop lowering code for a few three-source instructions from the generator.
[PATCH 26/31] i965/fs: Drop Gen7 CMP SIMD unrolling workaround from the generator.
[PATCH 27/31] i965/fs: Remove manual unrolling of BFI instructions from the generator.
[PATCH 28/31] i965/fs: Remove manual splitting of DDY ops in the generator.
[PATCH 29/31] i965: Define brw_int_type() helper.
[PATCH 30/31] i965/fs: Remove extract virtual opcodes.
[PATCH 31/31] i965/fs: Remove FS_OPCODE_PACK_STENCIL_REF virtual instruction.
