[Mesa-dev] [PATCH v2 00/52] nir, intel: Prerequisites for subgroups

Jason Ekstrand jason at jlekstrand.net
Fri Oct 13 05:47:26 UTC 2017

A little over a month ago, I sent a 44-patch series with a bunch of the
prerequisite patches for implementing SPIR-V subgroup support.  This is a
re-spin of that series with a few more patches.  Most of the new patches
are fixes needed either because of rebasing on top of my uniform reworks or
for SIMD32.  As of now, I have all but 8 of the subgroups tests passing
with SIMD32.  Those 8 appear to be issues with spilling, but I'm not 100%
sure.

Some of the patches in here overlap a bit with stuff that Connor did in his
series for radv.  In particular, I've taken a different approach, which I
like better, to sorting out uint64_t vs. uvec4 for ballot intrinsics.
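
For readers unfamiliar with the uint64_t vs. uvec4 question: GLSL's
ARB_shader_ballot returns a ballot as a single 64-bit mask, while SPIR-V
subgroup ballots are four 32-bit components.  The sketch below only
illustrates how the two layouts relate for subgroups of up to 64
invocations.  It is not code from this series, and the helper names
ballot64_to_uvec4 and uvec4_to_ballot64 are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not a Mesa function): repack a 64-bit ballot mask
 * into a uvec4-style array.  Bit N of the mask is invocation N, so the low
 * 32 invocations land in component 0 and the next 32 in component 1.
 * Components 2 and 3 stay zero for subgroups of at most 64 invocations.
 */
static void
ballot64_to_uvec4(uint64_t ballot, uint32_t out[4])
{
   assert(out != NULL);
   out[0] = (uint32_t)(ballot & 0xffffffffu);
   out[1] = (uint32_t)(ballot >> 32);
   out[2] = 0;
   out[3] = 0;
}

/* Hypothetical inverse: fold the first two components back into 64 bits. */
static uint64_t
uvec4_to_ballot64(const uint32_t in[4])
{
   assert(in != NULL);
   return (uint64_t)in[0] | ((uint64_t)in[1] << 32);
}
```

For example, a ballot in which invocations 0, 1 and 32 are set converts to
the components (3, 1, 0, 0) and back to the same 64-bit mask.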

Cc: Matt Turner <mattst88 at gmail.com>
Cc: Francisco Jerez <currojerez at riseup.net>
Cc: Connor Abbott <cwabbott0 at gmail.com>

Alejandro Piñeiro (1):
  i965/fs: Add brw_reg_type_from_bit_size utility method

Francisco Jerez (1):
  intel/fs: Restrict live intervals to the subset possibly reachable
    from any definition.

Jason Ekstrand (50):
  intel/fs: Pass builders instead of blocks into emit_[un]zip
  intel/fs: Be more explicit about our placement of [un]zip
  intel/fs: Handle flag read/write aliasing in needs_src_copy
  intel/fs: Use ANY/ALL32 predicates in SIMD32
  intel/fs: Don't stomp f0.1 in SIMD16 ballot
  intel/fs: Use an explicit D type for vote any/all/eq intrinsics
  intel/fs: Use a pair of 1-wide MOVs instead of SEL for any/all
  i965/fs: Extend the live ranges of VGRFs which leave loops
  i965/fs/nir: Use the nir_src_bit_size helper
  i965/fs/nir: Simplify 64-bit store_output
  i965/fs: Return a fs_reg from shuffle_64bit_data_for_32bit_write
  i965/fs/nir: Minor refactor of store_output
  i965/fs/nir: Don't stomp 64-bit values to D in get_nir_src
  intel/fs: Protect opt_algebraic from OOB BROADCAST indices
  intel/fs: Uniformize the index in readInvocation
  intel/fs: Retype dest to match value in read[First]Invocation
  intel/fs: Assign constant locations if they haven't been assigned
  intel/fs: Remove min_dispatch_width from fs_visitor
  intel/cs: Drop min_dispatch_width checks from compile_cs
  intel/cs: Stop setting dispatch_grf_start_reg
  intel/cs: Ignore runtime_check_aads_emit for CS
  intel/fs: Mark 64-bit values as being contiguous
  intel/cs: Rework the way thread local ID is handled
  intel/cs: Re-run final NIR optimizations for each SIMD size
  intel/cs: Re-run final NIR optimizations for each SIMD size
  intel/cs: Push subgroup ID instead of base thread ID
  intel/compiler/fs: Set up subgroup invocation as a system value
  intel/fs: Rework zero-length URB write handling
  intel/eu: Use EXECUTE_1 for JMPI
  intel/eu: Make automatic exec sizes a configurable option
  intel/eu: Explicitly set EXECUTE_1 where needed
  intel/fs: Explicitly set EXECUTE_1 where needed
  intel/fs: Don't use automatic exec size inference
  anv/pipeline: Dump shader immediately after spirv_to_nir
  anv/pipeline: Drop nir_lower_clip_cull_distance_arrays
  anv/pipeline: Call nir_lower_system_values after brw_preprocess_nir
  nir/lower_wpos_ytransform: Support system value intrinsics
  i965/program: Move nir_lower_system_values higher up
  intel/compiler: Call nir_lower_system_values in brw_preprocess_nir
  nir/opt_intrinsics: Rework progress
  nir: Add a new subgroups lowering pass
  nir: Add a ssa_dest_init_for_type helper
  nir: Make ballot intrinsics variable-size
  nir/lower_system_values: Lower SUBGROUP_*_MASK based on type
  nir/lower_subgroups: Lower ballot intrinsics to the specified bit size
  nir,intel/compiler: Use a fixed subgroup size
  spirv: Add a vtn_constant_value helper
  spirv: Rework barriers
  nir: Validate base types on array dereferences
  compiler/nir_types: Handle vectors in glsl_get_array_element

 src/compiler/Makefile.sources                      |   2 +-
 src/compiler/glsl/glsl_to_nir.cpp                  |   1 +
 src/compiler/nir/nir.h                             |  25 +-
 src/compiler/nir/nir_intrinsics.h                  |  13 +-
 .../nir/nir_lower_read_invocation_to_scalar.c      | 112 -------
 src/compiler/nir/nir_lower_subgroups.c             | 257 ++++++++++++++++
 src/compiler/nir/nir_lower_system_values.c         |   4 +-
 src/compiler/nir/nir_lower_wpos_ytransform.c       |   4 +
 src/compiler/nir/nir_opt_intrinsics.c              |  83 +----
 src/compiler/nir/nir_validate.c                    |  18 +-
 src/compiler/nir_types.cpp                         |   2 +
 src/compiler/spirv/spirv_to_nir.c                  | 132 ++++++--
 src/compiler/spirv/vtn_private.h                   |   6 +
 src/intel/compiler/brw_compiler.c                  |   4 -
 src/intel/compiler/brw_compiler.h                  |   3 +-
 src/intel/compiler/brw_eu.c                        |   1 +
 src/intel/compiler/brw_eu.h                        |  10 +
 src/intel/compiler/brw_eu_emit.c                   |  43 ++-
 src/intel/compiler/brw_fs.cpp                      | 246 +++++++++------
 src/intel/compiler/brw_fs.h                        |  15 +-
 src/intel/compiler/brw_fs_generator.cpp            |  14 +-
 src/intel/compiler/brw_fs_live_variables.cpp       |  89 +++++-
 src/intel/compiler/brw_fs_live_variables.h         |  12 +
 src/intel/compiler/brw_fs_nir.cpp                  | 337 +++++++++++++--------
 src/intel/compiler/brw_fs_visitor.cpp              |  78 +++--
 src/intel/compiler/brw_nir.c                       |  13 +-
 src/intel/compiler/brw_nir.h                       |   2 +-
 src/intel/compiler/brw_nir_lower_cs_intrinsics.c   |  56 +---
 src/intel/vulkan/anv_cmd_buffer.c                  |   6 +-
 src/intel/vulkan/anv_pipeline.c                    |  18 +-
 src/mesa/drivers/dri/i965/brw_program.c            |   1 -
 src/mesa/drivers/dri/i965/gen6_constant_state.c    |   6 +-
 32 files changed, 1051 insertions(+), 562 deletions(-)
 delete mode 100644 src/compiler/nir/nir_lower_read_invocation_to_scalar.c
 create mode 100644 src/compiler/nir/nir_lower_subgroups.c

