[Mesa-dev] [PATCH v3 00/48] nir, intel: Prerequisites for subgroups

Iago Toral itoral at igalia.com
Thu Oct 26 12:15:19 UTC 2017


I left a few minor comments in patches 1, 2, 8 and 14. Otherwise
patches 1-2, 4-5 and 7-14 (3 and 6 already have Rb) are:

Reviewed-by: Iago Toral Quiroga <itoral at igalia.com>

I feel like patches 10 and 11 could use an extra review if there is
someone who wants to do it, since I am not very familiar with how the
indirect addressing works or with the hardware restrictions that
affect it.

Patch 14 looks good, although the part where you locate the DO block
for a matching WHILE could probably use the review of someone else more
familiar with the CFG code than me.

Iago

On Wed, 2017-10-25 at 16:25 -0700, Jason Ekstrand wrote:
> This series is a third respin of my subgroups prerequisites series
> that I sent out a few weeks ago.  Not a whole lot has changed but
> there are some new patches.  Primarily,
> 
>  1) Some patches which were reviewed by Matt and Lionel were pushed
>     and are no longer in the series.  Thanks guys!
> 
>  2) I've applied R-B tags from various people for patches which are
>     reviewed but depend on still unreviewed patches.
> 
>  3) A few patches to fix little-core.  In particular, the extra
>     little-core EU restrictions cause problems for BROADCAST,
>     MOV_INDIRECT, and integer MUL.
> 
> This series can be found on fd.o here:
> 
> https://cgit.freedesktop.org/~jekstrand/mesa/log/?h=review/subgroup-prereqs-v3
> 
> Happy reviewing!
> 
> 
> Cc: Matt Turner <mattst88 at gmail.com>
> Cc: Francisco Jerez <currojerez at riseup.net>
> Cc: Connor Abbott <cwabbott0 at gmail.com>
> 
> Francisco Jerez (1):
>   intel/fs: Restrict live intervals to the subset possibly reachable
>     from any definition.
> 
> Jason Ekstrand (47):
>   intel/fs: Pass builders instead of blocks into emit_[un]zip
>   intel/fs: Be more explicit about our placement of [un]zip
>   intel/fs: Use ANY/ALL32 predicates in SIMD32
>   intel/fs: Don't stomp f0.1 in SIMD16 ballot
>   intel/fs: Use an explicit D type for vote any/all/eq intrinsics
>   intel/fs: Use a pair of 1-wide MOVs instead of SEL for any/all
>   intel/compiler: Add some restrictions to MOV_INDIRECT and BROADCAST
>   intel/eu: Just modify the offset in brw_broadcast
>   intel/eu/reg: Add a subscript() helper
>   intel/eu: Fix broadcast instruction for 64-bit values on little-core
>   intel/fs: Fix MOV_INDIRECT for 64-bit values on little-core
>   intel/fs: Fix integer multiplication lowering for src/dst hazards
>   intel/fs: Use the original destination region for int MUL lowering
>   i965/fs: Extend the live ranges of VGRFs which leave loops
>   i965/fs/nir: Simplify 64-bit store_output
>   i965/fs: Return a fs_reg from shuffle_64bit_data_for_32bit_write
>   i965/fs/nir: Minor refactor of store_output
>   i965/fs/nir: Don't stomp 64-bit values to D in get_nir_src
>   intel/fs: Protect opt_algebraic from OOB BROADCAST indices
>   intel/fs: Uniformize the index in readInvocation
>   intel/fs: Retype dest to match value in read[First]Invocation
>   intel/fs: Assign constant locations if they haven't been assigned
>   intel/fs: Remove min_dispatch_width from fs_visitor
>   intel/cs: Drop max_dispatch_width checks from compile_cs
>   intel/cs: Stop setting dispatch_grf_start_reg
>   intel/cs: Ignore runtime_check_aads_emit for CS
>   intel/fs: Mark 64-bit values as being contiguous
>   intel/cs: Rework the way thread local ID is handled
>   intel/cs: Re-run final NIR optimizations for each SIMD size
>   intel/cs: Re-run final NIR optimizations for each SIMD size
>   intel/cs: Push subgroup ID instead of base thread ID
>   intel/compiler/fs: Set up subgroup invocation as a system value
>   intel/fs: Rework zero-length URB write handling
>   intel/eu: Make automatic exec sizes a configurable option
>   intel/eu: Explicitly set EXECUTE_1 where needed
>   intel/fs: Explicitly set EXECUTE_1 where needed
>   intel/fs: Don't use automatic exec size inference
>   nir: Add a new subgroups lowering pass
>   nir: Add a ssa_dest_init_for_type helper
>   nir: Make ballot intrinsics variable-size
>   nir/lower_system_values: Lower SUBGROUP_*_MASK based on type
>   nir/lower_subgroups: Lower ballot intrinsics to the specified bit size
>   nir,intel/compiler: Use a fixed subgroup size
>   spirv: Add a vtn_constant_value helper
>   spirv: Rework barriers
>   nir: Validate base types on array dereferences
>   compiler/nir_types: Handle vectors in glsl_get_array_element
> 
>  src/compiler/Makefile.sources                      |   2 +-
>  src/compiler/glsl/glsl_to_nir.cpp                  |   1 +
>  src/compiler/nir/nir.h                             |  25 +-
>  src/compiler/nir/nir_intrinsics.h                  |  13 +-
>  .../nir/nir_lower_read_invocation_to_scalar.c      | 112 ---------
>  src/compiler/nir/nir_lower_subgroups.c             | 257 ++++++++++++++++++++
>  src/compiler/nir/nir_lower_system_values.c         |   4 +-
>  src/compiler/nir/nir_opt_intrinsics.c              |  69 +-----
>  src/compiler/nir/nir_validate.c                    |  18 +-
>  src/compiler/nir_types.cpp                         |   2 +
>  src/compiler/spirv/spirv_to_nir.c                  | 132 ++++++++--
>  src/compiler/spirv/vtn_private.h                   |   6 +
>  src/intel/compiler/brw_compiler.c                  |   4 -
>  src/intel/compiler/brw_compiler.h                  |   3 +-
>  src/intel/compiler/brw_eu.c                        |   1 +
>  src/intel/compiler/brw_eu.h                        |  10 +
>  src/intel/compiler/brw_eu_emit.c                   |  90 +++++--
>  src/intel/compiler/brw_fs.cpp                      | 268 ++++++++++++---------
>  src/intel/compiler/brw_fs.h                        |  15 +-
>  src/intel/compiler/brw_fs_generator.cpp            |  90 ++++---
>  src/intel/compiler/brw_fs_live_variables.cpp       |  89 ++++++-
>  src/intel/compiler/brw_fs_live_variables.h         |  12 +
>  src/intel/compiler/brw_fs_nir.cpp                  | 262 ++++++++++++--------
>  src/intel/compiler/brw_fs_visitor.cpp              |  78 +++---
>  src/intel/compiler/brw_nir.c                       |  11 +-
>  src/intel/compiler/brw_nir.h                       |   2 +-
>  src/intel/compiler/brw_nir_lower_cs_intrinsics.c   |  56 ++---
>  src/intel/compiler/brw_reg.h                       |  16 ++
>  src/intel/compiler/brw_shader.cpp                  |   2 +
>  src/intel/vulkan/anv_cmd_buffer.c                  |   6 +-
>  src/mesa/drivers/dri/i965/gen6_constant_state.c    |   6 +-
>  31 files changed, 1076 insertions(+), 586 deletions(-)
>  delete mode 100644 src/compiler/nir/nir_lower_read_invocation_to_scalar.c
>  create mode 100644 src/compiler/nir/nir_lower_subgroups.c
> 

