[Mesa-dev] [PATCH v3 00/48] nir, intel: Prerequisites for subgroups

Iago Toral itoral at igalia.com
Fri Oct 27 11:57:37 UTC 2017


I dropped a few more comments on patches 18-20, 23, 25, 29-31, 34 and
38 but nothing major in general.

I have doubts that patch 20 isn't working around some other bug and 38
scares me a little bit but I guess if Jenkins is happy I shouldn't
worry too much.

Otherwise, patches 16-38 are:

Reviewed-by: Iago Toral Quiroga <itoral at igalia.com>

(patch 15 already has your Rb)

Iago

On Thu, 2017-10-26 at 14:15 +0200, Iago Toral wrote:
> I left a few minor comments in patches 1, 2, 8 and 14. Otherwise
> patches 1-2, 4-5 and 7-14 (3 and 6 already have Rb) are:
> 
> Reviewed-by: Iago Toral Quiroga <itoral at igalia.com>
> 
> I feel like patches 10, 11 could maybe use another extra review if
> there is someone who wants to do it, since I am not very familiar
> with
> how all the indirect addressing stuff works and the restrictions of
> the
> hardware that affect this.
> 
> Patch 14 looks good, although the part where you locate the DO block
> for a matching WHILE could probably use the review of someone else
> more
> familiar with the CFG code than me.
> 
> Iago
> 
> On Wed, 2017-10-25 at 16:25 -0700, Jason Ekstrand wrote:
> > This series is a third respin of my subgroups prerequisites series
> > that
> > that I sent out a few weeks ago.  Not a whole lot has changed but
> > there are
> > some new patches.  Primarily,
> > 
> >  1) Some patches which were reviewed by Matt and Lionel were pushed
> > and are
> >     no longer in the series.  Thanks guys!
> > 
> >  2) I've applied R-B tags from various people for patches which are
> >     reviewed but depend on still unreviewed patches.
> > 
> >  3) A few patches to fix little-core.  In particular, the extra
> > little-core
> >     EU restrictions cause problems for BROADCAST, MOV_INDIRECT, and
> > integer
> >     MUL.
> > 
> > This series can be found on fd.o nere:
> > 
> > https://cgit.freedesktop.org/~jekstrand/mesa/log/?h=review/subgroup
> > -p
> > rereqs-v3
> > 
> > Happy reviewing!
> > 
> > 
> > Cc: Matt Turner <mattst88 at gmail.com>
> > Cc: Francisco Jerez <currojerez at riseup.net>
> > Cc: Connor Abbott <cwabbott0 at gmail.com>
> > 
> > Francisco Jerez (1):
> >   intel/fs: Restrict live intervals to the subset possibly
> > reachable
> >     from any definition.
> > 
> > Jason Ekstrand (47):
> >   intel/fs: Pass builders instead of blocks into emit_[un]zip
> >   intel/fs: Be more explicit about our placement of [un]zip
> >   intel/fs: Use ANY/ALL32 predicates in SIMD32
> >   intel/fs: Don't stomp f0.1 in SIMD16 ballot
> >   intel/fs: Use an explicit D type for vote any/all/eq intrinsics
> >   intel/fs: Use a pair of 1-wide MOVs instead of SEL for any/all
> >   intel/compiler: Add some restrictions to MOV_INDIRECT and
> > BROADCAST
> >   intel/eu: Just modify the offset in brw_broadcast
> >   intel/eu/reg: Add a subscript() helper
> >   intel/eu: Fix broadcast instruction for 64-bit values on little-
> > core
> >   intel/fs: Fix MOV_INDIRECT for 64-bit values on little-core
> >   intel/fs: Fix integer multiplication lowering for src/dst hazards
> >   intel/fs: Use the original destination region for int MUL
> > lowering
> >   i965/fs: Extend the live ranges of VGRFs which leave loops
> >   i965/fs/nir: Simplify 64-bit store_output
> >   i965/fs: Return a fs_reg from shuffle_64bit_data_for_32bit_write
> >   i965/fs/nir: Minor refactor of store_output
> >   i965/fs/nir: Don't stomp 64-bit values to D in get_nir_src
> >   intel/fs: Protect opt_algebraic from OOB BROADCAST indices
> >   intel/fs: Uniformize the index in readInvocation
> >   intel/fs: Retype dest to match value in read[First]Invocation
> >   intel/fs: Assign constant locations if they haven't been assigned
> >   intel/fs: Remove min_dispatch_width from fs_visitor
> >   intel/cs: Drop max_dispatch_width checks from compile_cs
> >   intel/cs: Stop setting dispatch_grf_start_reg
> >   intel/cs: Ignore runtime_check_aads_emit for CS
> >   intel/fs: Mark 64-bit values as being contiguous
> >   intel/cs: Rework the way thread local ID is handled
> >   intel/cs: Re-run final NIR optimizations for each SIMD size
> >   intel/cs: Re-run final NIR optimizations for each SIMD size
> >   intel/cs: Push subgroup ID instead of base thread ID
> >   intel/compiler/fs: Set up subgroup invocation as a system value
> >   intel/fs: Rework zero-length URB write handling
> >   intel/eu: Make automatic exec sizes a configurable option
> >   intel/eu: Explicitly set EXECUTE_1 where needed
> >   intel/fs: Explicitly set EXECUTE_1 where needed
> >   intel/fs: Don't use automatic exec size inference
> >   nir: Add a new subgroups lowering pass
> >   nir: Add a ssa_dest_init_for_type helper
> >   nir: Make ballot intrinsics variable-size
> >   nir/lower_system_values: Lower SUBGROUP_*_MASK based on type
> >   nir/lower_subgroups: Lower ballot intrinsics to the specified bit
> > size
> >   nir,intel/compiler: Use a fixed subgroup size
> >   spirv: Add a vtn_constant_value helper
> >   spirv: Rework barriers
> >   nir: Validate base types on array dereferences
> >   compiler/nir_types: Handle vectors in glsl_get_array_element
> > 
> >  src/compiler/Makefile.sources                      |   2 +-
> >  src/compiler/glsl/glsl_to_nir.cpp                  |   1 +
> >  src/compiler/nir/nir.h                             |  25 +-
> >  src/compiler/nir/nir_intrinsics.h                  |  13 +-
> >  .../nir/nir_lower_read_invocation_to_scalar.c      | 112 ---------
> >  src/compiler/nir/nir_lower_subgroups.c             | 257
> > ++++++++++++++++++++
> >  src/compiler/nir/nir_lower_system_values.c         |   4 +-
> >  src/compiler/nir/nir_opt_intrinsics.c              |  69 +-----
> >  src/compiler/nir/nir_validate.c                    |  18 +-
> >  src/compiler/nir_types.cpp                         |   2 +
> >  src/compiler/spirv/spirv_to_nir.c                  | 132 ++++++++-
> > -
> >  src/compiler/spirv/vtn_private.h                   |   6 +
> >  src/intel/compiler/brw_compiler.c                  |   4 -
> >  src/intel/compiler/brw_compiler.h                  |   3 +-
> >  src/intel/compiler/brw_eu.c                        |   1 +
> >  src/intel/compiler/brw_eu.h                        |  10 +
> >  src/intel/compiler/brw_eu_emit.c                   |  90 +++++--
> >  src/intel/compiler/brw_fs.cpp                      | 268
> > ++++++++++++---------
> >  src/intel/compiler/brw_fs.h                        |  15 +-
> >  src/intel/compiler/brw_fs_generator.cpp            |  90 ++++---
> >  src/intel/compiler/brw_fs_live_variables.cpp       |  89 ++++++-
> >  src/intel/compiler/brw_fs_live_variables.h         |  12 +
> >  src/intel/compiler/brw_fs_nir.cpp                  | 262
> > ++++++++++++--------
> >  src/intel/compiler/brw_fs_visitor.cpp              |  78 +++---
> >  src/intel/compiler/brw_nir.c                       |  11 +-
> >  src/intel/compiler/brw_nir.h                       |   2 +-
> >  src/intel/compiler/brw_nir_lower_cs_intrinsics.c   |  56 ++---
> >  src/intel/compiler/brw_reg.h                       |  16 ++
> >  src/intel/compiler/brw_shader.cpp                  |   2 +
> >  src/intel/vulkan/anv_cmd_buffer.c                  |   6 +-
> >  src/mesa/drivers/dri/i965/gen6_constant_state.c    |   6 +-
> >  31 files changed, 1076 insertions(+), 586 deletions(-)
> >  delete mode 100644
> > src/compiler/nir/nir_lower_read_invocation_to_scalar.c
> >  create mode 100644 src/compiler/nir/nir_lower_subgroups.c


More information about the mesa-dev mailing list