[Mesa-dev] [PATCH v3 00/48] nir, intel: Prerequisites for subgroups
Iago Toral
itoral at igalia.com
Mon Oct 30 12:57:21 UTC 2017
I posted comments in patches 39 and 43, but otherwise patches 39-48
are:
Reviewed-by: Iago Toral Quiroga <itoral at igalia.com>
Iago
On Fri, 2017-10-27 at 13:57 +0200, Iago Toral wrote:
> I dropped a few more comments on patches 18-20, 23, 25, 29-31, 34 and
> 38 but nothing major in general.
>
> I suspect patch 20 is working around some other bug, and 38 scares me
> a little bit, but I guess if Jenkins is happy I shouldn't worry too
> much.
>
> Otherwise, patches 16-38 are:
>
> Reviewed-by: Iago Toral Quiroga <itoral at igalia.com>
>
> (patch 15 already has your Rb)
>
> Iago
>
> On Thu, 2017-10-26 at 14:15 +0200, Iago Toral wrote:
> > I left a few minor comments in patches 1, 2, 8 and 14. Otherwise
> > patches 1-2, 4-5 and 7-14 (3 and 6 already have Rb) are:
> >
> > Reviewed-by: Iago Toral Quiroga <itoral at igalia.com>
> >
> > I feel like patches 10 and 11 could maybe use an extra review if
> > there is someone who wants to do it, since I am not very familiar
> > with how all the indirect addressing stuff works and the hardware
> > restrictions that affect it.
> >
> > Patch 14 looks good, although the part where you locate the DO block
> > for a matching WHILE could probably use the review of someone else
> > more familiar with the CFG code than me.
> >
> > Iago
> >
> > On Wed, 2017-10-25 at 16:25 -0700, Jason Ekstrand wrote:
> > > > This series is a third respin of my subgroups prerequisites series
> > > > that I sent out a few weeks ago. Not a whole lot has changed but
> > > > there are some new patches. Primarily,
> > >
> > > > 1) Some patches which were reviewed by Matt and Lionel were pushed
> > > >    and are no longer in the series. Thanks guys!
> > > >
> > > > 2) I've applied R-B tags from various people for patches which are
> > > >    reviewed but depend on still unreviewed patches.
> > > >
> > > > 3) A few patches to fix little-core. In particular, the extra
> > > >    little-core EU restrictions cause problems for BROADCAST,
> > > >    MOV_INDIRECT, and integer MUL.
> > >
> > > > This series can be found on fd.o here:
> > >
> > > > https://cgit.freedesktop.org/~jekstrand/mesa/log/?h=review/subgroup-prereqs-v3
> > >
> > > Happy reviewing!
> > >
> > >
> > > Cc: Matt Turner <mattst88 at gmail.com>
> > > Cc: Francisco Jerez <currojerez at riseup.net>
> > > Cc: Connor Abbott <cwabbott0 at gmail.com>
> > >
> > > Francisco Jerez (1):
> > > > intel/fs: Restrict live intervals to the subset possibly reachable from any definition.
> > >
> > > Jason Ekstrand (47):
> > > intel/fs: Pass builders instead of blocks into emit_[un]zip
> > > intel/fs: Be more explicit about our placement of [un]zip
> > > intel/fs: Use ANY/ALL32 predicates in SIMD32
> > > intel/fs: Don't stomp f0.1 in SIMD16 ballot
> > > intel/fs: Use an explicit D type for vote any/all/eq intrinsics
> > > intel/fs: Use a pair of 1-wide MOVs instead of SEL for any/all
> > > > intel/compiler: Add some restrictions to MOV_INDIRECT and BROADCAST
> > > intel/eu: Just modify the offset in brw_broadcast
> > > intel/eu/reg: Add a subscript() helper
> > > > intel/eu: Fix broadcast instruction for 64-bit values on little-core
> > > intel/fs: Fix MOV_INDIRECT for 64-bit values on little-core
> > > > intel/fs: Fix integer multiplication lowering for src/dst hazards
> > > > intel/fs: Use the original destination region for int MUL lowering
> > > i965/fs: Extend the live ranges of VGRFs which leave loops
> > > i965/fs/nir: Simplify 64-bit store_output
> > > > i965/fs: Return a fs_reg from shuffle_64bit_data_for_32bit_write
> > > i965/fs/nir: Minor refactor of store_output
> > > i965/fs/nir: Don't stomp 64-bit values to D in get_nir_src
> > > intel/fs: Protect opt_algebraic from OOB BROADCAST indices
> > > intel/fs: Uniformize the index in readInvocation
> > > intel/fs: Retype dest to match value in read[First]Invocation
> > > > intel/fs: Assign constant locations if they haven't been assigned
> > > intel/fs: Remove min_dispatch_width from fs_visitor
> > > intel/cs: Drop max_dispatch_width checks from compile_cs
> > > intel/cs: Stop setting dispatch_grf_start_reg
> > > intel/cs: Ignore runtime_check_aads_emit for CS
> > > intel/fs: Mark 64-bit values as being contiguous
> > > intel/cs: Rework the way thread local ID is handled
> > > intel/cs: Re-run final NIR optimizations for each SIMD size
> > > intel/cs: Re-run final NIR optimizations for each SIMD size
> > > intel/cs: Push subgroup ID instead of base thread ID
> > > intel/compiler/fs: Set up subgroup invocation as a system value
> > > intel/fs: Rework zero-length URB write handling
> > > intel/eu: Make automatic exec sizes a configurable option
> > > intel/eu: Explicitly set EXECUTE_1 where needed
> > > intel/fs: Explicitly set EXECUTE_1 where needed
> > > intel/fs: Don't use automatic exec size inference
> > > nir: Add a new subgroups lowering pass
> > > nir: Add a ssa_dest_init_for_type helper
> > > nir: Make ballot intrinsics variable-size
> > > nir/lower_system_values: Lower SUBGROUP_*_MASK based on type
> > > > nir/lower_subgroups: Lower ballot intrinsics to the specified bit size
> > > nir,intel/compiler: Use a fixed subgroup size
> > > spirv: Add a vtn_constant_value helper
> > > spirv: Rework barriers
> > > nir: Validate base types on array dereferences
> > > compiler/nir_types: Handle vectors in glsl_get_array_element
> > >
> > > > src/compiler/Makefile.sources                    |    2 +-
> > > > src/compiler/glsl/glsl_to_nir.cpp                |    1 +
> > > > src/compiler/nir/nir.h                           |   25 +-
> > > > src/compiler/nir/nir_intrinsics.h                |   13 +-
> > > > .../nir/nir_lower_read_invocation_to_scalar.c    |  112 ---------
> > > > src/compiler/nir/nir_lower_subgroups.c           |  257 ++++++++++++++++++++
> > > > src/compiler/nir/nir_lower_system_values.c       |    4 +-
> > > > src/compiler/nir/nir_opt_intrinsics.c            |   69 +-----
> > > > src/compiler/nir/nir_validate.c                  |   18 +-
> > > > src/compiler/nir_types.cpp                       |    2 +
> > > > src/compiler/spirv/spirv_to_nir.c                |  132 ++++++++--
> > > > src/compiler/spirv/vtn_private.h                 |    6 +
> > > > src/intel/compiler/brw_compiler.c                |    4 -
> > > > src/intel/compiler/brw_compiler.h                |    3 +-
> > > > src/intel/compiler/brw_eu.c                      |    1 +
> > > > src/intel/compiler/brw_eu.h                      |   10 +
> > > > src/intel/compiler/brw_eu_emit.c                 |   90 +++++--
> > > > src/intel/compiler/brw_fs.cpp                    |  268 ++++++++++++---------
> > > > src/intel/compiler/brw_fs.h                      |   15 +-
> > > > src/intel/compiler/brw_fs_generator.cpp          |   90 ++++---
> > > > src/intel/compiler/brw_fs_live_variables.cpp     |   89 ++++++-
> > > > src/intel/compiler/brw_fs_live_variables.h       |   12 +
> > > > src/intel/compiler/brw_fs_nir.cpp                |  262 ++++++++++++--------
> > > > src/intel/compiler/brw_fs_visitor.cpp            |   78 +++---
> > > > src/intel/compiler/brw_nir.c                     |   11 +-
> > > > src/intel/compiler/brw_nir.h                     |    2 +-
> > > > src/intel/compiler/brw_nir_lower_cs_intrinsics.c |   56 ++---
> > > > src/intel/compiler/brw_reg.h                     |   16 ++
> > > > src/intel/compiler/brw_shader.cpp                |    2 +
> > > > src/intel/vulkan/anv_cmd_buffer.c                |    6 +-
> > > > src/mesa/drivers/dri/i965/gen6_constant_state.c  |    6 +-
> > > > 31 files changed, 1076 insertions(+), 586 deletions(-)
> > > > delete mode 100644 src/compiler/nir/nir_lower_read_invocation_to_scalar.c
> > > > create mode 100644 src/compiler/nir/nir_lower_subgroups.c