[Mesa-dev] [PATCH 000/123] Reintroducing NIR, a new IR for mesa

Tue Dec 16 21:26:19 PST 2014

Oh, I forgot... patch 20 should probably get cherry-picked as well.

On Tue, Dec 16, 2014 at 6:01 PM, Jason Ekstrand <jason at jlekstrand.net> wrote:
>
>
> On Tue, Dec 16, 2014 at 2:52 PM, Connor Abbott <cwabbott0 at gmail.com> wrote:
>>
>> Hi,
>>
>> On Tue, Dec 16, 2014 at 1:04 AM, Jason Ekstrand <jason at jlekstrand.net> wrote:
>> > NIR (pronounced "ner") is a new IR (intermediate representation) for the
>> > Mesa shader compiler that will sit between the old IR (GLSL IR) and the
>> > back-end compilers.  The primary purpose of NIR is to make optimizations
>> > more efficient and to generate better code for the back-ends.  We have a
>> > lot of optimizations implemented in GLSL IR right now.  However, they
>> > still generate fairly bad code, primarily because its tree-based
>> > structure makes writing good optimizations difficult.  For this reason,
>> > we have implemented a lot of optimizations in the i965 back-end
>> > compilers just to fix up the code we get from GLSL IR.  The "proper fix"
>> > for this is to implement a better high-level IR; enter NIR.
>> >
>> > Most of the initial work on NIR, including setting up common data
>> > structures, helper methods, and a few basic passes, was done by Connor
>> > Abbott, who interned with us over the summer.  Connor did a fantastic
>> > job, but there is still a lot left to be done.  I've spent the last two
>> > months trying to fill in the pieces that we need in order to get NIR off
>> > the ground.  At this point, we have competent in- and out-of-SSA passes,
>> > are at zero piglit regressions for i965 SIMD8 fragment shaders, and the
>> > shader-db numbers aren't terrible.
>> >
>> > This is still a bit experimental.  I have been testing only on HSW, but
>> > it should work OK on SNB and later.  Eventually, once we get booleans
>> > fixed up, it should work fine on older chips as well.  It also doesn't
>> > yet support SIMD16, so performance won't be that great.  That said, I
>> > think we are at the point now where we should try to land this so I can
>> > stop developing in my massive private branch.  Since this isn't quite
>> > ready for prime time yet, using it requires setting the INTEL_USE_NIR
>> > environment variable.
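
For readers unfamiliar with this kind of opt-in switch, here is a minimal sketch of how a code path can be gated on an environment variable. It is not taken from the series: the helper names `flag_value_enabled` and `use_nir_from_env` are hypothetical, and the policy of treating an unset, empty, or "0" value as disabled is an assumption. The actual driver may interpret INTEL_USE_NIR differently.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical policy (an assumption, not the driver's documented behavior):
 * a flag counts as enabled when it is set to anything other than the empty
 * string or the single character "0". */
static bool
flag_value_enabled(const char *value)
{
   if (value == NULL || value[0] == '\0')
      return false;
   return !(value[0] == '0' && value[1] == '\0');
}

/* Hypothetical wrapper: look the switch up in the process environment. */
static bool
use_nir_from_env(void)
{
   return flag_value_enabled(getenv("INTEL_USE_NIR"));
}
```

Under these assumptions, launching a program with INTEL_USE_NIR=1 in its environment would select the experimental path, and leaving the variable unset would keep the existing one.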
>> >
>> > A few key points about NIR:
>> >
>> >  1. It is primarily an SSA-based IR.
>> >  2. It supports source/destination-modifiers and swizzles/*write-masks.
>> >  3. Standard GPU operations such as sin() and fmad() are first-class ALU
>> >     operations, not intrinsics.
>> >  4. GLSL concepts like inputs, outputs, uniforms, etc. are built into
>> >     the IR so we can do proper analysis on them.
>> >  5. Even though it's SSA, it still has a concept of registers and
>> >     write-masks in the core IR data structures.  This means we can
>> >     generate code that is much closer to what backends want.
>> >  6. Control flow is structured explicitly in the IR.
>> >
>> > (*write-masks are not available for SSA values)
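
To illustrate why an SSA form (point 1) tends to make passes simple, here is a small sketch of a liveness sweep over straight-line SSA code. None of these types or names come from NIR; `op_sketch`, `instr_sketch`, and `mark_live` are hypothetical. The sketch assumes no control flow and no phi nodes, which a real pass has to handle.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical opcodes for the sketch; not NIR opcodes. */
enum op_sketch { OP_CONST, OP_ADD, OP_MUL };

/* One SSA instruction.  The value it defines is identified by the
 * instruction's index in the program array, so each value has exactly
 * one definition. */
struct instr_sketch {
   enum op_sketch op;
   int src[2];   /* indices of earlier instructions; unused for OP_CONST */
   bool live;
};

/* Mark every instruction needed to compute `root` and return how many
 * are live.  Because each source refers to an earlier definition, a
 * single backward sweep is enough for straight-line code. */
static int
mark_live(struct instr_sketch *prog, int count, int root)
{
   assert(root >= 0 && root < count);

   for (int i = 0; i < count; i++)
      prog[i].live = false;
   prog[root].live = true;

   int live_count = 0;
   for (int i = count - 1; i >= 0; i--) {
      if (!prog[i].live)
         continue;
      live_count++;
      if (prog[i].op == OP_CONST)
         continue;
      for (int s = 0; s < 2; s++) {
         /* SSA invariant assumed by the sketch: defs precede uses. */
         assert(prog[i].src[s] >= 0 && prog[i].src[s] < i);
         prog[prog[i].src[s]].live = true;
      }
   }
   return live_count;
}
```

Anything left unmarked after the sweep is dead and could be removed. In a tree-based IR the same question requires tracking assignments to variables across expression trees, which is part of what makes such passes harder to write.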
>> >
>> > While source/destination modifiers and writemasks/swizzles are not
>> > particularly useful for optimizations, having them represented in the IR
>> > gives us the ability to generate more useful code for backends.
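
For anyone new to these terms, the following sketch shows what a swizzle and a write-mask mean for a 4-component vector. It is only an illustration of the semantics: `vec4_sketch`, `apply_swizzle`, and `masked_write` are hypothetical names, and the bit layout of the mask (bit 0 = x) is an assumption rather than NIR's actual encoding.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative 4-component vector; not a NIR type. */
typedef struct {
   float c[4];
} vec4_sketch;

/* Read `src` through a swizzle: result component i is taken from source
 * component swizzle[i], so {3, 2, 1, 0} reads the vector as .wzyx. */
static vec4_sketch
apply_swizzle(vec4_sketch src, const uint8_t swizzle[4])
{
   vec4_sketch out;
   for (int i = 0; i < 4; i++) {
      assert(swizzle[i] < 4);
      out.c[i] = src.c[swizzle[i]];
   }
   return out;
}

/* Write `value` into `dest`, touching only the components whose bit is
 * set in write_mask (assumed layout: bit 0 = x, bit 1 = y, ...). */
static void
masked_write(vec4_sketch *dest, vec4_sketch value, unsigned write_mask)
{
   for (int i = 0; i < 4; i++) {
      if (write_mask & (1u << i))
         dest->c[i] = value.c[i];
   }
}
```

A backend whose hardware instructions carry swizzles and write-masks directly can consume an IR that represents them without having to rediscover them from separate component-wise moves.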
>> >
>> > A few notes about review:
>> >
>> >  1. For those of you who aren't interested in the general compiler, I'm
>> >     sorry for the patch-bomb.  However, several people have requested
>> >     that we maintain the history of the NIR development since Connor's
>> >     original drop at the end of the summer.  Therefore, while I've
>> >     squashed several things, I've tried to leave the diff of what I've
>> >     done more-or-less preserved.
>> >
>> >  2. No, this is not LLVM.  There was a long-winded discussion about that
>> >     when Connor dropped his patches that went a whole lot of nowhere as
>> >     usual.  I would really prefer if we left that debate alone.  If
>> >     there must be bikeshedding on the topic, please do so on the
>> >     cover-letter e-mail.
>> >
>> >  3. Please keep all bikeshedding about C++, typedefs, etc.  on the core
>> >     datastructures e-mail.  If we need, we can split that off in its own
>> >     thread.
>> >
>> >  4. While I welcome review, I don't plan to make non-trivial changes to
>> >     specific patches or squash anything beyond what has already been
>> >     squashed.  I've tried thus far to more-or-less keep the history and
>> >     I'd like to continue this if we can.
>>
>> I know you've said this, but I think there might still be some benefit
>> from re-arranging a few things. In particular, I think patches 21, 36,
>> 39, 59, and 65 should probably get put first so that we can push them
>> + patch 1 right away (with appropriate review), since they're not
>> NIR-specific. I've reviewed the ones I feel qualified to review (and
>> that I didn't write!) to help with this. I know I got feedback on a
>> few of those prep patches that we should wait to commit them until the
>> things they introduce have users, but I think that since there are now
>> patches in the list and we want to land them soon-ish it might be a
>> good idea to commit them earlier in order to reduce the size of this
>> patch-bomb :) Feel free to disagree, though...
>
>
> I'm totally OK with cherry-picking and pushing those early.  What I don't
> want is a bunch of "patch 34 and 76 should get squashed except for this one
> hunk which should go in 52".  Unless, of course, I really did make a
> nonsense rebasing error.  Splitting patches would probably be ok if needed
> though.
>
>> Connor
>>
>> >
>> >  5. Eric Anholt has also written NIR -> TGSI -> NIR passes which will
>> >     hopefully get landed soon after NIR initially lands.  Exactly how
>> >     that all gets hooked up for other gallium drivers beyond vc4 is
>> >     outside the scope of this series.
>> >
>> > I have pushed a branch to my personal freedesktop.org account.  For
>> > certain types of review, it may be easier to look at the end result
>> > rather than the patches.  The branch can be found via freedesktop cgit
>> > here:
>> >
>> > http://cgit.freedesktop.org/~jekstrand/mesa/log/?h=review/nir-v1
>> >
>> > Last week, I did a presentation for some of the other Intel people to
>> > try and help bring them up to speed on NIR concepts quickly.  As part of
>> > this, I typed up a bunch of notes that provide a decent overview of a
>> > lot of NIR concepts.  Those notes can be found here:
>> >
>> > http://www.jlekstrand.net/jason/projects/mesa/nir-notes/
>> >
>> > Happy reviewing!
>> >
>> > P.S. Connor, don't do too much reviewing before your finals are done. :-P
>> >
>> > Connor Abbott (22):
>> >   exec_list: add a list_foreach_typed_reverse() macro
>> >   nir: add initial README
>> >   nir: add a simple C wrapper around glsl_types.h
>> >   nir: add the core datastructures
>> >   nir: add core helper functions
>> >   nir: add a printer
>> >   nir: add a validation pass
>> >   nir: add a glsl-to-nir pass
>> >   nir: add a pass to lower variables for scalar backends
>> >   nir: keep track of the number of input, output, and uniform slots
>> >   nir: add a pass to remove unused variables
>> >   nir: add a pass to lower sampler instructions
>> >   nir: add a pass to lower system value reads
>> >   nir: add a pass to lower atomics
>> >   nir: add an optimization to turn global registers into local registers
>> >   nir: calculate dominance information
>> >   nir: add a pass to convert to SSA
>> >   nir: add an SSA-based copy propagation pass
>> >   nir: add an SSA-based dead code elimination pass
>> >   i965/fs: make emit_fragcoord_interpolation() more general
>> >   i965/fs: Don't pass through the coordinate type
>> >   i965/fs: add a NIR frontend
>> >
>> > Jason Ekstrand (101):
>> >   i965/fs: Only use nir for 8-wide non-fast-clear shaders.
>> >   i965/fs_nir: Make the sampler register always unsigned
>> >   i965/fs_nir: Use the correct types for texture inputs
>> >   i965/fs_nir: Use the correct texture offset immediate
>> >   Fix what I think are a few NIR typos
>> >   Fix up varying pull constants
>> >   i965/fs_nir: Add support for sample_pos and sample_id
>> >   nir/glsl: Add support for saturate
>> >   nir: Add fine and coarse derivative opcodes
>> >   nir/glsl: Add support for coarse and fine derivatives
>> >   i965/fs_nir: Handle coarse/fine derivatives
>> >   nir/lower_atomics: Multiply array offsets by ATOMIC_COUNTER_SIZE
>> >   i965/fs_nir: Add atomic counters support
>> >   i965/fs: Allow reinterpretation in constant propagation
>> >   nir: Add NIR_TRUE and NIR_FALSE constants and use them for boolean
>> >     immediates
>> >   nir: Add intrinsics to do alternate interpolation on inputs
>> >   i965/fs: Don't take an ir_variable for emit_general_interpolation
>> >   i965/fs_nir: Don't duplicate emit_general_interpolation
>> >   nir: Add a naieve from-SSA pass
>> >   nir: Add a lower_vec_to_movs pass
>> >   i965/fs_nir: Convert the shader to/from SSA
>> >   nir/lower_variables_scalar: Silence a compiler warning
>> >   nir: Add a basic metadata management system
>> >   nir: Add an assert
>> >   nir/foreach_block: Return false if the callback on the last block
>> >     fails
>> >   nir: Add a foreach_block_reverse function
>> >   nir: Add a function to detect if a block is immediately followed by an
>> >     if
>> >   nir: Make the nir_index_* functions return the nuber of items
>> >   nir: Add an SSA-based liveness analysis pass.
>> >   nir: Add an initialization function for SSA definitions
>> >   nir: Automatically handle SSA uses when an instruction is inserted
>> >   nir: Add a function for rewriting all the uses of a SSA def
>> >   nir: Add a parallel copy instruction type
>> >   nir: Add a function for comparing two sources
>> >   nir: Add a better out-of-SSA pass
>> >   i965/fs_nir: Do retyping for ALU srouces in get_nir_alu_src
>> >   glsl/list: Fix the exec_list_validate function
>> >   nir: Validate all lists in the validator
>> >   nir/print: Don't reindex things
>> >   nir: Differentiate between signed and unsigned versions of find_msb
>> >   i965/fs_nir: Validate optimization passes
>> >   nir/nir: Fix a bug in move_successors
>> >   glsl/list: Add a foreach_list_typed_safe_reverse macro
>> >   nir/nir: Use safe iterators when iterating over the CFG
>> >   nir/nir: Patch up phi predecessors in move_successors
>> >   nir: Add a peephole select optimization
>> >   i965/fs_nir: Turn on the peephole select optimization
>> >   nir: Validate that the SSA def and register indices are unique
>> >   nir: Add a fused multiply-add peephole
>> >   nir: Add a basic CSE pass
>> >   i965/fs_nir: Add the CSE pass and actually run in a loop
>> >   i965/fs_nir: Use an array rather than a hash table for register lookup
>> >   i965/fs_nir: Handle SSA constants
>> >   i965/fs_nir: Properly saturate multiplies
>> >   nir: Add a helper for rewriting an instruction source
>> >   nir/lower_samplers: Use the nir_instr_rewrite_src function
>> >   nir: Clean up nir_deref helper functions
>> >   nir: Make array deref direct vs. indirect an enum
>> >   nir: Add a concept of a wildcard array dereference
>> >   nir: Use an integer index for specifying structure fields
>> >   nir: Don't require a function in ssa_def_init
>> >   nir/copy_propagate: Don't cause size mismatches on phi node sources
>> >   nir: Validate that the sources of a phi have the same size as the
>> >     destination
>> >   nir/glsl: Don't allocate a state_slots array for 0 state slots
>> >   i965/fs_nir: Don't dump the shader.
>> >   nir: Use the enum for the variable mode
>> >   nir: Automatically update SSA if uses
>> >   nir: Add a copy splitting pass
>> >   nir: Add a pass to lower local variable accesses to SSA values
>> >   nir: Add a pass to lower local variables to registers
>> >   nir: Add a pass for lowering input/output loads/stores
>> >   nir: Add a pass to lower global variables to local variables
>> >   nir/glsl: Generate SSA NIR
>> >   i965/fs_nir: Use the new variable lowering code
>> >   nir/validate: Ensure that outputs are write-only and inputs are
>> >     read-only
>> >   nir: Remove the old variable lowering code
>> >   nir: Vectorize intrinsics
>> >   nir/validate: Validate intrinsic source/destination sizes
>> >   nir: Add gpu_shader5 interpolation intrinsics
>> >   nir/glsl: Add support for gpu_shader5 interpolation instrinsics
>> >   nir: Add a helper for getting a constant value from an SSA source
>> >   i965/fs_nir: Add a has_indirect flag and clean up some of the
>> >     input/output code
>> >   i965/fs_nir: Implement the ARB_gpu_shader5 interpolation intrinsics
>> >   nir: Add neg, abs, and sat opcodes
>> >   nir: Add a lowering pass for adding source modifiers where possible
>> >   nir: Make the type casting operations static inline functions
>> >   nir/glsl: Emit abs, neg, and sat operations instead of source
>> >     modifiers
>> >   nir: Add an expression matching framework
>> >   nir: Add infastructure for generating algebraic transformation passes
>> >   nir: Add an algebraic optimization pass
>> >   nir: Add a basic constant folding pass
>> >   nir: Remove the ffma peephole
>> >   nir: Make texture instruction names more consistent
>> >   nir: Constant fold array indirects
>> >   nir: Use a source for uniform buffer indices instead of an index
>> >   nir: Add a sampler index indirect to nir_tex_instr
>> >   nir: Rework the way samplers are lowered
>> >   i965/fs_nir: Add support for indirect texture arrays
>> >   nir/metadata: Rename metadata_dirty to metadata_preserve
>> >   nir: Call nir_metadata_preserve more places
>> >   nir: Make bcsel a fully vector operation
>> >
>> >  src/glsl/Makefile.am                               |   10 +-
>> >  src/glsl/Makefile.sources                          |   39 +-
>> >  src/glsl/list.h                                    |   19 +-
>> >  src/glsl/nir/README                                |  118 ++
>> >  src/glsl/nir/glsl_to_nir.cpp                       | 1825 +++++++++++++++++
>> >  src/glsl/nir/glsl_to_nir.h                         |   40 +
>> >  src/glsl/nir/nir.c                                 | 2042 ++++++++++++++++++++
>> >  src/glsl/nir/nir.h                                 | 1433 ++++++++++++++
>> >  src/glsl/nir/nir_algebraic.py                      |  249 +++
>> >  src/glsl/nir/nir_dominance.c                       |  298 +++
>> >  src/glsl/nir/nir_from_ssa.c                        |  859 ++++++++
>> >  src/glsl/nir/nir_intrinsics.c                      |   49 +
>> >  src/glsl/nir/nir_intrinsics.h                      |  140 ++
>> >  src/glsl/nir/nir_live_variables.c                  |  282 +++
>> >  src/glsl/nir/nir_lower_atomics.c                   |  146 ++
>> >  src/glsl/nir/nir_lower_global_vars_to_local.c      |  107 +
>> >  src/glsl/nir/nir_lower_io.c                        |  324 ++++
>> >  src/glsl/nir/nir_lower_locals_to_regs.c            |  308 +++
>> >  src/glsl/nir/nir_lower_samplers.cpp                |  181 ++
>> >  src/glsl/nir/nir_lower_system_values.c             |  107 +
>> >  src/glsl/nir/nir_lower_to_source_mods.c            |  181 ++
>> >  src/glsl/nir/nir_lower_variables.c                 | 1046 ++++++++++
>> >  src/glsl/nir/nir_lower_vec_to_movs.c               |   96 +
>> >  src/glsl/nir/nir_metadata.c                        |   54 +
>> >  src/glsl/nir/nir_opcodes.c                         |   46 +
>> >  src/glsl/nir/nir_opcodes.h                         |  356 ++++
>> >  src/glsl/nir/nir_opt_algebraic.py                  |   67 +
>> >  src/glsl/nir/nir_opt_constant_folding.c            |  355 ++++
>> >  src/glsl/nir/nir_opt_copy_propagate.c              |  325 ++++
>> >  src/glsl/nir/nir_opt_cse.c                         |  269 +++
>> >  src/glsl/nir/nir_opt_dce.c                         |  186 ++
>> >  src/glsl/nir/nir_opt_global_to_local.c             |  103 +
>> >  src/glsl/nir/nir_opt_peephole_select.c             |  214 ++
>> >  src/glsl/nir/nir_print.c                           |  948 +++++++++
>> >  src/glsl/nir/nir_remove_dead_variables.c           |  138 ++
>> >  src/glsl/nir/nir_search.c                          |  337 ++++
>> >  src/glsl/nir/nir_search.h                          |   80 +
>> >  src/glsl/nir/nir_split_var_copies.c                |  225 +++
>> >  src/glsl/nir/nir_to_ssa.c                          |  660 +++++++
>> >  src/glsl/nir/nir_types.cpp                         |  143 ++
>> >  src/glsl/nir/nir_types.h                           |   75 +
>> >  src/glsl/nir/nir_validate.c                        |  912 +++++++++
>> >  src/mesa/drivers/dri/i965/Makefile.sources         |    1 +
>> >  src/mesa/drivers/dri/i965/brw_fs.cpp               |   74 +-
>> >  src/mesa/drivers/dri/i965/brw_fs.h                 |   57 +-
>> >  .../drivers/dri/i965/brw_fs_copy_propagation.cpp   |    4 +-
>> >  src/mesa/drivers/dri/i965/brw_fs_fp.cpp            |   32 +-
>> >  src/mesa/drivers/dri/i965/brw_fs_nir.cpp           | 1778 +++++++++++++++++
>> >  src/mesa/drivers/dri/i965/brw_fs_visitor.cpp       |   39 +-
>> >  src/mesa/main/bitset.h                             |    1 +
>> >  50 files changed, 17301 insertions(+), 77 deletions(-)
>> >  create mode 100644 src/glsl/nir/README
>> >  create mode 100644 src/glsl/nir/glsl_to_nir.cpp
>> >  create mode 100644 src/glsl/nir/glsl_to_nir.h
>> >  create mode 100644 src/glsl/nir/nir.c
>> >  create mode 100644 src/glsl/nir/nir.h
>> >  create mode 100644 src/glsl/nir/nir_algebraic.py
>> >  create mode 100644 src/glsl/nir/nir_dominance.c
>> >  create mode 100644 src/glsl/nir/nir_from_ssa.c
>> >  create mode 100644 src/glsl/nir/nir_intrinsics.c
>> >  create mode 100644 src/glsl/nir/nir_intrinsics.h
>> >  create mode 100644 src/glsl/nir/nir_live_variables.c
>> >  create mode 100644 src/glsl/nir/nir_lower_atomics.c
>> >  create mode 100644 src/glsl/nir/nir_lower_global_vars_to_local.c
>> >  create mode 100644 src/glsl/nir/nir_lower_io.c
>> >  create mode 100644 src/glsl/nir/nir_lower_locals_to_regs.c
>> >  create mode 100644 src/glsl/nir/nir_lower_samplers.cpp
>> >  create mode 100644 src/glsl/nir/nir_lower_system_values.c
>> >  create mode 100644 src/glsl/nir/nir_lower_to_source_mods.c
>> >  create mode 100644 src/glsl/nir/nir_lower_variables.c
>> >  create mode 100644 src/glsl/nir/nir_lower_vec_to_movs.c
>> >  create mode 100644 src/glsl/nir/nir_metadata.c
>> >  create mode 100644 src/glsl/nir/nir_opcodes.c
>> >  create mode 100644 src/glsl/nir/nir_opcodes.h
>> >  create mode 100644 src/glsl/nir/nir_opt_algebraic.py
>> >  create mode 100644 src/glsl/nir/nir_opt_constant_folding.c
>> >  create mode 100644 src/glsl/nir/nir_opt_copy_propagate.c
>> >  create mode 100644 src/glsl/nir/nir_opt_cse.c
>> >  create mode 100644 src/glsl/nir/nir_opt_dce.c
>> >  create mode 100644 src/glsl/nir/nir_opt_global_to_local.c
>> >  create mode 100644 src/glsl/nir/nir_opt_peephole_select.c
>> >  create mode 100644 src/glsl/nir/nir_print.c
>> >  create mode 100644 src/glsl/nir/nir_remove_dead_variables.c
>> >  create mode 100644 src/glsl/nir/nir_search.c
>> >  create mode 100644 src/glsl/nir/nir_search.h
>> >  create mode 100644 src/glsl/nir/nir_split_var_copies.c
>> >  create mode 100644 src/glsl/nir/nir_to_ssa.c
>> >  create mode 100644 src/glsl/nir/nir_types.cpp
>> >  create mode 100644 src/glsl/nir/nir_types.h
>> >  create mode 100644 src/glsl/nir/nir_validate.c
>> >  create mode 100644 src/mesa/drivers/dri/i965/brw_fs_nir.cpp
>> >
>> > --
>> > 2.2.0
>> >
>> > _______________________________________________
>> > mesa-dev mailing list
>> > mesa-dev at lists.freedesktop.org
>> > http://lists.freedesktop.org/mailman/listinfo/mesa-dev