[Mesa-dev] [PATCH 000/123] Reintroducing NIR, a new IR for mesa
Jason Ekstrand
jason at jlekstrand.net
Tue Dec 16 15:01:02 PST 2014
On Tue, Dec 16, 2014 at 2:52 PM, Connor Abbott <cwabbott0 at gmail.com> wrote:
>
> Hi,
>
> On Tue, Dec 16, 2014 at 1:04 AM, Jason Ekstrand <jason at jlekstrand.net>
> wrote:
> > NIR (pronounced "ner") is a new IR (intermediate representation) for the
> > Mesa shader compiler that will sit between the old IR (GLSL IR) and the
> > back-end compilers. The primary purpose of NIR is to be more efficient for
> > doing optimizations and to generate better code for the back-ends. We have
> > a lot of optimizations implemented in GLSL IR right now. However, they
> > still generate fairly bad code, primarily because its tree-based structure
> > makes writing good optimizations difficult. For this reason, we have
> > implemented a lot of optimizations in the i965 back-end compilers just to
> > fix up the code we get from GLSL IR. The "proper fix" to this is to
> > implement a better high-level IR; enter NIR.
> >
> > Most of the initial work on NIR, including setting up common data
> > structures, helper methods, and a few basic passes, was by Connor Abbott,
> > who interned with us over the summer. Connor did a fantastic job, but
> > there is still a lot left to be done. I've spent the last two months
> > trying to fill in the pieces that we need in order to get NIR off the
> > ground. At this point, we now have competent in and out of SSA passes,
> > are at zero piglit regressions for i965 SIMD8 fragment shaders, and the
> > shader-db numbers aren't terrible.
> >
> > This is still a bit experimental. I have been testing only on HSW but it
> > should work ok on SNB and later. Eventually, once we get booleans fixed
> > up, it should work fine on older chips as well. It also doesn't yet
> > support SIMD16, so performance won't be that great. That said, I think we
> > are at the point now where we should try and land this and I can stop
> > developing in my massive private branch. Since this isn't quite ready for
> > prime-time yet, using it requires setting the INTEL_USE_NIR environment
> > variable.
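
An opt-in gate of this kind can be sketched in a few lines of C. This is only an illustration: `example_use_nir()` is a hypothetical helper, and treating the mere presence of the variable (with any value) as "enabled" is an assumption, since the cover letter does not say which values are accepted.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper, not taken from the series: report whether the
 * experimental path was requested via the environment.
 * Assumption: any value counts, only presence of the variable matters. */
static bool example_use_nir(void)
{
    return getenv("INTEL_USE_NIR") != NULL;
}
```

Under that assumption, something like `INTEL_USE_NIR=1 ./some_gl_app` would select the new path for a single run, and leaving the variable unset keeps the existing path.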
> >
> > A few key points about NIR:
> >
> > 1. It is primarily an SSA-based IR.
> > 2. It supports source/destination-modifiers and swizzles/*write-masks.
> > 3. Standard GPU operations such as sin() and fmad() are first-class ALU
> > operations, not intrinsics.
> > 4. GLSL concepts like inputs, outputs, uniforms, etc. are built into the
> > IR so we can do proper analysis on them.
> > 5. Even though it's SSA, it still has a concept of registers and
> > write-masks in the core IR data structures. This means we can generate
> > code that is much closer to what backends want.
> > 6. Control flow is structured explicitly in the IR.
> >
> > (*write-masks are not available for SSA values)
> >
> > While source/destination modifiers and writemasks/swizzles are not
> > particularly useful for optimizations, having them represented in the IR
> > gives us the ability to generate more useful code for backends.
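
To make the terms in points 2 and 5 concrete, here is a minimal sketch of what swizzles, source modifiers, and write-masks mean for a four-channel add. The types and names (`example_src`, `example_dest`, `example_fadd`) are invented for illustration and are not the structures in nir.h; applying abs before negate is also an assumption. As noted above, write-masks only apply to register destinations, not SSA values.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical types for illustration only; not NIR's actual structs. */
typedef struct {
    float value[4];      /* the four channels being read */
    uint8_t swizzle[4];  /* swizzle[i] picks the source channel for lane i */
    bool negate;         /* source modifier: negate the value read */
    bool abs;            /* source modifier: take the absolute value */
} example_src;

typedef struct {
    float value[4];      /* register contents */
    uint8_t write_mask;  /* bit i set => channel i is overwritten */
    bool saturate;       /* destination modifier: clamp result to [0, 1] */
} example_dest;

/* Read lane i of a source, applying swizzle, then abs, then negate. */
static float example_read(const example_src *src, unsigned i)
{
    assert(i < 4 && src->swizzle[i] < 4);
    float v = src->value[src->swizzle[i]];
    if (src->abs && v < 0.0f)
        v = -v;
    if (src->negate)
        v = -v;
    return v;
}

/* Per-channel add that only touches channels enabled in the write-mask. */
static void example_fadd(example_dest *dest,
                         const example_src *a, const example_src *b)
{
    for (unsigned i = 0; i < 4; i++) {
        if (!(dest->write_mask & (1u << i)))
            continue; /* masked-off channel keeps its old value */
        float r = example_read(a, i) + example_read(b, i);
        if (dest->saturate)
            r = r < 0.0f ? 0.0f : (r > 1.0f ? 1.0f : r);
        dest->value[i] = r;
    }
}
```

With a reversed swizzle on the first source and a write-mask of 0x5, only channels 0 and 2 of the destination change, and each reads from the opposite end of the first source. Hardware that has these features natively can consume such an instruction directly, which is the point of carrying them in the IR.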
> >
> > A few notes about review:
> >
> > 1. For those of you who aren't interested in the general compiler, I'm
> > sorry for the patch-bomb. However, several people have requested that
> > we maintain the history of the NIR development since Connor's original
> > drop at the end of the summer. Therefore, while I've squashed several
> > things, I've tried to leave the diff of what I've done more-or-less
> > preserved.
> >
> > 2. No, this is not LLVM. There was a long-winded discussion about that
> > when Connor dropped his patches that went a whole lot of nowhere as
> > usual. I would really prefer if we left that debate alone. If there
> > must be bikeshedding on the topic, please do so on the cover-letter
> > e-mail.
> >
> > 3. Please keep all bikeshedding about C++, typedefs, etc. on the core
> > datastructures e-mail. If we need, we can split that off in its own
> > thread.
> >
> > 4. While I welcome review, I don't plan to make non-trivial changes to
> > specific patches or squash anything beyond what has already been
> > squashed. I've tried thus far to more-or-less keep the history and I'd
> > like to continue this if we can.
>
> I know you've said this, but I think there might still be some benefit
> from re-arranging a few things. In particular, I think patches 21, 36,
> 39, 59, and 65 should probably get put first so that we can push them
> + patch 1 right away (with appropriate review), since they're not
> NIR-specific. I've reviewed the ones I feel qualified to review (and
> that I didn't write!) to help with this. I know I got feedback on a
> few of those prep patches that we should wait to commit them until the
> things they introduce have users, but I think that since there are now
> patches in the list and we want to land them soon-ish it might be a
> good idea to commit them earlier in order to reduce the size of this
> patch-bomb :) Feel free to disagree, though...
>
I'm totally OK with cherry-picking and pushing those early. What I don't
want is a bunch of "patch 34 and 76 should get squashed except for this one
hunk which should go in 52". Unless, of course, I really did make a
nonsense rebasing error. Splitting patches would probably be ok if needed
though.
> Connor
>
> >
> > 5. Eric Anholt has also written NIR -> TGSI -> NIR passes which will
> > hopefully get landed soon after NIR initially lands. Exactly how that
> > all gets hooked up for other gallium drivers beyond vc4 is outside the
> > scope of this series.
> >
> > I have pushed a branch to my personal freedesktop.org account. For
> > certain types of review, it may be easier to look at the end result
> > rather than the patches. The branch can be found via freedesktop cgit
> > here:
> >
> > http://cgit.freedesktop.org/~jekstrand/mesa/log/?h=review/nir-v1
> >
> > Last week, I did a presentation for some of the other Intel people to try
> > and help bring them up to speed on NIR concepts quickly. As part of this,
> > I typed up a bunch of notes that provide a decent overview of a lot of
> > NIR concepts. Those notes can be found here:
> >
> > http://www.jlekstrand.net/jason/projects/mesa/nir-notes/
> >
> > Happy reviewing!
> >
> > P.S. Connor, Don't do too much reviewing before your finals are done. :-P
> >
> > Connor Abbott (22):
> > exec_list: add a list_foreach_typed_reverse() macro
> > nir: add initial README
> > nir: add a simple C wrapper around glsl_types.h
> > nir: add the core datastructures
> > nir: add core helper functions
> > nir: add a printer
> > nir: add a validation pass
> > nir: add a glsl-to-nir pass
> > nir: add a pass to lower variables for scalar backends
> > nir: keep track of the number of input, output, and uniform slots
> > nir: add a pass to remove unused variables
> > nir: add a pass to lower sampler instructions
> > nir: add a pass to lower system value reads
> > nir: add a pass to lower atomics
> > nir: add an optimization to turn global registers into local registers
> > nir: calculate dominance information
> > nir: add a pass to convert to SSA
> > nir: add an SSA-based copy propagation pass
> > nir: add an SSA-based dead code elimination pass
> > i965/fs: make emit_fragcoord_interpolation() more general
> > i965/fs: Don't pass through the coordinate type
> > i965/fs: add a NIR frontend
> >
> > Jason Ekstrand (101):
> > i965/fs: Only use nir for 8-wide non-fast-clear shaders.
> > i965/fs_nir: Make the sampler register always unsigned
> > i965/fs_nir: Use the correct types for texture inputs
> > i965/fs_nir: Use the correct texture offset immediate
> > Fix what I think are a few NIR typos
> > Fix up varying pull constants
> > i965/fs_nir: Add support for sample_pos and sample_id
> > nir/glsl: Add support for saturate
> > nir: Add fine and coarse derivative opcodes
> > nir/glsl: Add support for coarse and fine derivatives
> > i965/fs_nir: Handle coarse/fine derivatives
> > nir/lower_atomics: Multiply array offsets by ATOMIC_COUNTER_SIZE
> > i965/fs_nir: Add atomic counters support
> > i965/fs: Allow reinterpretation in constant propagation
> > nir: Add NIR_TRUE and NIR_FALSE constants and use them for boolean
> > immediates
> > nir: Add intrinsics to do alternate interpolation on inputs
> > i965/fs: Don't take an ir_variable for emit_general_interpolation
> > i965/fs_nir: Don't duplicate emit_general_interpolation
> > nir: Add a naieve from-SSA pass
> > nir: Add a lower_vec_to_movs pass
> > i965/fs_nir: Convert the shader to/from SSA
> > nir/lower_variables_scalar: Silence a compiler warning
> > nir: Add a basic metadata management system
> > nir: Add an assert
> > nir/foreach_block: Return false if the callback on the last block
> > fails
> > nir: Add a foreach_block_reverse function
> > nir: Add a function to detect if a block is immediately followed by an
> > if
> > nir: Make the nir_index_* functions return the nuber of items
> > nir: Add an SSA-based liveness analysis pass.
> > nir: Add an initialization function for SSA definitions
> > nir: Automatically handle SSA uses when an instruction is inserted
> > nir: Add a function for rewriting all the uses of a SSA def
> > nir: Add a parallel copy instruction type
> > nir: Add a function for comparing two sources
> > nir: Add a better out-of-SSA pass
> > i965/fs_nir: Do retyping for ALU srouces in get_nir_alu_src
> > glsl/list: Fix the exec_list_validate function
> > nir: Validate all lists in the validator
> > nir/print: Don't reindex things
> > nir: Differentiate between signed and unsigned versions of find_msb
> > i965/fs_nir: Validate optimization passes
> > nir/nir: Fix a bug in move_successors
> > glsl/list: Add a foreach_list_typed_safe_reverse macro
> > nir/nir: Use safe iterators when iterating over the CFG
> > nir/nir: Patch up phi predecessors in move_successors
> > nir: Add a peephole select optimization
> > i965/fs_nir: Turn on the peephole select optimization
> > nir: Validate that the SSA def and register indices are unique
> > nir: Add a fused multiply-add peephole
> > nir: Add a basic CSE pass
> > i965/fs_nir: Add the CSE pass and actually run in a loop
> > i965/fs_nir: Use an array rather than a hash table for register lookup
> > i965/fs_nir: Handle SSA constants
> > i965/fs_nir: Properly saturate multiplies
> > nir: Add a helper for rewriting an instruction source
> > nir/lower_samplers: Use the nir_instr_rewrite_src function
> > nir: Clean up nir_deref helper functions
> > nir: Make array deref direct vs. indirect an enum
> > nir: Add a concept of a wildcard array dereference
> > nir: Use an integer index for specifying structure fields
> > nir: Don't require a function in ssa_def_init
> > nir/copy_propagate: Don't cause size mismatches on phi node sources
> > nir: Validate that the sources of a phi have the same size as the
> > destination
> > nir/glsl: Don't allocate a state_slots array for 0 state slots
> > i965/fs_nir: Don't dump the shader.
> > nir: Use the enum for the variable mode
> > nir: Automatically update SSA if uses
> > nir: Add a copy splitting pass
> > nir: Add a pass to lower local variable accesses to SSA values
> > nir: Add a pass to lower local variables to registers
> > nir: Add a pass for lowering input/output loads/stores
> > nir: Add a pass to lower global variables to local variables
> > nir/glsl: Generate SSA NIR
> > i965/fs_nir: Use the new variable lowering code
> > nir/validate: Ensure that outputs are write-only and inputs are
> > read-only
> > nir: Remove the old variable lowering code
> > nir: Vectorize intrinsics
> > nir/validate: Validate intrinsic source/destination sizes
> > nir: Add gpu_shader5 interpolation intrinsics
> > nir/glsl: Add support for gpu_shader5 interpolation instrinsics
> > nir: Add a helper for getting a constant value from an SSA source
> > i965/fs_nir: Add a has_indirect flag and clean up some of the
> > input/output code
> > i965/fs_nir: Implement the ARB_gpu_shader5 interpolation intrinsics
> > nir: Add neg, abs, and sat opcodes
> > nir: Add a lowering pass for adding source modifiers where possible
> > nir: Make the type casting operations static inline functions
> > nir/glsl: Emit abs, neg, and sat operations instead of source
> > modifiers
> > nir: Add an expression matching framework
> > nir: Add infastructure for generating algebraic transformation passes
> > nir: Add an algebraic optimization pass
> > nir: Add a basic constant folding pass
> > nir: Remove the ffma peephole
> > nir: Make texture instruction names more consistent
> > nir: Constant fold array indirects
> > nir: Use a source for uniform buffer indices instead of an index
> > nir: Add a sampler index indirect to nir_tex_instr
> > nir: Rework the way samplers are lowered
> > i965/fs_nir: Add support for indirect texture arrays
> > nir/metadata: Rename metadata_dirty to metadata_preserve
> > nir: Call nir_metadata_preserve more places
> > nir: Make bcsel a fully vector operation
> >
> > src/glsl/Makefile.am | 10 +-
> > src/glsl/Makefile.sources | 39 +-
> > src/glsl/list.h | 19 +-
> > src/glsl/nir/README | 118 ++
> > src/glsl/nir/glsl_to_nir.cpp | 1825 +++++++++++++++++
> > src/glsl/nir/glsl_to_nir.h | 40 +
> > src/glsl/nir/nir.c | 2042 ++++++++++++++++++++
> > src/glsl/nir/nir.h | 1433 ++++++++++++++
> > src/glsl/nir/nir_algebraic.py | 249 +++
> > src/glsl/nir/nir_dominance.c | 298 +++
> > src/glsl/nir/nir_from_ssa.c | 859 ++++++++
> > src/glsl/nir/nir_intrinsics.c | 49 +
> > src/glsl/nir/nir_intrinsics.h | 140 ++
> > src/glsl/nir/nir_live_variables.c | 282 +++
> > src/glsl/nir/nir_lower_atomics.c | 146 ++
> > src/glsl/nir/nir_lower_global_vars_to_local.c | 107 +
> > src/glsl/nir/nir_lower_io.c | 324 ++++
> > src/glsl/nir/nir_lower_locals_to_regs.c | 308 +++
> > src/glsl/nir/nir_lower_samplers.cpp | 181 ++
> > src/glsl/nir/nir_lower_system_values.c | 107 +
> > src/glsl/nir/nir_lower_to_source_mods.c | 181 ++
> > src/glsl/nir/nir_lower_variables.c | 1046 ++++++++++
> > src/glsl/nir/nir_lower_vec_to_movs.c | 96 +
> > src/glsl/nir/nir_metadata.c | 54 +
> > src/glsl/nir/nir_opcodes.c | 46 +
> > src/glsl/nir/nir_opcodes.h | 356 ++++
> > src/glsl/nir/nir_opt_algebraic.py | 67 +
> > src/glsl/nir/nir_opt_constant_folding.c | 355 ++++
> > src/glsl/nir/nir_opt_copy_propagate.c | 325 ++++
> > src/glsl/nir/nir_opt_cse.c | 269 +++
> > src/glsl/nir/nir_opt_dce.c | 186 ++
> > src/glsl/nir/nir_opt_global_to_local.c | 103 +
> > src/glsl/nir/nir_opt_peephole_select.c | 214 ++
> > src/glsl/nir/nir_print.c | 948 +++++++++
> > src/glsl/nir/nir_remove_dead_variables.c | 138 ++
> > src/glsl/nir/nir_search.c | 337 ++++
> > src/glsl/nir/nir_search.h | 80 +
> > src/glsl/nir/nir_split_var_copies.c | 225 +++
> > src/glsl/nir/nir_to_ssa.c | 660 +++++++
> > src/glsl/nir/nir_types.cpp | 143 ++
> > src/glsl/nir/nir_types.h | 75 +
> > src/glsl/nir/nir_validate.c | 912 +++++++++
> > src/mesa/drivers/dri/i965/Makefile.sources | 1 +
> > src/mesa/drivers/dri/i965/brw_fs.cpp | 74 +-
> > src/mesa/drivers/dri/i965/brw_fs.h | 57 +-
> > .../drivers/dri/i965/brw_fs_copy_propagation.cpp | 4 +-
> > src/mesa/drivers/dri/i965/brw_fs_fp.cpp | 32 +-
> > src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 1778 +++++++++++++++++
> > src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 39 +-
> > src/mesa/main/bitset.h | 1 +
> > 50 files changed, 17301 insertions(+), 77 deletions(-)
> > create mode 100644 src/glsl/nir/README
> > create mode 100644 src/glsl/nir/glsl_to_nir.cpp
> > create mode 100644 src/glsl/nir/glsl_to_nir.h
> > create mode 100644 src/glsl/nir/nir.c
> > create mode 100644 src/glsl/nir/nir.h
> > create mode 100644 src/glsl/nir/nir_algebraic.py
> > create mode 100644 src/glsl/nir/nir_dominance.c
> > create mode 100644 src/glsl/nir/nir_from_ssa.c
> > create mode 100644 src/glsl/nir/nir_intrinsics.c
> > create mode 100644 src/glsl/nir/nir_intrinsics.h
> > create mode 100644 src/glsl/nir/nir_live_variables.c
> > create mode 100644 src/glsl/nir/nir_lower_atomics.c
> > create mode 100644 src/glsl/nir/nir_lower_global_vars_to_local.c
> > create mode 100644 src/glsl/nir/nir_lower_io.c
> > create mode 100644 src/glsl/nir/nir_lower_locals_to_regs.c
> > create mode 100644 src/glsl/nir/nir_lower_samplers.cpp
> > create mode 100644 src/glsl/nir/nir_lower_system_values.c
> > create mode 100644 src/glsl/nir/nir_lower_to_source_mods.c
> > create mode 100644 src/glsl/nir/nir_lower_variables.c
> > create mode 100644 src/glsl/nir/nir_lower_vec_to_movs.c
> > create mode 100644 src/glsl/nir/nir_metadata.c
> > create mode 100644 src/glsl/nir/nir_opcodes.c
> > create mode 100644 src/glsl/nir/nir_opcodes.h
> > create mode 100644 src/glsl/nir/nir_opt_algebraic.py
> > create mode 100644 src/glsl/nir/nir_opt_constant_folding.c
> > create mode 100644 src/glsl/nir/nir_opt_copy_propagate.c
> > create mode 100644 src/glsl/nir/nir_opt_cse.c
> > create mode 100644 src/glsl/nir/nir_opt_dce.c
> > create mode 100644 src/glsl/nir/nir_opt_global_to_local.c
> > create mode 100644 src/glsl/nir/nir_opt_peephole_select.c
> > create mode 100644 src/glsl/nir/nir_print.c
> > create mode 100644 src/glsl/nir/nir_remove_dead_variables.c
> > create mode 100644 src/glsl/nir/nir_search.c
> > create mode 100644 src/glsl/nir/nir_search.h
> > create mode 100644 src/glsl/nir/nir_split_var_copies.c
> > create mode 100644 src/glsl/nir/nir_to_ssa.c
> > create mode 100644 src/glsl/nir/nir_types.cpp
> > create mode 100644 src/glsl/nir/nir_types.h
> > create mode 100644 src/glsl/nir/nir_validate.c
> > create mode 100644 src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> >
> > --
> > 2.2.0
> >
> > _______________________________________________
> > mesa-dev mailing list
> > mesa-dev at lists.freedesktop.org
> > http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>