[Mesa-dev] [PATCH 000/123] Reintroducing NIR, a new IR for mesa
Jason Ekstrand
jason at jlekstrand.net
Mon Dec 15 22:04:10 PST 2014
NIR (pronounced "ner") is a new IR (intermediate representation) for the Mesa
shader compiler that will sit between the old IR (GLSL IR) and back-end
compilers. The primary purpose of NIR is to make optimizations more
efficient and to generate better code for the back-ends. We have a lot of
optimizations implemented in GLSL IR right now. However, they still
generate fairly bad code primarily because its tree-based structure makes
writing good optimizations difficult. For this reason, we have implemented
a lot of optimizations in the i965 back-end compilers just to fix up the
code we get from GLSL IR. The "proper fix" to this is to implement a
better high-level IR; enter NIR.
Most of the initial work on NIR, including setting up common data
structures, helper methods, and a few basic passes, was done by Connor
Abbott, who interned with us over the summer. Connor did a fantastic job, but
there is still a lot left to be done. I've spent the last two months trying
to fill in the pieces that we need in order to get NIR off the ground. At
this point, we have competent into-SSA and out-of-SSA passes, are at zero
piglit regressions for i965 SIMD8 fragment shaders, and the shader-db
numbers aren't terrible.
This is still a bit experimental. I have been testing only on HSW but it
should work ok on SNB and later. Eventually, once we get booleans fixed
up, it should work fine on older chips as well. It also doesn't yet
support SIMD16, so performance won't be that great. That said, I think we
are at the point now where we should try and land this and I can stop
developing in my massive private branch. Since this isn't quite ready for
prime-time yet, using it requires setting the INTEL_USE_NIR environment
variable.
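As a rough illustration of that opt-in, here is a minimal sketch of an environment-variable gate. It assumes that merely setting INTEL_USE_NIR (to any value) enables the NIR path; the helper names env_value_enables() and use_nir_from_env() are hypothetical and are not taken from the series, and the check in the actual driver may differ.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper: decide from the raw getenv() result whether the
 * experimental path is requested. Assumption: any set value counts,
 * only an unset variable (NULL) disables it. */
bool
env_value_enables(const char *value)
{
   return value != NULL;
}

/* Hypothetical wrapper around the variable named in the cover letter. */
bool
use_nir_from_env(void)
{
   return env_value_enables(getenv("INTEL_USE_NIR"));
}
```

Under that assumption, running an application with INTEL_USE_NIR=1 in its environment would make use_nir_from_env() return true, and leaving the variable unset would keep the existing GLSL IR path.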
A few key points about NIR:
1. It is primarily an SSA-based IR.
2. It supports source/destination-modifiers and swizzles/*write-masks.
3. Standard GPU operations such as sin() and fmad() are first-class ALU
operations, not intrinsics.
4. GLSL concepts like inputs, outputs, uniforms, etc. are built into the
IR so we can do proper analysis on them.
5. Even though it's SSA, it still has a concept of registers and
write-masks in the core IR data structures. This means we can generate
code that is much closer to what backends want.
6. Control flow is structured explicitly in the IR.
(*write-masks are not available for SSA values)
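To picture what point 6 means, here is a minimal sketch of a structured control-flow tree, where ifs and loops are explicit nodes that own their nested node lists instead of being implied by jumps between blocks. The cf_kind and cf_node types and the count_blocks() function are hypothetical simplifications for illustration; they are not the data structures in nir.h.

```c
#include <stddef.h>

/* Hypothetical, simplified control-flow tree; not the real nir.h types. */
enum cf_kind { CF_BLOCK, CF_IF, CF_LOOP };

struct cf_node {
   enum cf_kind kind;
   struct cf_node *next;       /* next sibling in the enclosing list */
   struct cf_node *then_list;  /* CF_IF only: nodes in the "then" branch */
   struct cf_node *else_list;  /* CF_IF only: nodes in the "else" branch */
   struct cf_node *body;       /* CF_LOOP only: nodes in the loop body */
};

/* Count the basic blocks in a list of control-flow nodes by recursing
 * into the branches of ifs and the bodies of loops. */
unsigned
count_blocks(const struct cf_node *list)
{
   unsigned count = 0;

   for (const struct cf_node *n = list; n != NULL; n = n->next) {
      switch (n->kind) {
      case CF_BLOCK:
         count++;
         break;
      case CF_IF:
         count += count_blocks(n->then_list) + count_blocks(n->else_list);
         break;
      case CF_LOOP:
         count += count_blocks(n->body);
         break;
      }
   }

   return count;
}
```

Because the nesting is part of the tree, a pass can walk it recursively like this without first having to recover the if/loop structure from a flat graph of jumps.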
While source/destination modifiers and writemasks/swizzles are not
particularly useful for optimizations, having them represented in the IR
gives us the ability to generate more useful code for backends.
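As a concrete picture of what these terms mean, here is a minimal sketch of how a swizzle, source modifiers, and a write-mask combine when an instruction reads a four-component source. The alu_src struct and read_alu_src() function are hypothetical and simplified, not the definitions from the series, and applying abs before negate is an assumption made for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical ALU source: a 4-component value plus swizzle and modifiers. */
struct alu_src {
   const float *value;   /* points at 4 components */
   uint8_t swizzle[4];   /* swizzle[i] picks the component read by channel i */
   bool abs;             /* take the absolute value of each component read */
   bool negate;          /* negate each component read (after abs) */
};

/* Read the source the way the instruction would see it, writing only the
 * destination channels whose bit is set in write_mask. */
void
read_alu_src(const struct alu_src *src, unsigned write_mask, float dest[4])
{
   for (unsigned i = 0; i < 4; i++) {
      if (!(write_mask & (1u << i)))
         continue;   /* channel masked off: leave dest[i] untouched */

      float v = src->value[src->swizzle[i] & 3];
      if (src->abs && v < 0.0f)
         v = -v;
      if (src->negate)
         v = -v;
      dest[i] = v;
   }
}
```

For example, with the value (1, -2, 3, -4), a swizzle of {3, 2, 1, 0}, both modifiers set, and a write-mask of 0x5, only channels 0 and 2 are written, receiving -4 and -2. Keeping this information on the source itself lets a backend whose hardware supports such modifiers fold them into the consuming instruction rather than emitting separate moves.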
A few notes about review:
1. For those of you who aren't interested in the general compiler, I'm
sorry for the patch-bomb. However, several people have requested that
we maintain the history of the NIR development since Connor's original
drop at the end of the summer. Therefore, while I've squashed several
things, I've tried to leave the diff of what I've done more-or-less
preserved.
2. No, this is not LLVM. There was a long-winded discussion about that
when Connor dropped his patches that went a whole lot of nowhere as
usual. I would really prefer if we left that debate alone. If there
must be bikeshedding on the topic, please do so on the cover-letter
e-mail.
3. Please keep all bikeshedding about C++, typedefs, etc. on the core
datastructures e-mail. If needed, we can split that off into its own
thread.
4. While I welcome review, I don't plan to make non-trivial changes to
specific patches or squash anything beyond what has already been
squashed. I've tried thus far to more-or-less keep the history and I'd
like to continue this if we can.
5. Eric Anholt has also written NIR -> TGSI -> NIR passes which will
hopefully get landed soon after NIR initially lands. Exactly how that
all gets hooked up for other gallium drivers beyond vc4 is outside the
scope of this series.
I have pushed a branch to my personal freedesktop.org account. For certain
types of review, it may be easier to look at the end result rather than the
patches. The branch can be found via freedesktop cgit here:
http://cgit.freedesktop.org/~jekstrand/mesa/log/?h=review/nir-v1
Last week, I did a presentation for some of the other Intel people to try
and help bring them up to speed on NIR concepts quickly. As part of this,
I typed up a bunch of notes that provide a decent overview of a lot of NIR
concepts. Those notes can be found here:
http://www.jlekstrand.net/jason/projects/mesa/nir-notes/
Happy reviewing!
P.S. Connor, Don't do too much reviewing before your finals are done. :-P
Connor Abbott (22):
exec_list: add a list_foreach_typed_reverse() macro
nir: add initial README
nir: add a simple C wrapper around glsl_types.h
nir: add the core datastructures
nir: add core helper functions
nir: add a printer
nir: add a validation pass
nir: add a glsl-to-nir pass
nir: add a pass to lower variables for scalar backends
nir: keep track of the number of input, output, and uniform slots
nir: add a pass to remove unused variables
nir: add a pass to lower sampler instructions
nir: add a pass to lower system value reads
nir: add a pass to lower atomics
nir: add an optimization to turn global registers into local registers
nir: calculate dominance information
nir: add a pass to convert to SSA
nir: add an SSA-based copy propagation pass
nir: add an SSA-based dead code elimination pass
i965/fs: make emit_fragcoord_interpolation() more general
i965/fs: Don't pass through the coordinate type
i965/fs: add a NIR frontend
Jason Ekstrand (101):
i965/fs: Only use nir for 8-wide non-fast-clear shaders.
i965/fs_nir: Make the sampler register always unsigned
i965/fs_nir: Use the correct types for texture inputs
i965/fs_nir: Use the correct texture offset immediate
Fix what I think are a few NIR typos
Fix up varying pull constants
i965/fs_nir: Add support for sample_pos and sample_id
nir/glsl: Add support for saturate
nir: Add fine and coarse derivative opcodes
nir/glsl: Add support for coarse and fine derivatives
i965/fs_nir: Handle coarse/fine derivatives
nir/lower_atomics: Multiply array offsets by ATOMIC_COUNTER_SIZE
i965/fs_nir: Add atomic counters support
i965/fs: Allow reinterpretation in constant propagation
nir: Add NIR_TRUE and NIR_FALSE constants and use them for boolean
immediates
nir: Add intrinsics to do alternate interpolation on inputs
i965/fs: Don't take an ir_variable for emit_general_interpolation
i965/fs_nir: Don't duplicate emit_general_interpolation
nir: Add a naieve from-SSA pass
nir: Add a lower_vec_to_movs pass
i965/fs_nir: Convert the shader to/from SSA
nir/lower_variables_scalar: Silence a compiler warning
nir: Add a basic metadata management system
nir: Add an assert
nir/foreach_block: Return false if the callback on the last block
fails
nir: Add a foreach_block_reverse function
nir: Add a function to detect if a block is immediately followed by an
if
nir: Make the nir_index_* functions return the nuber of items
nir: Add an SSA-based liveness analysis pass.
nir: Add an initialization function for SSA definitions
nir: Automatically handle SSA uses when an instruction is inserted
nir: Add a function for rewriting all the uses of a SSA def
nir: Add a parallel copy instruction type
nir: Add a function for comparing two sources
nir: Add a better out-of-SSA pass
i965/fs_nir: Do retyping for ALU srouces in get_nir_alu_src
glsl/list: Fix the exec_list_validate function
nir: Validate all lists in the validator
nir/print: Don't reindex things
nir: Differentiate between signed and unsigned versions of find_msb
i965/fs_nir: Validate optimization passes
nir/nir: Fix a bug in move_successors
glsl/list: Add a foreach_list_typed_safe_reverse macro
nir/nir: Use safe iterators when iterating over the CFG
nir/nir: Patch up phi predecessors in move_successors
nir: Add a peephole select optimization
i965/fs_nir: Turn on the peephole select optimization
nir: Validate that the SSA def and register indices are unique
nir: Add a fused multiply-add peephole
nir: Add a basic CSE pass
i965/fs_nir: Add the CSE pass and actually run in a loop
i965/fs_nir: Use an array rather than a hash table for register lookup
i965/fs_nir: Handle SSA constants
i965/fs_nir: Properly saturate multiplies
nir: Add a helper for rewriting an instruction source
nir/lower_samplers: Use the nir_instr_rewrite_src function
nir: Clean up nir_deref helper functions
nir: Make array deref direct vs. indirect an enum
nir: Add a concept of a wildcard array dereference
nir: Use an integer index for specifying structure fields
nir: Don't require a function in ssa_def_init
nir/copy_propagate: Don't cause size mismatches on phi node sources
nir: Validate that the sources of a phi have the same size as the
destination
nir/glsl: Don't allocate a state_slots array for 0 state slots
i965/fs_nir: Don't dump the shader.
nir: Use the enum for the variable mode
nir: Automatically update SSA if uses
nir: Add a copy splitting pass
nir: Add a pass to lower local variable accesses to SSA values
nir: Add a pass to lower local variables to registers
nir: Add a pass for lowering input/output loads/stores
nir: Add a pass to lower global variables to local variables
nir/glsl: Generate SSA NIR
i965/fs_nir: Use the new variable lowering code
nir/validate: Ensure that outputs are write-only and inputs are
read-only
nir: Remove the old variable lowering code
nir: Vectorize intrinsics
nir/validate: Validate intrinsic source/destination sizes
nir: Add gpu_shader5 interpolation intrinsics
nir/glsl: Add support for gpu_shader5 interpolation instrinsics
nir: Add a helper for getting a constant value from an SSA source
i965/fs_nir: Add a has_indirect flag and clean up some of the
input/output code
i965/fs_nir: Implement the ARB_gpu_shader5 interpolation intrinsics
nir: Add neg, abs, and sat opcodes
nir: Add a lowering pass for adding source modifiers where possible
nir: Make the type casting operations static inline functions
nir/glsl: Emit abs, neg, and sat operations instead of source
modifiers
nir: Add an expression matching framework
nir: Add infastructure for generating algebraic transformation passes
nir: Add an algebraic optimization pass
nir: Add a basic constant folding pass
nir: Remove the ffma peephole
nir: Make texture instruction names more consistent
nir: Constant fold array indirects
nir: Use a source for uniform buffer indices instead of an index
nir: Add a sampler index indirect to nir_tex_instr
nir: Rework the way samplers are lowered
i965/fs_nir: Add support for indirect texture arrays
nir/metadata: Rename metadata_dirty to metadata_preserve
nir: Call nir_metadata_preserve more places
nir: Make bcsel a fully vector operation
src/glsl/Makefile.am | 10 +-
src/glsl/Makefile.sources | 39 +-
src/glsl/list.h | 19 +-
src/glsl/nir/README | 118 ++
src/glsl/nir/glsl_to_nir.cpp | 1825 +++++++++++++++++
src/glsl/nir/glsl_to_nir.h | 40 +
src/glsl/nir/nir.c | 2042 ++++++++++++++++++++
src/glsl/nir/nir.h | 1433 ++++++++++++++
src/glsl/nir/nir_algebraic.py | 249 +++
src/glsl/nir/nir_dominance.c | 298 +++
src/glsl/nir/nir_from_ssa.c | 859 ++++++++
src/glsl/nir/nir_intrinsics.c | 49 +
src/glsl/nir/nir_intrinsics.h | 140 ++
src/glsl/nir/nir_live_variables.c | 282 +++
src/glsl/nir/nir_lower_atomics.c | 146 ++
src/glsl/nir/nir_lower_global_vars_to_local.c | 107 +
src/glsl/nir/nir_lower_io.c | 324 ++++
src/glsl/nir/nir_lower_locals_to_regs.c | 308 +++
src/glsl/nir/nir_lower_samplers.cpp | 181 ++
src/glsl/nir/nir_lower_system_values.c | 107 +
src/glsl/nir/nir_lower_to_source_mods.c | 181 ++
src/glsl/nir/nir_lower_variables.c | 1046 ++++++++++
src/glsl/nir/nir_lower_vec_to_movs.c | 96 +
src/glsl/nir/nir_metadata.c | 54 +
src/glsl/nir/nir_opcodes.c | 46 +
src/glsl/nir/nir_opcodes.h | 356 ++++
src/glsl/nir/nir_opt_algebraic.py | 67 +
src/glsl/nir/nir_opt_constant_folding.c | 355 ++++
src/glsl/nir/nir_opt_copy_propagate.c | 325 ++++
src/glsl/nir/nir_opt_cse.c | 269 +++
src/glsl/nir/nir_opt_dce.c | 186 ++
src/glsl/nir/nir_opt_global_to_local.c | 103 +
src/glsl/nir/nir_opt_peephole_select.c | 214 ++
src/glsl/nir/nir_print.c | 948 +++++++++
src/glsl/nir/nir_remove_dead_variables.c | 138 ++
src/glsl/nir/nir_search.c | 337 ++++
src/glsl/nir/nir_search.h | 80 +
src/glsl/nir/nir_split_var_copies.c | 225 +++
src/glsl/nir/nir_to_ssa.c | 660 +++++++
src/glsl/nir/nir_types.cpp | 143 ++
src/glsl/nir/nir_types.h | 75 +
src/glsl/nir/nir_validate.c | 912 +++++++++
src/mesa/drivers/dri/i965/Makefile.sources | 1 +
src/mesa/drivers/dri/i965/brw_fs.cpp | 74 +-
src/mesa/drivers/dri/i965/brw_fs.h | 57 +-
.../drivers/dri/i965/brw_fs_copy_propagation.cpp | 4 +-
src/mesa/drivers/dri/i965/brw_fs_fp.cpp | 32 +-
src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 1778 +++++++++++++++++
src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 39 +-
src/mesa/main/bitset.h | 1 +
50 files changed, 17301 insertions(+), 77 deletions(-)
create mode 100644 src/glsl/nir/README
create mode 100644 src/glsl/nir/glsl_to_nir.cpp
create mode 100644 src/glsl/nir/glsl_to_nir.h
create mode 100644 src/glsl/nir/nir.c
create mode 100644 src/glsl/nir/nir.h
create mode 100644 src/glsl/nir/nir_algebraic.py
create mode 100644 src/glsl/nir/nir_dominance.c
create mode 100644 src/glsl/nir/nir_from_ssa.c
create mode 100644 src/glsl/nir/nir_intrinsics.c
create mode 100644 src/glsl/nir/nir_intrinsics.h
create mode 100644 src/glsl/nir/nir_live_variables.c
create mode 100644 src/glsl/nir/nir_lower_atomics.c
create mode 100644 src/glsl/nir/nir_lower_global_vars_to_local.c
create mode 100644 src/glsl/nir/nir_lower_io.c
create mode 100644 src/glsl/nir/nir_lower_locals_to_regs.c
create mode 100644 src/glsl/nir/nir_lower_samplers.cpp
create mode 100644 src/glsl/nir/nir_lower_system_values.c
create mode 100644 src/glsl/nir/nir_lower_to_source_mods.c
create mode 100644 src/glsl/nir/nir_lower_variables.c
create mode 100644 src/glsl/nir/nir_lower_vec_to_movs.c
create mode 100644 src/glsl/nir/nir_metadata.c
create mode 100644 src/glsl/nir/nir_opcodes.c
create mode 100644 src/glsl/nir/nir_opcodes.h
create mode 100644 src/glsl/nir/nir_opt_algebraic.py
create mode 100644 src/glsl/nir/nir_opt_constant_folding.c
create mode 100644 src/glsl/nir/nir_opt_copy_propagate.c
create mode 100644 src/glsl/nir/nir_opt_cse.c
create mode 100644 src/glsl/nir/nir_opt_dce.c
create mode 100644 src/glsl/nir/nir_opt_global_to_local.c
create mode 100644 src/glsl/nir/nir_opt_peephole_select.c
create mode 100644 src/glsl/nir/nir_print.c
create mode 100644 src/glsl/nir/nir_remove_dead_variables.c
create mode 100644 src/glsl/nir/nir_search.c
create mode 100644 src/glsl/nir/nir_search.h
create mode 100644 src/glsl/nir/nir_split_var_copies.c
create mode 100644 src/glsl/nir/nir_to_ssa.c
create mode 100644 src/glsl/nir/nir_types.cpp
create mode 100644 src/glsl/nir/nir_types.h
create mode 100644 src/glsl/nir/nir_validate.c
create mode 100644 src/mesa/drivers/dri/i965/brw_fs_nir.cpp
--
2.2.0