[Mesa-dev] [PATCH RFC 00/11] glsl: add Single Static Assignment (SSA)
Connor Abbott
cwabbott0 at gmail.com
Wed Jan 22 09:16:47 PST 2014
This series enables GLSL IR support for SSA, including passes to convert
to and from SSA form. SSA is a form of the intermediate representation
of a compiler in which each variable is assigned exactly once. SSA form
makes many optimizations faster and easier to write, and enables other
more powerful optimizations. SSA is used in GCC [1] and LLVM [2] as well
as various compiler backends within Mesa itself, such as r600g-sb and
Nouveau. Adding support for SSA will allow the various optimizations
these backends perform to be implemented in one place, instead of
making each driver reinvent the wheel (as several have already done).
Additionally, all new backends would receive these optimizations,
reducing the burden of writing a compiler backend for a new driver.
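To make the "assigned exactly once" property concrete, here is a minimal sketch of SSA renaming for straight-line code. It is not code from this series: the Stmt struct, the to_ssa() function, and the "x_n" naming scheme are all made up for illustration, and because there is no control flow, no phi nodes are needed.

```cpp
#include <map>
#include <string>
#include <vector>

// A three-address statement: dest = lhs <op> rhs.
// Hypothetical stand-in type, not a GLSL IR class.
struct Stmt {
    std::string dest, lhs, rhs;
};

// Rename straight-line code so that every destination is written exactly
// once. Assumed naming scheme: the n-th write to "x" becomes "x_n".
std::vector<Stmt> to_ssa(const std::vector<Stmt>& code)
{
    std::map<std::string, int> version;  // latest version of each variable

    auto use = [&](const std::string& name) {
        auto it = version.find(name);
        // Names never written in this sequence (inputs) are left unchanged.
        return it == version.end()
                   ? name
                   : name + "_" + std::to_string(it->second);
    };

    std::vector<Stmt> out;
    for (const Stmt& s : code) {
        // Rename the operands before bumping the destination's version,
        // so that "x = x + c" reads the old x and writes a new one.
        std::string lhs = use(s.lhs);
        std::string rhs = use(s.rhs);
        int v = ++version[s.dest];
        out.push_back({s.dest + "_" + std::to_string(v), lhs, rhs});
    }
    return out;
}
```

Under this scheme, the sequence "x = a, b; x = x, c; y = x, x" would become "x_1 = a, b; x_2 = x_1, c; y_1 = x_2, x_2". Once branches and loops are involved, the conversion also has to insert phi nodes where different definitions meet, which is the hard part that patch 09 deals with.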
Although no optimization passes are implemented yet, I am posting this
series now to solicit feedback on the design, so that I don't end up
rewriting things after I have written the new passes.
There are no Piglit regressions on Softpipe, except for the
spec/OpenGL 2.0/max-samplers test, which only passed before because the
compiler happened to unroll the loop; the extra copies caused by the
conversion to and from SSA stop the compiler from unrolling, meaning
that the resulting GLSL IR code contains an indirect sampler index which
glsl-to-tgsi can't handle.
Patch 01 is a fix for a bug that came up while Piglit testing this
series.
Patches 02-06 are changes to GLSL IR that are not explicitly related to
enabling SSA, but which are needed by the later patches.
Patch 07 modifies the core GLSL IR support to allow it to represent
shaders in SSA form, and modifies the printer to print phi nodes and SSA
temporaries correctly.
Patch 08 adds a function that will come in handy in patch 09, as well as
later SSA-based optimizations.
Patch 09 adds the code to convert programs to SSA form.
Patch 10 adds the code to eliminate phi nodes and SSA temporaries,
undoing what the code in Patch 09 does.
Patch 11 allows us to Piglit test the series, and will get replaced once
some actual optimization passes are in place.
Some design choices that may need to be discussed:
- ir_variables in SSA form are now owned by the instruction where they
are defined, i.e. there are no separate ir_variable declarations. This
is different from what the compiler currently assumes and requires a lot
of rework in different areas, but I thought it was justified for a
couple of different reasons:
1. In SSA form, variable dereferences usually point to the instruction
that writes the variable. Although doing this would be too
much of a rewrite, making variables owned by the instruction where they
are defined provides some of the benefit of this, making some
optimizations such as Global Code Motion [3] easier to write.
2. The original reason for having each ir_variable be declared before it
is read/written to was to preserve the tree structure of the IR by
making sure each ir_variable appeared as a child only once (i.e. in its
declaration). With SSA form, where each variable is written exactly once,
it makes sense for each variable to be a child of the single instruction
that writes it.
- The conversion from SSA is currently very naive and inserts many more
copies than necessary. It appears that the current copy propagation pass
is not able to remove many of those copies, especially in loops. It
seems there are a few different options:
1. Implement Sreedhar's full algorithm; this requires that we implement
liveness analysis in GLSL IR.
2. Improve the current copy propagation pass so that it eliminates the
copies it currently can't handle.
3. Leave it alone, and require that backends remove the copies. The i965
vec4 and fs backends, for example, already have a more sophisticated
register coalescing pass that does what we need to do, so i965 should be
fine with the extra copies.
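To show where the extra copies come from, here is a rough sketch of one textbook way to leave SSA form naively: every phi node is replaced by one copy at the end of each predecessor block. This is an illustration under assumed data structures (Copy, Phi, Block, and eliminate_phis() are hypothetical names), not the code in patch 10.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in types, not GLSL IR classes.
struct Copy {
    std::string dest, src;
};

struct Phi {
    std::string dest;
    // (index of predecessor block, value arriving from that predecessor)
    std::vector<std::pair<int, std::string>> sources;
};

struct Block {
    std::vector<Phi> phis;     // phi nodes at the top of the block
    std::vector<Copy> copies;  // copies appended at the end of the block
};

// Replace every phi node with one copy per predecessor.
// Returns the number of copies inserted.
int eliminate_phis(std::vector<Block>& blocks)
{
    int inserted = 0;
    for (Block& b : blocks) {
        for (const Phi& phi : b.phis) {
            for (const auto& src : phi.sources) {
                // "dest = phi(..., pred: value, ...)" becomes
                // "dest = value" at the end of block pred.
                blocks[src.first].copies.push_back({phi.dest, src.second});
                ++inserted;
            }
        }
        b.phis.clear();
    }
    return inserted;
}
```

For an if/else that merges x_1 and x_2 into x_3, this inserts two copies, one per branch, and a loop adds copies on its back edge as well. Every phi source costs a copy, whether or not the two names could have shared a register, which is why a later coalescing or copy propagation step matters. A production pass also has to be careful about the order of copies when several phis in one block read each other's destinations; this sketch ignores that.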
Things that are left to do:
- Fix up ir_reader, fix the existing GLSL IR tests, and add more tests
for the conversion to/from SSA.
- Add more optimizations and convert over the existing optimizations.
Some optimizations need to be converted to use SSA, while others will be
replaced by a more powerful version. For example, Global Code Motion and
Global Value Numbering (GVN-GCM) [4] will replace constant propagation,
local value numbering, and some of the loop analysis framework while
being more powerful than all of those passes.
- As mentioned in the introduction, there are various drivers which
already use SSA. These drivers are all Gallium drivers, so it would make
sense to add support for SSA to TGSI so that the code isn't converted to
SSA twice (first in GLSL IR, then in the driver). Also, this would help
new drivers like freedreno that want to use SSA optimizations in their
backend. This may be more controversial, though, and it's outside of the
current scope of this work.
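As a small illustration of why SSA makes the value numbering mentioned in the to-do list above easier to write, here is a sketch of hash-based value numbering over a single straight-line SSA sequence. It is much simpler than GVN-GCM and is not part of this series; the Expr struct and find_redundant() are hypothetical names. It relies on the SSA guarantee that an operand name always denotes the same value, so two expressions with the same operator and operands must compute the same result.

```cpp
#include <map>
#include <string>
#include <tuple>
#include <vector>

// dest = lhs <op> rhs, already in SSA form (each dest written once).
// Hypothetical stand-in type, not a GLSL IR class.
struct Expr {
    std::string dest, op, lhs, rhs;
};

// For each instruction, return the name of an earlier destination that
// already holds the same value, or "" if the instruction is not redundant.
std::vector<std::string> find_redundant(const std::vector<Expr>& code)
{
    typedef std::tuple<std::string, std::string, std::string> Key;
    std::map<Key, std::string> seen;           // expression -> first dest
    std::map<std::string, std::string> alias;  // redundant dest -> original

    auto canon = [&](const std::string& name) {
        auto it = alias.find(name);
        return it == alias.end() ? name : it->second;
    };

    std::vector<std::string> result;
    for (const Expr& e : code) {
        // Look operands up through the alias map so that uses of a
        // redundant name are treated as uses of the original.
        Key key = std::make_tuple(e.op, canon(e.lhs), canon(e.rhs));
        auto it = seen.find(key);
        if (it != seen.end()) {
            alias[e.dest] = it->second;
            result.push_back(it->second);
        } else {
            seen[key] = e.dest;
            result.push_back("");
        }
    }
    return result;
}
```

Without SSA, the same pass would have to invalidate table entries every time one of the operands is overwritten. This sketch does not handle commutative operators or anything beyond one basic block.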
This series is also available at
https://github.com/cwabbott0/mesa/tree/glsl-ir-ssa-rfc
[1] http://gcc.gnu.org/onlinedocs/gccint/SSA.html
[2] http://llvm.org/docs/LangRef.html
[3] https://courses.cs.washington.edu/courses/cse501/04wi/papers/click-pldi95.pdf
Connor Abbott (11):
glsl: fix handling of quadop_vector constant expression
glsl: add as_loop_jump() method to ir_instruction
glsl: add a foreach_list_reverse macro
glsl: add dead branch analysis
glsl: add loop jump visitor
glsl: add swizzle_component() to ir_builder
glsl: add SSA infrastructure
glsl: add ssa_assign() to ir_builder
glsl: add pass to convert GLSL IR to SSA form
glsl: add a pass to convert out of SSA form
glsl: convert to and from SSA form in the compiler
src/glsl/Makefile.sources | 4 +
src/glsl/glsl_parser_extras.cpp | 4 +
src/glsl/ir.cpp | 56 ++
src/glsl/ir.h | 202 ++++-
src/glsl/ir_builder.cpp | 20 +
src/glsl/ir_builder.h | 2 +
src/glsl/ir_clone.cpp | 147 ++-
src/glsl/ir_constant_expression.cpp | 3 +
src/glsl/ir_dead_branches.cpp | 226 +++++
src/glsl/ir_dead_branches.h | 78 ++
src/glsl/ir_hierarchical_visitor.cpp | 36 +
src/glsl/ir_hierarchical_visitor.h | 11 +
src/glsl/ir_hv_accept.cpp | 55 +-
src/glsl/ir_loop_jumps.cpp | 129 +++
src/glsl/ir_loop_jumps.h | 71 ++
src/glsl/ir_optimization.h | 3 +
src/glsl/ir_print_visitor.cpp | 196 +++-
src/glsl/ir_print_visitor.h | 15 +
src/glsl/ir_validate.cpp | 158 +++-
src/glsl/ir_visitor.h | 8 +
src/glsl/list.h | 5 +
src/glsl/opt_from_ssa.cpp | 198 ++++
src/glsl/opt_to_ssa.cpp | 1155 ++++++++++++++++++++++++
src/mesa/drivers/dri/i965/brw_fs.h | 4 +
src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 28 +
src/mesa/drivers/dri/i965/brw_vec4.h | 4 +
src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 24 +
src/mesa/program/ir_to_mesa.cpp | 28 +
src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 29 +
29 files changed, 2860 insertions(+), 39 deletions(-)
create mode 100644 src/glsl/ir_dead_branches.cpp
create mode 100644 src/glsl/ir_dead_branches.h
create mode 100644 src/glsl/ir_loop_jumps.cpp
create mode 100644 src/glsl/ir_loop_jumps.h
create mode 100644 src/glsl/opt_from_ssa.cpp
create mode 100644 src/glsl/opt_to_ssa.cpp
--
1.8.3.1