[Mesa-dev] [PATCH 00/11] NIR Copy Propagation between blocks
Caio Marcelo de Oliveira Filho
caio.oliveira at intel.com
Sat Sep 15 05:45:22 UTC 2018
This series supersedes the "Global dead write vars removal pass".
The goal here is to perform copy propagation among values in different
blocks. The benefit is currently small (it helped some cases with
uniforms), but as we move other resources to be addressed with derefs
(e.g. SSBOs), we expect it to become more useful, in particular for
compute shaders.
To be able to do this I had to extract the dead write removal from the
copy propagation pass. When working on more than a single block, the
information for that optimization flows in a different direction
(backwards), so it helps to keep the two passes separated.
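
As a rough illustration of why dead write removal is naturally a
backwards problem, here is a toy sketch in C. It is not the code from
this series: `struct access`, `NUM_VARS` and `mark_dead_writes` are
hypothetical names, variables are plain scalars identified by an
integer, and everything is assumed live at the end of the block.

```c
#include <assert.h>
#include <stdbool.h>

#define NUM_VARS 4 /* hypothetical: toy block touching 4 scalar variables */

enum access_kind { ACCESS_STORE, ACCESS_LOAD };

struct access {
   enum access_kind kind;
   int var;   /* which variable is read or written */
   bool dead; /* set for stores that are never observed */
};

/* Walk the block from last instruction to first.  A store is dead when a
 * later store to the same variable is reached before any load of it.
 * Returns the number of stores marked dead.
 */
int
mark_dead_writes(struct access *ops, int num_ops)
{
   /* overwritten[v]: a later store to v exists with no load in between.
    * Starts all false, i.e. every variable is assumed live at block end.
    */
   bool overwritten[NUM_VARS] = { false };
   int num_dead = 0;

   for (int i = num_ops - 1; i >= 0; i--) {
      struct access *op = &ops[i];
      assert(op->var >= 0 && op->var < NUM_VARS);

      if (op->kind == ACCESS_LOAD) {
         /* An earlier store to this variable is now observed. */
         overwritten[op->var] = false;
      } else if (overwritten[op->var]) {
         op->dead = true;
         num_dead++;
      } else {
         overwritten[op->var] = true;
      }
   }
   return num_dead;
}
```

In this sketch the "is this value still needed" fact travels from later
instructions to earlier ones, the opposite of the direction in which
copy propagation carries known values.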
The pass uses an approach similar to what we do in GLSL copy prop. We
propagate values forward following the control flow graph. It doesn't
try to merge values from different branches or handle more detailed
control flow. I think this approach is a good intermediate step.
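
The forward walk without merging can be sketched roughly as follows.
This is a toy model in C, not the NIR pass: it tracks known constant
values of scalar variables instead of deref copies, and the names
(`struct copy_state`, `visit_block`, `visit_if`) are hypothetical. Each
branch of an if starts from a copy of the incoming state, and after the
if anything written by either branch is simply forgotten.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define NUM_VARS 8 /* hypothetical: toy shader with 8 scalar variables */

/* For each variable, whether its current value is known and what it is. */
struct copy_state {
   bool known[NUM_VARS];
   int value[NUM_VARS];
};

enum op_kind { OP_STORE, OP_LOAD };

struct op {
   enum op_kind kind;
   int var;
   int value;     /* OP_STORE: the value written */
   bool replaced; /* OP_LOAD: set when the load was replaced */
   int result;    /* OP_LOAD: the propagated value, if replaced */
};

struct block {
   struct op *ops;
   int num_ops;
};

/* Propagate forward through one block, recording which variables it writes. */
void
visit_block(struct copy_state *state, struct block *b, bool written[NUM_VARS])
{
   for (int i = 0; i < b->num_ops; i++) {
      struct op *op = &b->ops[i];
      if (op->kind == OP_STORE) {
         state->known[op->var] = true;
         state->value[op->var] = op->value;
         written[op->var] = true;
      } else if (state->known[op->var]) {
         op->replaced = true;
         op->result = state->value[op->var];
      }
   }
}

/* Each branch gets its own copy of the incoming state.  Nothing is merged
 * afterwards: entries for variables written in either branch are dropped.
 */
void
visit_if(struct copy_state *state, struct block *then_b, struct block *else_b)
{
   bool written[NUM_VARS] = { false };
   struct copy_state then_state = *state;
   struct copy_state else_state = *state;

   visit_block(&then_state, then_b, written);
   visit_block(&else_state, else_b, written);

   for (int v = 0; v < NUM_VARS; v++) {
      if (written[v])
         state->known[v] = false;
   }
}
```

Values established before the if remain usable inside both branches
and, if neither branch overwrote them, after it. A value written in
only one branch is lost at the join, which is the price of not merging.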
I've experimented with various approaches to implement a full
data-flow analysis, but all of them ended up either too complex or too
messy. Some contributing factors: (a) we have loads/stores and copies,
so a value in ACP needs to be "broken up into pieces", (b) copies with
wildcards force us to take into consideration whether derefs are
contained or not, at many levels, (c) we have writemasks (for the
In particular (b) made the deref_map tree-based structure I've
discussed elsewhere not as good as I had expected. Because we want to
keep track of "a[*].x", "a.x" and "a[indirect].x", the walk on the
tree is not linear in the size of the deref.
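
To make the containment problem concrete, here is a toy comparison of
two deref paths in C. It is only a sketch under simplifying
assumptions: both paths start at the same variable and have the same
length, and `struct elem`, the `CMP_*` bits and `compare_paths` are
hypothetical names, not the NIR API.

```c
#include <assert.h>

enum elem_kind { ELEM_FIELD, ELEM_INDEX, ELEM_WILDCARD, ELEM_INDIRECT };

struct elem {
   enum elem_kind kind;
   int id; /* field number, constant index, or id of the indirect value */
};

enum {
   CMP_MAY_ALIAS    = 1 << 0,
   CMP_EQUAL        = 1 << 1,
   CMP_A_CONTAINS_B = 1 << 2,
   CMP_B_CONTAINS_A = 1 << 3,
};

/* Compare two paths element by element.  Returns 0 when the paths can
 * never refer to the same memory, otherwise a mask of CMP_* bits.
 */
unsigned
compare_paths(const struct elem *a, const struct elem *b, int len)
{
   unsigned result =
      CMP_MAY_ALIAS | CMP_EQUAL | CMP_A_CONTAINS_B | CMP_B_CONTAINS_A;

   for (int i = 0; i < len; i++) {
      const struct elem *ea = &a[i];
      const struct elem *eb = &b[i];

      if (ea->kind == ELEM_FIELD || eb->kind == ELEM_FIELD) {
         /* Different fields (or mismatched kinds) never overlap. */
         if (ea->kind != eb->kind || ea->id != eb->id)
            return 0;
         continue;
      }
      if (ea->kind == ELEM_WILDCARD && eb->kind == ELEM_WILDCARD)
         continue;
      if (ea->kind == ELEM_WILDCARD) {
         /* a[*] covers whatever b selects, but not the other way around. */
         result &= ~(CMP_EQUAL | CMP_B_CONTAINS_A);
         continue;
      }
      if (eb->kind == ELEM_WILDCARD) {
         result &= ~(CMP_EQUAL | CMP_A_CONTAINS_B);
         continue;
      }
      if (ea->kind == ELEM_INDEX && eb->kind == ELEM_INDEX) {
         if (ea->id != eb->id)
            return 0;
         continue;
      }
      if (ea->kind == ELEM_INDIRECT && eb->kind == ELEM_INDIRECT &&
          ea->id == eb->id)
         continue;

      /* An indirect against anything else: all we know is "may alias". */
      result &= ~(CMP_EQUAL | CMP_A_CONTAINS_B | CMP_B_CONTAINS_A);
   }
   return result;
}
```

With this model, a query for "a[3].x" has to consider entries stored
under "a[3].x", "a[*].x" and any "a[indirect].x", so a lookup in a tree
keyed by path elements may need to follow several children at each
array level rather than a single one.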
A future idea I'll explore is trying to split the problem into different
pieces, directed by the inputs we see. E.g. maybe a data-flow analysis
only of the copies, or only the fully qualified load/stores, or handle
only scalars (after a vec to scalar pass).
For now, I've shelved the global optimization for dead write removal.
It wasn't helping any cases, so I will wait until we have more derefs
around to see whether it makes a difference.
Caio Marcelo de Oliveira Filho (11):
  util: Add foreach_reverse for dynarray
  util: Add macro to get number of elements in dynarray
  nir: Add test file for vars related passes
  nir: Add tests for dead write elimination
  nir: Separate dead write removal into its own pass
  intel/nir: Use the separated dead write vars pass
  freedreno/ir3: Use the separated dead write vars pass
  nir: Remove handling of dead writes from copy_prop_vars
  nir: Add tests for copy propagation of derefs
  nir: Take call instruction into account in copy_prop_vars
  nir: Copy propagation between blocks
 src/compiler/Makefile.nir.am                |  34 +-
 src/compiler/Makefile.sources               |   1 +
 src/compiler/nir/meson.build                |  12 +
 src/compiler/nir/nir.h                      |   2 +
 src/compiler/nir/nir_opt_copy_prop_vars.c   | 481 +++++++++----
 src/compiler/nir/nir_opt_dead_write_vars.c  | 216 ++++++
 src/compiler/nir/tests/vars_tests.cpp       | 737 ++++++++++++++++++++
 src/gallium/drivers/freedreno/ir3/ir3_nir.c |   1 +
 src/intel/compiler/brw_nir.c                |   1 +
 src/util/u_dynarray.h                       |   7 +
 10 files changed, 1329 insertions(+), 163 deletions(-)
 create mode 100644 src/compiler/nir/nir_opt_dead_write_vars.c
 create mode 100644 src/compiler/nir/tests/vars_tests.cpp