[Mesa-dev] [RFC 0/7] nir, i965/fs: Lower indirect local variables to scratch
Jason Ekstrand
jason at jlekstrand.net
Mon Dec 5 19:59:51 UTC 2016
This little series implements lowering of indirectly accessed local
variables larger than some threshold (8 floats?) to scratch space. This
improves the performance of the CSDof synmark test by about 45% because it
uses a large temporary array which we currently lower to if-ladders and
then to piles of scratch.
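
For readers unfamiliar with the term, here is a rough sketch of the two ways an indirect read can be resolved. It is only an illustration in plain C, not what the compiler emits: the function names are made up, registers are modelled as separate float arguments, and scratch is modelled as ordinary memory.

```c
/* Hypothetical illustration: names and shapes are not taken from the series. */

/* Indirect read of a 4-element array that lives in registers (r0..r3).
 * The index is resolved with nested comparisons, i.e. an if-ladder. */
static float
read_via_if_ladder(float r0, float r1, float r2, float r3, unsigned index)
{
    if (index < 2) {
        if (index < 1)
            return r0;
        else
            return r1;
    } else {
        if (index < 3)
            return r2;
        else
            return r3;
    }
}

/* The same read when the array lives in scratch: one indexed memory load,
 * independent of the array size. */
static float
read_via_scratch(const float *scratch, unsigned index)
{
    return scratch[index];
}
```

The ladder grows with the array length, while the scratch read stays a single load, which is why the trade-off depends on the array size.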
The approach I've taken here is to add a new set of NIR intrinsics for
reading and writing scratch. It's treated like any other form of IO with a
new nir_lower_vars_to_scratch pass that lowers everything over a given size
threshold to scratch space. Why do this in NIR? The primary reason is
that this lets us lower to scratch *before* we do nir_lower_indirect_derefs
so we can still use registers for small indirects where an if-ladder is
more efficient than scratch space. Also, after giving it a try, I really
liked how those intrinsics turned out.
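
The ordering described above amounts to a three-way choice per local variable. The following is a minimal sketch of that choice, assuming sizes are counted in floats and that "over the threshold" means strictly greater. The enum, the function name, and the parameters are hypothetical and do not come from the patches.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical storage classes for a local variable (illustrative names). */
enum local_storage {
    LOCAL_STORAGE_REGISTERS, /* never indirectly accessed: plain registers */
    LOCAL_STORAGE_IF_LADDER, /* small indirect: registers plus an if-ladder */
    LOCAL_STORAGE_SCRATCH,   /* large indirect: lowered to scratch space */
};

/* Decide where a local variable ends up.  The scratch check comes first
 * among the indirect cases, mirroring a scratch-lowering pass that runs
 * before the if-ladder lowering and leaves small indirects for it. */
static enum local_storage
choose_local_storage(size_t size_in_floats, bool indirectly_accessed,
                     size_t scratch_threshold)
{
    if (!indirectly_accessed)
        return LOCAL_STORAGE_REGISTERS;

    if (size_in_floats > scratch_threshold)
        return LOCAL_STORAGE_SCRATCH;

    return LOCAL_STORAGE_IF_LADDER;
}
```

With the tentative threshold of 8 floats, a 64-float indirectly indexed array would go to scratch while a 4-float one would keep using registers.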
This series is marked RFC because it's still a bit sketchy at the moment.
There are a few things that would need to be finished before it's ready for
landing:
1) I should probably run it through piglit.
2) The back-end portion doesn't yet handle doubles.
3) We should use send-from-GRF for non-spill direct scratch reads/writes.
Right now, it's still using MRFs which isn't great.
If people like where this series is going, I can probably find some time to
polish it to the point of being mergeable.
Jason Ekstrand (6):
nir: Add load/store_scratch intrinsics
nir: Add a pass for selectively lowering variables to scratch space
i965/fs: Add a CHANNEL_IDS opcode
i965/fs: Add DWord scattered read/write opcodes
i965/fs: Implement the new nir_scratch_load/store opcodes
i965: Lower large local arrays to scratch
Timothy Arceri (1):
i965: use nir_lower_indirect_derefs() for GLSL
src/compiler/Makefile.sources | 1 +
src/compiler/nir/nir.h | 8 +-
src/compiler/nir/nir_clone.c | 1 +
src/compiler/nir/nir_intrinsics.h | 6 +-
src/compiler/nir/nir_lower_scratch.c | 258 ++++++++++++++++++++++
src/intel/vulkan/anv_pipeline.c | 10 -
src/mesa/drivers/dri/i965/brw_defines.h | 10 +
src/mesa/drivers/dri/i965/brw_fs.cpp | 113 ++++++++++
src/mesa/drivers/dri/i965/brw_fs.h | 6 +
src/mesa/drivers/dri/i965/brw_fs_cse.cpp | 1 +
src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 170 ++++++++++++++
src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 42 +++-
src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp | 4 +-
src/mesa/drivers/dri/i965/brw_link.cpp | 13 --
src/mesa/drivers/dri/i965/brw_nir.c | 13 ++
src/mesa/drivers/dri/i965/brw_shader.cpp | 12 +
16 files changed, 631 insertions(+), 37 deletions(-)
create mode 100644 src/compiler/nir/nir_lower_scratch.c
--
2.5.0.400.gff86faf