[Mesa-dev] [PATCH 00/11] Various vec4 NIR improvements

Jason Ekstrand jason at jlekstrand.net
Wed Sep 9 17:50:03 PDT 2015


This patch series started out with me suggesting to Eduardo that his
coalescing pass would be easier if the shader were in partial SSA.  So I
started working towards getting the vec4 backend to use partial SSA like the
FS backend does.  Then I started adding more stuff to lower_vec_to_movs and
the rest is history...

It's worth noting that I'm not the first one to attempt coalescing in
NIR.  This is heavily inspired by Eduardo Lima Mitev's patch that does
something similar.  The big differences are that a) his does it as a
separate pass and b) mine takes advantage of partial SSA form.

Shader-db results for vec4 programs on Haswell:

   total instructions in shared programs: 1801527 -> 1751223 (-2.79%)
   instructions in affected programs:     1242113 -> 1191809 (-4.05%)
   helped:                                12078
   HURT:                                  6
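
For anyone checking the percentages by hand: each one is just the relative
change in instruction count, (after - before) / before.  A minimal sketch
in C, where percent_change() is a hypothetical helper made up for
illustration and not something shader-db provides:

```c
#include <assert.h>

/* Relative change in percent, matching the "(-2.79%)" style column above.
 * Hypothetical helper for illustration only; not part of shader-db. */
double percent_change(long before, long after)
{
    assert(before > 0); /* a zero baseline has no meaningful percentage */
    return 100.0 * (double)(after - before) / (double)before;
}
```

Calling percent_change(1801527, 1751223) gives the shared-programs figure
and percent_change(1242113, 1191809) gives the affected-programs one.  Both
rows drop by the same 50304 instructions; only the baseline differs.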

Together with the RFC patches I sent out yesterday, we get the following
total delta between GLSL IR and NIR:

   total instructions in shared programs: 1853737 -> 1683930 (-9.16%)
   instructions in affected programs:     1694137 -> 1524330 (-10.02%)
   helped:                                12748
   HURT:                                  4348

Not sure if we want all these patches to land in their current form, but it
seems like our shader-db regression problem is totally solvable. :-)

Jason Ekstrand (11):
  nir: Fix a bunch of ralloc parenting errors
  nir: Remove the mem_ctx parameter from ssa_def_rewrite_uses
  nir: Only unlink sources that are actually valid
  nir: Add a function for rewriting instruction destinations
  nir/from_ssa: Use instr_rewrite_dest
  nir/lower_vec_to_movs: Add a state struct
  nir/lower_vec_to_movs: Handle partially SSA shaders
  i965/vec4_nir: Use partial SSA form rather than full non-SSA
  nir/lower_vec_to_movs: Get rid of start_idx and swizzle compacting
  nir/lower_vec_to_movs: Coalesce movs on-the-fly when possible
  HACK: nir/lower_vec_to_movs: Coalesce into destinations of fdot
    instructions

 src/glsl/nir/nir.c                            |  61 ++++++---
 src/glsl/nir/nir.h                            |  17 +--
 src/glsl/nir/nir_control_flow.c               |   2 +-
 src/glsl/nir/nir_from_ssa.c                   |  11 +-
 src/glsl/nir/nir_lower_alu_to_scalar.c        |  12 +-
 src/glsl/nir/nir_lower_atomics.c              |   5 +-
 src/glsl/nir/nir_lower_idiv.c                 |   4 +-
 src/glsl/nir/nir_lower_io.c                   |   5 +-
 src/glsl/nir/nir_lower_load_const_to_scalar.c |   3 +-
 src/glsl/nir/nir_lower_locals_to_regs.c       |  10 +-
 src/glsl/nir/nir_lower_phis_to_scalar.c       |   3 +-
 src/glsl/nir/nir_lower_system_values.c        |   3 +-
 src/glsl/nir/nir_lower_vars_to_ssa.c          |   6 +-
 src/glsl/nir/nir_lower_vec_to_movs.c          | 173 ++++++++++++++++++++++----
 src/glsl/nir/nir_opt_constant_folding.c       |   4 +-
 src/glsl/nir/nir_opt_cse.c                    |   6 +-
 src/glsl/nir/nir_opt_dead_cf.c                |   4 +-
 src/glsl/nir/nir_opt_peephole_ffma.c          |   6 +-
 src/glsl/nir/nir_opt_peephole_select.c        |   7 +-
 src/glsl/nir/nir_opt_remove_phis.c            |   5 +-
 src/glsl/nir/nir_search.c                     |   2 +-
 src/mesa/drivers/dri/i965/brw_nir.c           |   2 +-
 src/mesa/drivers/dri/i965/brw_vec4.h          |   1 +
 src/mesa/drivers/dri/i965/brw_vec4_nir.cpp    |  21 +++-
 24 files changed, 258 insertions(+), 115 deletions(-)

-- 
2.5.0.400.gff86faf


