[Mesa-dev] Clamp/saturate optimizations v3

Abdiel Janulgue abdiel.janulgue at linux.intel.com
Mon Aug 18 05:17:41 PDT 2014


v3 of clamp and saturate optimizations

Changes since v1: 
 - Only remove the old try_emit_saturate operations after the new optimizations are
   in place. (Matt, Ian)
 - Output [min/max](saturate(x),b) instead of saturate([min/max](x,b)) as suggested
   by Ilia Mirkin.
 - The change above required some refactoring in the fs/vec4 backends to allow
   propagation of certain instructions with the saturate flag into SEL. For all other
   instructions we still don't propagate saturated sources, same as the previous behaviour.
Since v2:
 - Fix comments to reflect that we are doing a commutative operation, and add missing
   conditions when optimizing clamp in the opt_algebraic pass.
 - Refactor try_emit_saturate() in i965/fs instead of completely removing it. This fixes
   a regression where the changes emitted an extra, unnecessary saturated mov when the
   expression generating src can do the saturate directly instead.
 - Fix regression in the i965/vec4 copy-propagate optimization caused by ignoring 
   channels in the propagated instruction.
 - Count generated loops from the fs/vec4 generator.
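
For reference, here is a small standalone sketch of the algebraic identities the
opt_algebraic changes rely on. It is not code from the series: the helper names
(clamp_ref, saturate, clamp_low_zero, clamp_high_one, rewrites_match_reference) are
made up for illustration, and it assumes clamp(x, lo, hi) behaves as
min(max(x, lo), hi) with 0.0 < b < 1.0. It only checks that the rewritten forms
agree with plain clamp on a grid of sample values.

```cpp
#include <algorithm>
#include <cassert>

// Reference clamp, assumed to behave like GLSL's clamp(): min(max(x, lo), hi).
static float clamp_ref(float x, float lo, float hi)
{
   return std::min(std::max(x, lo), hi);
}

// saturate(x) is clamp(x, 0.0, 1.0).
static float saturate(float x)
{
   return clamp_ref(x, 0.0f, 1.0f);
}

// Rewritten form of clamp(x, 0.0, b), assuming b < 1.0.
static float clamp_low_zero(float x, float b)
{
   return std::min(saturate(x), b);
}

// Rewritten form of clamp(x, b, 1.0), assuming b > 0.0.
static float clamp_high_one(float x, float b)
{
   return std::max(saturate(x), b);
}

// Compare the rewritten forms against the reference clamp on a grid of
// exactly representable values: x in [-2.0, 3.0], b in {0.25, 0.5, 0.75}.
static bool rewrites_match_reference()
{
   for (int i = -8; i <= 12; i++) {
      float x = i * 0.25f;
      for (int j = 1; j <= 3; j++) {
         float b = j * 0.25f;
         if (clamp_low_zero(x, b) != clamp_ref(x, 0.0f, b))
            return false;
         if (clamp_high_one(x, b) != clamp_ref(x, b, 1.0f))
            return false;
      }
   }
   return true;
}
```

Keeping the saturate on x, rather than on the min/max result, is what lets a backend
fold it into the instruction that produces x, which as far as I can tell is the point
of the [min/max](saturate(x),b) ordering.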

Results from our shader-db:

total instructions in shared programs: 4538627 -> 4560104 (0.47%)
instructions in affected programs:     45144 -> 66621 (47.57%)
total loops in shared programs:        887 -> 711 (-19.84%)
GAINED:                                0
LOST:                                  36

I modified shader-db a bit to catch loop unrolls. The shaders that show an increase in
instruction count are all due to the loop unroll pass being triggered by this optimization
in games that contain looped clamp/saturate operations. The unroll pass also
resulted in a few shaders with looped clamp/sat skipping SIMD16 generation.

** No piglit regressions observed **

Abdiel Janulgue (17):
      i965/vec4/fs: Count loops in shader debug
      glsl: Add ir_unop_saturate
      glsl: Add constant evaluation of ir_unop_saturate
      glsl: Add a pass to lower ir_unop_saturate to clamp(x, 0, 1)
      ir_to_mesa, glsl_to_tgsi: lower ir_unop_saturate
      ir_to_mesa, glsl_to_tgsi: Add support for ir_unop_saturate
      i965/fs: Add support for ir_unop_saturate
      i965/vec4: Add support for ir_unop_saturate
      glsl: Implement saturate as ir_unop_saturate
      glsl: Optimize clamp(x, 0, 1) as saturate(x)
      glsl: Optimize clamp(x, 0.0, b), where b < 1.0 as min(saturate(x),b)
      glsl: Optimize clamp(x, b, 1.0), where b > 0.0 as max(saturate(x),b)
      i965/fs: Allow propagation of instructions with saturate flag to sel
      i965/vec4: Allow propagation of instructions with saturate flag to sel
      ir_to_mesa, glsl_to_tgsi: Remove try_emit_saturate
      i965/fs: Refactor try_emit_saturate
      i965/vec4: Remove try_emit_saturate

 src/glsl/ir.cpp                                          |  2 +
 src/glsl/ir.h                                            |  1 +
 src/glsl/ir_builder.cpp                                  |  6 +-
 src/glsl/ir_constant_expression.cpp                      |  6 ++
 src/glsl/ir_optimization.h                               |  1 +
 src/glsl/ir_validate.cpp                                 |  1 +
 src/glsl/lower_instructions.cpp                          | 29 ++++++++
 src/glsl/opt_algebraic.cpp                               | 98 ++++++++++++++++++++++++++
 src/mesa/drivers/dri/i965/brw_fs_channel_expressions.cpp |  1 +
 src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp    | 18 ++++-
 src/mesa/drivers/dri/i965/brw_fs_generator.cpp           |  6 +-
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp             | 27 ++++---
 src/mesa/drivers/dri/i965/brw_vec4.h                     |  2 +-
 src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp  | 85 +++++++++++++++-------
 src/mesa/drivers/dri/i965/brw_vec4_generator.cpp         |  6 +-
 src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp           | 25 ++-----
 src/mesa/program/ir_to_mesa.cpp                          | 59 +++-------------
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp               | 63 +++--------------
 18 files changed, 261 insertions(+), 175 deletions(-)


