[Mesa-dev] Clamp/saturate optimizations V2
Abdiel Janulgue
abdiel.janulgue at linux.intel.com
Mon Jul 7 06:56:57 PDT 2014
V2 of clamp/saturate optimizations
This patch series adds the plumbing to support the GLSL IR instruction saturate().
Previously, saturate was implemented as min/max instructions. Most GPUs, however,
can probably perform saturate for free. With these changes, saturate can be
emitted as a single instruction.
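As a rough illustration of the paragraph above (a sketch in Python, not Mesa code), saturate(x) is clamp(x, 0, 1). The function names below are hypothetical; the first models the old min/max lowering and the second stands in for a dedicated saturate operation.

```python
def saturate_via_min_max(x: float) -> float:
    """Old-style lowering: two operations, a max followed by a min."""
    return min(max(x, 0.0), 1.0)


def saturate(x: float) -> float:
    """Stand-in for a single saturate operation: clamp x to [0, 1]."""
    if x < 0.0:
        return 0.0
    if x > 1.0:
        return 1.0
    return x


# Both formulations agree on ordinary (non-NaN) inputs.
samples = [-2.0, -0.5, 0.0, 0.25, 1.0, 1.5, 10.0]
agree = all(saturate(x) == saturate_via_min_max(x) for x in samples)
```

NaN inputs are deliberately left out of the sketch, since their handling depends on the hardware and the API rules in effect.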
In addition, the optimization try_emit_saturate was previously used to optimize
min/max operations to a saturate operation. It didn't work for code such as
min(max(a, b), c) where (b == 0.0 and c < 1.0) and related cases.
With this infrastructure in place, we can optimize those operations easily in the
do_algebraic pass.
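The rewrites this enables can be sketched on a toy expression tree. This is only an illustration in Python under stated assumptions, not the code in opt_algebraic.cpp: the tuple-based IR, `optimize` and `evaluate` are hypothetical, only constants in the second operand position are matched, and the extra `lo < 1.0` guard in the last case is an assumption of the sketch.

```python
def is_const(e):
    """A float literal plays the role of a constant in this toy IR."""
    return isinstance(e, float)


def optimize(e):
    """Rewrite min(max(x, lo), hi) patterns bottom-up into saturate forms."""
    if not isinstance(e, tuple):
        return e  # variable name or constant: nothing to do
    op, *args = e
    args = [optimize(a) for a in args]
    if op == "min" and len(args) == 2:
        inner, hi = args
        if (isinstance(inner, tuple) and len(inner) == 3
                and inner[0] == "max" and is_const(inner[2]) and is_const(hi)):
            x, lo = inner[1], inner[2]
            if lo == 0.0 and hi == 1.0:
                return ("saturate", x)              # clamp(x, 0, 1)
            if lo == 0.0 and hi < 1.0:
                return ("min", ("saturate", x), hi)  # clamp(x, 0, b), b < 1
            if hi == 1.0 and 0.0 < lo < 1.0:
                return ("max", ("saturate", x), lo)  # clamp(x, b, 1), b > 0
    return (op, *args)


def evaluate(e, env):
    """Evaluate a toy expression; env maps variable names to floats."""
    if isinstance(e, float):
        return e
    if isinstance(e, str):
        return env[e]
    op, *args = e
    vals = [evaluate(a, env) for a in args]
    if op == "min":
        return min(vals)
    if op == "max":
        return max(vals)
    if op == "saturate":
        return min(max(vals[0], 0.0), 1.0)
    raise ValueError(f"unknown op: {op}")


# The case from the text: min(max(a, 0.0), c) with c < 1.0.
before = ("min", ("max", "a", 0.0), 0.5)
after = optimize(before)
```

Here `after` keeps a min with the constant but replaces the inner max with a saturate, and `evaluate` can be used to confirm that both trees give the same value for any given `a`.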
Changes since v1:
- Only remove the old try_emit_saturate operations after the new optimizations are
in place. (Matt, Ian)
- Output [min/max](saturate(x),b) instead of saturate([min/max](x,b)) as suggested
by Ilia Mirkin.
- The change above required some refactoring in the fs/vec4 backends to allow
propagation of certain instructions with the saturate flag to SEL. For other
instructions, we still don't propagate instructions that have the saturate flag
set, matching the previous behaviour.
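The two output forms mentioned in the changelog give the same values; the choice between them concerns what the backends can do with the result, not the arithmetic. A small Python check, assuming a constant b in [0, 1] and ignoring NaN (all function names are hypothetical):

```python
def saturate(x: float) -> float:
    """clamp(x, 0, 1), written with min/max for the sketch."""
    return min(max(x, 0.0), 1.0)


def min_of_saturate(x: float, b: float) -> float:
    """v2 form: min(saturate(x), b)."""
    return min(saturate(x), b)


def saturate_of_min(x: float, b: float) -> float:
    """v1 form: saturate(min(x, b))."""
    return saturate(min(x, b))


def max_of_saturate(x: float, b: float) -> float:
    """v2 form: max(saturate(x), b)."""
    return max(saturate(x), b)


def saturate_of_max(x: float, b: float) -> float:
    """v1 form: saturate(max(x, b))."""
    return saturate(max(x, b))


xs = [-3.0, -0.5, 0.0, 0.2, 0.5, 0.8, 1.0, 4.0]
bs = [0.0, 0.25, 0.5, 1.0]  # assumption of this sketch: 0 <= b <= 1
forms_agree = all(
    min_of_saturate(x, b) == saturate_of_min(x, b)
    and max_of_saturate(x, b) == saturate_of_max(x, b)
    for x in xs
    for b in bs
)
```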
Results:
helped: shaders/0ad/9.shader_test fs16: 38 -> 37 (-2.63%)
helped: shaders/0ad/9.shader_test fs8: 38 -> 37 (-2.63%)
helped: shaders/gst-gl-tunnel.frag fs16: 65 -> 64 (-1.54%)
helped: shaders/gst-gl-tunnel.frag fs8: 65 -> 64 (-1.54%)
total instructions in shared programs: 41595 -> 41591 (-0.01%)
instructions in affected programs: 206 -> 202 (-1.94%)
No piglit regressions observed.
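The percentages above can be reproduced from the instruction counts. A short Python sketch (the helper `percent_change` is hypothetical, and the counts are copied from the results listed above):

```python
def percent_change(before: int, after: int) -> str:
    """Relative change in instruction count, formatted to two decimals."""
    return f"{(after - before) / before * 100:.2f}%"


# (before, after) instruction counts per affected program, from the results.
affected = {
    "0ad/9.shader_test fs16": (38, 37),
    "0ad/9.shader_test fs8": (38, 37),
    "gst-gl-tunnel.frag fs16": (65, 64),
    "gst-gl-tunnel.frag fs8": (65, 64),
}

per_program = {name: percent_change(b, a) for name, (b, a) in affected.items()}

# Totals over the affected programs should match the summary line.
affected_before = sum(b for b, _ in affected.values())
affected_after = sum(a for _, a in affected.values())
affected_change = percent_change(affected_before, affected_after)

# Change over all shared programs.
total_change = percent_change(41595, 41591)
```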
Abdiel Janulgue (16):
glsl: Add ir_unop_saturate
glsl: Add constant evaluation of ir_unop_saturate
glsl: Add a pass to lower ir_unop_saturate to clamp(x, 0, 1)
ir_to_mesa, glsl_to_tgsi: lower ir_unop_saturate
ir_to_mesa, glsl_to_tgsi: Add support for ir_unop_saturate
i965/fs: Add support for ir_unop_saturate
i965/vec4: Add support for ir_unop_saturate
glsl: Implement saturate as ir_unop_saturate
glsl: Optimize clamp(x, 0, 1) as saturate(x)
glsl: Optimize clamp(x, 0.0, b), where b < 1.0 as min(saturate(x),b)
glsl: Optimize clamp(x, b, 1.0), where b > 0.0 as max(saturate(x),b)
i965/fs: Allow propagation of instructions with saturate flag to sel
i965/vec4: Allow propagation of instructions with saturate flag to sel
ir_to_mesa, glsl_to_tgsi: Remove try_emit_saturate
i965/fs: Remove try_emit_saturate
i965/vec4: Remove try_emit_saturate
src/glsl/ir.cpp | 2 +
src/glsl/ir.h | 1 +
src/glsl/ir_builder.cpp | 6 +-
src/glsl/ir_constant_expression.cpp | 6 ++
src/glsl/ir_optimization.h | 1 +
src/glsl/ir_validate.cpp | 1 +
src/glsl/lower_instructions.cpp | 29 +++++++++
src/glsl/opt_algebraic.cpp | 56 ++++++++++++++++++
src/mesa/drivers/dri/i965/brw_fs.h | 1 -
.../drivers/dri/i965/brw_fs_channel_expressions.cpp | 1 +
src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp | 18 +++++-
src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 41 ++-----------
src/mesa/drivers/dri/i965/brw_vec4.h | 1 -
src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp | 75 +++++++++++++++---------
src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 25 ++------
src/mesa/program/ir_to_mesa.cpp | 59 ++++---------------
src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 63 ++++----------------
17 files changed, 192 insertions(+), 194 deletions(-)