[Mesa-dev] [PATCH 15/23] glsl: Distribute binary operations over ir_triop_csel

Fri Mar 20 13:58:15 PDT 2015

From: Ian Romanick <ian.d.romanick at intel.com>

If both result operands of the ir_triop_csel are constants and the
other operand of the binary operation is constant, distributing the
binary operation allows constant folding to eliminate it.

Shader-db results:

GM45 (0x2A42):
total instructions in shared programs: 3545351 -> 3545378 (0.00%)
instructions in affected programs:     28774 -> 28801 (0.09%)
helped:                                59
HURT:                                  65

Iron Lake (0x0046):
total instructions in shared programs: 4975577 -> 4975572 (-0.00%)
instructions in affected programs:     31837 -> 31832 (-0.02%)
helped:                                74
HURT:                                  66

Sandy Bridge (0x0116):
total instructions in shared programs: 6802556 -> 6800913 (-0.02%)
instructions in affected programs:     207412 -> 205769 (-0.79%)
helped:                                710
HURT:                                  152

Sandy Bridge (0x0116) NIR:
total instructions in shared programs: 6815965 -> 6815782 (-0.00%)
instructions in affected programs:     241297 -> 241114 (-0.08%)
helped:                                292
HURT:                                  628

Ivy Bridge (0x0166):
total instructions in shared programs: 6277594 -> 6275305 (-0.04%)
instructions in affected programs:     224874 -> 222585 (-1.02%)
helped:                                812
HURT:                                  152

Ivy Bridge (0x0166) NIR:
total instructions in shared programs: 6323362 -> 6322876 (-0.01%)
instructions in affected programs:     142728 -> 142242 (-0.34%)
helped:                                297
HURT:                                  314

Haswell (0x0426):
total instructions in shared programs: 5763055 -> 5761392 (-0.03%)
instructions in affected programs:     170608 -> 168945 (-0.97%)
helped:                                702
HURT:                                  152

Haswell (0x0426) NIR:
total instructions in shared programs: 5799128 -> 5798963 (-0.00%)
instructions in affected programs:     199397 -> 199232 (-0.08%)
helped:                                284
HURT:                                  622

Broadwell (0x162E):
total instructions in shared programs: 6811079 -> 6809460 (-0.02%)
instructions in affected programs:     171789 -> 170170 (-0.94%)
helped:                                702
HURT:                                  155

Broadwell (0x162E) NIR:
total instructions in shared programs: 7013948 -> 7013761 (-0.00%)
instructions in affected programs:     201721 -> 201534 (-0.09%)
helped:                                300
HURT:                                  625

Signed-off-by: Ian Romanick <ian.d.romanick at intel.com>
---
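A minimal standalone sketch of the distribution described in the commit
message, for illustration only. It does not use the Mesa IR classes:
ConstSelect, Side, distribute and evaluate are hypothetical names that model
a select with two constant results and a binary operation with one constant
operand. The sketch keeps the constant on its original side, which matters
for non-commutative operations such as division.

```cpp
// Hypothetical model, not Mesa code: a select whose two results are
// already constants, standing in for ir_triop_csel with constant results.
struct ConstSelect {
    int if_true;
    int if_false;
};

// Which side of the binary operation the lone constant sits on.
enum class Side { Left, Right };

// Distribute "k op select" (or "select op k") over both select results.
// The constant stays on the side it started on, so a division such as
// 100 / select is never turned into select / 100.
ConstSelect distribute(int (*op)(int, int), int k, Side k_side,
                       ConstSelect sel)
{
    if (k_side == Side::Left)
        return { op(k, sel.if_true), op(k, sel.if_false) };
    return { op(sel.if_true, k), op(sel.if_false, k) };
}

// Pick one of the folded constants based on the condition.
int evaluate(ConstSelect sel, bool cond)
{
    return cond ? sel.if_true : sel.if_false;
}
```

With multiplication and k = 5 on the left, a select over the constants
0 and 57 folds to a select over 0 and 285. With division and k = 100 on the
left, a select over 5 and 20 folds to a select over 20 and 5.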
 src/glsl/opt_algebraic.cpp | 63 ++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 63 insertions(+)

diff --git a/src/glsl/opt_algebraic.cpp b/src/glsl/opt_algebraic.cpp
index b1f0fa9..7d58e9b 100644
--- a/src/glsl/opt_algebraic.cpp
+++ b/src/glsl/opt_algebraic.cpp
@@ -354,6 +354,69 @@ ir_algebraic_visitor::handle_expression(ir_expression *ir)
       }
    }
 
+   /* If the expression is a binary operation with one constant operand over
+    * an ir_triop_csel with constant results, distribute the binary operation
+    * over the ir_triop_csel results.  Constant folding will do the rest.
+    *
+    * Example: 5 * mix(0, 57, condition) becomes mix(0, 285, condition).
+    */
+   if (ir->get_num_operands() == 2) {
+      for (i = 0; i < 2; i++) {
+         if (op_expr[i] != NULL &&
+             op_const[i ^ 1] != NULL &&
+             op_expr[i]->operation == ir_triop_csel &&
+             op_expr[i]->operands[1]->as_constant() &&
+             op_expr[i]->operands[2]->as_constant()) {
+            ir_rvalue *left[2];
+            ir_rvalue *right[2];
+
+            /* Make sure the constant is in the same "position" after being
+             * distributed.  Otherwise a division could get inverted, and that
+             * would be bad.
+             */
+            if (i == 0) {
+               left[0] = op_expr[0]->operands[1];
+               left[1] = op_expr[0]->operands[2];
+               right[0] = op_const[1];
+               right[1] = op_const[1]->clone(mem_ctx, NULL);
+            } else {
+               left[0] = op_const[0];
+               left[1] = op_const[0]->clone(mem_ctx, NULL);
+               right[0] = op_expr[1]->operands[1];
+               right[1] = op_expr[1]->operands[2];
+            }
+
+            ir_expression *const tmp_expr[2] = {
+               new(mem_ctx) ir_expression(ir->operation, left[0], right[0]),
+               new(mem_ctx) ir_expression(ir->operation, left[1], right[1])
+            };
+
+            /* Constant-fold the expressions now to (possibly) save an
+             * iteration through the optimization loop.
+             */
+            op_expr[i]->operands[1] = tmp_expr[0]->constant_expression_value();
+            op_expr[i]->operands[2] = tmp_expr[1]->constant_expression_value();
+            assert(op_expr[i]->operands[1] != NULL);
+            assert(op_expr[i]->operands[2] != NULL);
+
+            /* The type of the ir_triop_csel is now whatever the type of the
+             * distributed expression was.
+             */
+            assert(ir->type == op_expr[i]->operands[1]->type);
+            assert(ir->type == op_expr[i]->operands[2]->type);
+            op_expr[i]->type = ir->type;
+
+            /* Now let the rest of the algebraic optimization operate on the
+             * ir_triop_csel expression that replaced the original binary
+             * expression.
+             */
+            ir = op_expr[i];
+            if (!preprocess_operands(ir, op_const, op_expr))
+               return ir;
+         }
+      }
+   }
+
    switch (ir->operation) {
    case ir_unop_bit_not:
       if (op_expr[0] && op_expr[0]->operation == ir_unop_bit_not)
-- 
2.1.0