[Mesa-dev] [PATCH 11/23] glsl: Distribute multiply over csel with sources {-1, 0, 1}

Ian Romanick idr at freedesktop.org
Fri Mar 20 13:58:11 PDT 2015


From: Ian Romanick <ian.d.romanick at intel.com>

If one of the multiplicands is an ir_triop_csel and the possible
results of that csel are limited to {-1, 0, 1}, we can distribute the
multiply over the csel to eliminate the multiply:

    x * mix( 0,  1, condition) => mix( 0,  x, condition)
    x * mix( 0, -1, condition) => mix( 0, -x, condition)
    x * mix( 1,  0, condition) => mix( x,  0, condition)
    x * mix( 1, -1, condition) => mix( x, -x, condition)
    x * mix(-1,  0, condition) => mix(-x,  0, condition)
    x * mix(-1,  1, condition) => mix(-x,  x, condition)

This assumes that negation is free.
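
As a sanity check on the table above, here is a minimal scalar sketch of
the identity, not the IR code from the patch.  The names csel_scalar,
mul_by_unit, before, after, and distribution_holds are made up for
illustration, and the check assumes finite x (with NaN or infinity,
x * 0 is not 0, so the two forms would differ).

```cpp
#include <cassert>

// Hypothetical scalar stand-in for ir_triop_csel: yields a when cond is
// true and b otherwise.  This is not the Mesa IR builder function.
float csel_scalar(bool cond, float a, float b)
{
   return cond ? a : b;
}

// Hypothetical helper: "multiply" x by a constant k that is known to be
// -1, 0, or 1, using only a select and a negation.
float mul_by_unit(float k, float x)
{
   if (k == 0.0f)
      return 0.0f;
   return (k > 0.0f) ? x : -x;
}

// Original form: x * csel(condition, a, b).
float before(float x, bool cond, float a, float b)
{
   return x * csel_scalar(cond, a, b);
}

// Rewritten form: csel(condition, a * x, b * x) with the multiplies
// replaced by x, -x, or 0.
float after(float x, bool cond, float a, float b)
{
   return csel_scalar(cond, mul_by_unit(a, x), mul_by_unit(b, x));
}

// Compares both forms for every pair of constants drawn from {-1, 0, 1},
// both condition values, and a few finite sample inputs.
bool distribution_holds()
{
   const float units[3] = { -1.0f, 0.0f, 1.0f };
   const float xs[4] = { -2.5f, 0.0f, 1.0f, 7.0f };

   for (float a : units)
      for (float b : units)
         for (float x : xs)
            for (int cond = 0; cond < 2; cond++)
               if (before(x, cond != 0, a, b) != after(x, cond != 0, a, b))
                  return false;

   return true;
}
```

Calling assert(distribution_holds()) from a test harness exercises all of
the cases.  The comparison uses ==, so -0.0 and 0.0 count as equal.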

I have not yet investigated why this hurts more with NIR.

v2: Use swizzle_if_required on operands.

Shader-db results:
GM45 (0x2A42):
total instructions in shared programs: 3545604 -> 3545720 (0.00%)
instructions in affected programs:     94890 -> 95006 (0.12%)
helped:                                83
HURT:                                  286

Iron Lake (0x0046):
total instructions in shared programs: 4975867 -> 4976076 (0.00%)
instructions in affected programs:     98079 -> 98288 (0.21%)
helped:                                86
HURT:                                  389

Sandy Bridge (0x0116):
total instructions in shared programs: 6803299 -> 6802216 (-0.02%)
instructions in affected programs:     299775 -> 298692 (-0.36%)
helped:                                1325
HURT:                                  233
GAINED:                                3

Sandy Bridge (0x0116) NIR:
total instructions in shared programs: 6811661 -> 6817191 (0.08%)
instructions in affected programs:     422203 -> 427733 (1.31%)
helped:                                182
HURT:                                  1939

Ivy Bridge (0x0166):
total instructions in shared programs: 6279602 -> 6278560 (-0.02%)
instructions in affected programs:     303149 -> 302107 (-0.34%)
helped:                                1310
HURT:                                  283

Ivy Bridge (0x0166) NIR:
total instructions in shared programs: 6319127 -> 6324626 (0.09%)
instructions in affected programs:     401418 -> 406917 (1.37%)
helped:                                182
HURT:                                  1929

Haswell (0x0426):
total instructions in shared programs: 5764382 -> 5764021 (-0.01%)
instructions in affected programs:     270160 -> 269799 (-0.13%)
helped:                                1083
HURT:                                  510

Haswell (0x0426) NIR:
total instructions in shared programs: 5794178 -> 5800358 (0.11%)
instructions in affected programs:     359490 -> 365670 (1.72%)
helped:                                182
HURT:                                  1929

Broadwell (0x162E):
total instructions in shared programs: 6812514 -> 6812047 (-0.01%)
instructions in affected programs:     260253 -> 259786 (-0.18%)
helped:                                1134
HURT:                                  449

Broadwell (0x162E) NIR:
total instructions in shared programs: 7008390 -> 7014577 (0.09%)
instructions in affected programs:     358710 -> 364897 (1.72%)
helped:                                182
HURT:                                  1946
GAINED:                                12

Signed-off-by: Ian Romanick <ian.d.romanick at intel.com>
---
 src/glsl/opt_algebraic.cpp | 51 ++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 51 insertions(+)

diff --git a/src/glsl/opt_algebraic.cpp b/src/glsl/opt_algebraic.cpp
index 837b080..b14f82d 100644
--- a/src/glsl/opt_algebraic.cpp
+++ b/src/glsl/opt_algebraic.cpp
@@ -582,6 +582,57 @@ ir_algebraic_visitor::handle_expression(ir_expression *ir)
                      swizzle_if_required(ir, ir->operands[i ^ 1]),
                      ir_constant::zero(ir, ir->type));
       }
+
+      /* If one of the multiplicands is an ir_triop_csel and the possible
+       * results of that csel are limited to {-1, 0, 1}, we can distribute the
+       * multiply over the csel to eliminate the multiply:
+       *
+       *     x * mix( 0,  1, condition) => mix( 0,  x, condition)
+       *     x * mix( 0, -1, condition) => mix( 0, -x, condition)
+       *     x * mix( 1,  0, condition) => mix( x,  0, condition)
+       *     x * mix( 1, -1, condition) => mix( x, -x, condition)
+       *     x * mix(-1,  0, condition) => mix(-x,  0, condition)
+       *     x * mix(-1,  1, condition) => mix(-x,  x, condition)
+       *
+       * This assumes that negation is free.
+       */
+      for (unsigned i = 0; i < 2; i++) {
+         if (op_expr[i] == NULL || op_expr[i]->operation != ir_triop_csel)
+            continue;
+
+         ir_constant *const c[2] = {
+            op_expr[i]->operands[1]->as_constant(),
+            op_expr[i]->operands[2]->as_constant()
+         };
+
+         if (c[0] == NULL || c[1] == NULL)
+            continue;
+
+         if (!is_vec_one(c[0]) && !is_vec_zero(c[0]) &&
+             !is_vec_negative_one(c[0]))
+            continue;
+
+         if (!is_vec_one(c[1]) && !is_vec_zero(c[1]) &&
+             !is_vec_negative_one(c[1]))
+            continue;
+
+         /* We now know that the ir_triop_csel is compatible with the
+          * optimization.  Assign the other multiplicand to a temporary
+          * variable and rewrite the csel.
+          */
+         ir_variable *const temp =
+            new(mem_ctx) ir_variable(ir->type,
+                                     "mul_over_csel",
+                                     ir_var_temporary);
+
+         base_ir->insert_before(temp);
+         ir_assignment *assignment = assign(temp, ir->operands[i ^ 1]);
+         base_ir->insert_before(assignment);
+
+         return csel(swizzle_if_required(ir, op_expr[i]->operands[0]),
+                     swizzle_if_required(ir, handle_expression(mul(c[0], temp))),
+                     swizzle_if_required(ir, handle_expression(mul(c[1], temp))));
+      }
       break;
 
    case ir_binop_div:
-- 
2.1.0


