[Mesa-dev] [PATCH 11/23] glsl: Distribute multiply over csel with sources {-1, 0, 1}
Ian Romanick
idr at freedesktop.org
Fri Mar 20 13:58:11 PDT 2015
From: Ian Romanick <ian.d.romanick at intel.com>
If one of the multiplicands is an ir_triop_csel and the possible
results of that csel are limited to {-1, 0, 1}, we can distribute the
multiply over the csel to eliminate the multiply:
x * mix( 0, 1, condition) => mix( 0, x, condition)
x * mix( 0, -1, condition) => mix( 0, -x, condition)
x * mix( 1, 0, condition) => mix( x, 0, condition)
x * mix( 1, -1, condition) => mix( x, -x, condition)
x * mix(-1, 0, condition) => mix(-x, 0, condition)
x * mix(-1, 1, condition) => mix(-x, x, condition)
This assumes that negation is free.
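As a sanity check on the table above, here is a minimal standalone sketch
(not part of the patch) that compares the two forms for every pair of
constants drawn from {-1, 0, 1}. The helper names csel_ref, fold_mul and
distribution_holds are hypothetical and exist only for this illustration;
csel_ref takes its operands in the same order as the csel() call in the
patch (condition first). It is only meaningful for finite x, since
Inf * 0 is NaN in the original form but 0 after the rewrite.

```cpp
#include <array>

// Hypothetical stand-in for a select: returns a when cond is true, else b.
static float csel_ref(bool cond, float a, float b)
{
   return cond ? a : b;
}

// Multiply by a constant known to be -1, 0, or 1 without a real multiply:
// the result is -x, 0, or x.
static float fold_mul(float c, float x)
{
   if (c == 0.0f)
      return 0.0f;
   return (c == 1.0f) ? x : -x;
}

// Returns true if x * csel(cond, c0, c1) equals
// csel(cond, c0 * x, c1 * x) for all c0, c1 in {-1, 0, 1} and both
// values of cond. Note that -0.0f == 0.0f compares equal.
bool distribution_holds(float x)
{
   const std::array<float, 3> consts = { -1.0f, 0.0f, 1.0f };

   for (float c0 : consts) {
      for (float c1 : consts) {
         for (bool cond : { false, true }) {
            const float original = x * csel_ref(cond, c0, c1);
            const float rewritten =
               csel_ref(cond, fold_mul(c0, x), fold_mul(c1, x));

            if (original != rewritten)
               return false;
         }
      }
   }

   return true;
}
```

Calling distribution_holds() with a few finite values, for example 3.5f
and -2.0f, exercises all nine constant pairs, including the degenerate
ones where both constants are equal.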
I have not yet investigated why this hurts more with NIR.
v2: Use swizzle_if_required on operands.
Shader-db results:
GM45 (0x2A42):
total instructions in shared programs: 3545604 -> 3545720 (0.00%)
instructions in affected programs: 94890 -> 95006 (0.12%)
helped: 83
HURT: 286
Iron Lake (0x0046):
total instructions in shared programs: 4975867 -> 4976076 (0.00%)
instructions in affected programs: 98079 -> 98288 (0.21%)
helped: 86
HURT: 389
Sandy Bridge (0x0116):
total instructions in shared programs: 6803299 -> 6802216 (-0.02%)
instructions in affected programs: 299775 -> 298692 (-0.36%)
helped: 1325
HURT: 233
GAINED: 3
Sandy Bridge (0x0116) NIR:
total instructions in shared programs: 6811661 -> 6817191 (0.08%)
instructions in affected programs: 422203 -> 427733 (1.31%)
helped: 182
HURT: 1939
Ivy Bridge (0x0166):
total instructions in shared programs: 6279602 -> 6278560 (-0.02%)
instructions in affected programs: 303149 -> 302107 (-0.34%)
helped: 1310
HURT: 283
Ivy Bridge (0x0166) NIR:
total instructions in shared programs: 6319127 -> 6324626 (0.09%)
instructions in affected programs: 401418 -> 406917 (1.37%)
helped: 182
HURT: 1929
Haswell (0x0426):
total instructions in shared programs: 5764382 -> 5764021 (-0.01%)
instructions in affected programs: 270160 -> 269799 (-0.13%)
helped: 1083
HURT: 510
Haswell (0x0426) NIR:
total instructions in shared programs: 5794178 -> 5800358 (0.11%)
instructions in affected programs: 359490 -> 365670 (1.72%)
helped: 182
HURT: 1929
Broadwell (0x162E):
total instructions in shared programs: 6812514 -> 6812047 (-0.01%)
instructions in affected programs: 260253 -> 259786 (-0.18%)
helped: 1134
HURT: 449
Broadwell (0x162E) NIR:
total instructions in shared programs: 7008390 -> 7014577 (0.09%)
instructions in affected programs: 358710 -> 364897 (1.72%)
helped: 182
HURT: 1946
GAINED: 12
Signed-off-by: Ian Romanick <ian.d.romanick at intel.com>
---
src/glsl/opt_algebraic.cpp | 51 ++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 51 insertions(+)
diff --git a/src/glsl/opt_algebraic.cpp b/src/glsl/opt_algebraic.cpp
index 837b080..b14f82d 100644
--- a/src/glsl/opt_algebraic.cpp
+++ b/src/glsl/opt_algebraic.cpp
@@ -582,6 +582,57 @@ ir_algebraic_visitor::handle_expression(ir_expression *ir)
swizzle_if_required(ir, ir->operands[i ^ 1]),
ir_constant::zero(ir, ir->type));
}
+
+ /* If one of the multiplicands is an ir_triop_csel and the possible
+ * results of that csel are limited to {-1, 0, 1}, we can distribute the
+ * multiply over the csel to eliminate the multiply:
+ *
+ * x * mix( 0, 1, condition) => mix( 0, x, condition)
+ * x * mix( 0, -1, condition) => mix( 0, -x, condition)
+ * x * mix( 1, 0, condition) => mix( x, 0, condition)
+ * x * mix( 1, -1, condition) => mix( x, -x, condition)
+ * x * mix(-1, 0, condition) => mix(-x, 0, condition)
+ * x * mix(-1, 1, condition) => mix(-x, x, condition)
+ *
+ * This assumes that negation is free.
+ */
+ for (unsigned i = 0; i < 2; i++) {
+ if (op_expr[i] == NULL || op_expr[i]->operation != ir_triop_csel)
+ continue;
+
+ ir_constant *const c[2] = {
+ op_expr[i]->operands[1]->as_constant(),
+ op_expr[i]->operands[2]->as_constant()
+ };
+
+ if (c[0] == NULL || c[1] == NULL)
+ continue;
+
+ if (!is_vec_one(c[0]) && !is_vec_zero(c[0]) &&
+ !is_vec_negative_one(c[0]))
+ continue;
+
+ if (!is_vec_one(c[1]) && !is_vec_zero(c[1]) &&
+ !is_vec_negative_one(c[1]))
+ continue;
+
+ /* We now know that the ir_triop_csel is compatible with the
+ * optimization. Assign the other multiplicand to a temporary
+ * variable and rewrite the csel.
+ */
+ ir_variable *const temp =
+ new(mem_ctx) ir_variable(ir->type,
+ "mul_over_csel",
+ ir_var_temporary);
+
+ base_ir->insert_before(temp);
+ ir_assignment *assignment = assign(temp, ir->operands[i ^ 1]);
+ base_ir->insert_before(assignment);
+
+ return csel(swizzle_if_required(ir, op_expr[i]->operands[0]),
+ swizzle_if_required(ir, handle_expression(mul(c[0], temp))),
+ swizzle_if_required(ir, handle_expression(mul(c[1], temp))));
+ }
break;
case ir_binop_div:
--
2.1.0