[Mesa-dev] [PATCH 09/23] glsl: Distribute multiply over b2f
Ian Romanick
idr at freedesktop.org
Fri Mar 20 13:58:09 PDT 2015
From: Ian Romanick <ian.d.romanick at intel.com>
Convert cases like (x * float(b)) to 'mix(0, x, b)'.
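
As a rough illustration of the rewrite (not Mesa code), the scalar sketch below compares the multiply form with the select form. The helpers b2f, csel_f, mul_form and select_form are hypothetical stand-ins for the IR operations, written only for this example. For finite x the two forms give equal results; for infinite or NaN x they differ, since IEEE 754 defines Inf * 0 as NaN while the select form yields 0.

```cpp
#include <cmath>
#include <limits>

// Hypothetical scalar stand-ins for the IR operations (not Mesa APIs).
static float b2f(bool b) { return b ? 1.0f : 0.0f; }

static float csel_f(bool cond, float then_val, float else_val)
{
   return cond ? then_val : else_val;
}

// Original form: x * b2f(condition)
static float mul_form(float x, bool condition)
{
   return x * b2f(condition);
}

// Rewritten form: mix(0, x, condition), i.e. select x when the condition
// is true and zero otherwise.
static float select_form(float x, bool condition)
{
   return csel_f(condition, x, 0.0f);
}

// True when both forms compare equal for this input. Note that -0.0f and
// 0.0f compare equal, so a sign-of-zero difference is not reported here.
static bool forms_agree(float x, bool condition)
{
   return mul_form(x, condition) == select_form(x, condition);
}
```

Calling forms_agree(2.5f, true) or forms_agree(-3.0f, false) returns true, while mul_form(INFINITY, false) is NaN and select_form(INFINITY, false) is 0.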
Note: This may hurt the code generated for GPUs that represent Boolean
values using floating point. shader-db doesn't play well with i915, so
I haven't been able to check it.
On at least BDW shaders/warsow/85.shader_test is hurt by about 10%, so
that may be worth investigating.
v2: Use swizzle_if_required on operands. Fixes
arb_texture_buffer_object-formats in debug builds. Without this,
ir_validate fails on that test.
Shader-db results:
GM45 (0x2A42):
total instructions in shared programs: 3548093 -> 3545804 (-0.06%)
instructions in affected programs: 213889 -> 211600 (-1.07%)
helped: 543
HURT: 2
Iron Lake (0x0046):
total instructions in shared programs: 4978454 -> 4975982 (-0.05%)
instructions in affected programs: 226063 -> 223591 (-1.09%)
helped: 597
HURT: 5
Sandy Bridge (0x0116):
total instructions in shared programs: 6806814 -> 6803487 (-0.05%)
instructions in affected programs: 447885 -> 444558 (-0.74%)
helped: 1612
HURT: 36
Sandy Bridge (0x0116) NIR:
total instructions in shared programs: 6813527 -> 6811992 (-0.02%)
instructions in affected programs: 329020 -> 327485 (-0.47%)
helped: 1002
HURT: 196
Ivy Bridge (0x0166):
total instructions in shared programs: 6283080 -> 6279862 (-0.05%)
instructions in affected programs: 421859 -> 418641 (-0.76%)
helped: 1592
HURT: 36
Ivy Bridge (0x0166) NIR:
total instructions in shared programs: 6319944 -> 6317148 (-0.04%)
instructions in affected programs: 303221 -> 300425 (-0.92%)
helped: 998
HURT: 176
GAINED: 4
Haswell (0x0426):
total instructions in shared programs: 5766971 -> 5764623 (-0.04%)
instructions in affected programs: 382796 -> 380448 (-0.61%)
helped: 1559
HURT: 63
Haswell (0x0426) NIR:
total instructions in shared programs: 5793258 -> 5792647 (-0.01%)
instructions in affected programs: 276929 -> 276318 (-0.22%)
helped: 837
HURT: 343
GAINED: 4
Broadwell (0x162E):
total instructions in shared programs: 6813995 -> 6811377 (-0.04%)
instructions in affected programs: 469734 -> 467116 (-0.56%)
helped: 1772
HURT: 78
LOST: 1
Broadwell (0x162E) NIR:
total instructions in shared programs: 7009761 -> 7009142 (-0.01%)
instructions in affected programs: 298433 -> 297814 (-0.21%)
helped: 866
HURT: 373
Signed-off-by: Ian Romanick <ian.d.romanick at intel.com>
---
src/glsl/opt_algebraic.cpp | 22 ++++++++++++++++++++++
1 file changed, 22 insertions(+)
diff --git a/src/glsl/opt_algebraic.cpp b/src/glsl/opt_algebraic.cpp
index 69c03ea..837b080 100644
--- a/src/glsl/opt_algebraic.cpp
+++ b/src/glsl/opt_algebraic.cpp
@@ -560,6 +560,28 @@ ir_algebraic_visitor::handle_expression(ir_expression *ir)
}
}
}
+
+ /* If one of the multiplicands is an ir_unop_b2f, we can convert the
+ * multiply to a simple csel.
+ *
+ * x * b2f(condition) => mix( 0, x, condition)
+ */
+ for (unsigned i = 0; i < 2; i++) {
+ if (op_expr[i] == NULL)
+ continue;
+
+ if (op_expr[i]->operation != ir_unop_b2f)
+ continue;
+
+ /* swizzle_if_required is necessary on both operands. The b2f could
+ * be a scalar (common) with the other a vector, or the b2f could be
+ * a vector with the other a scalar (as in piglit's
+ * arb_texture_buffer_object-formats test).
+ */
+ return csel(swizzle_if_required(ir, op_expr[i]->operands[0]),
+ swizzle_if_required(ir, ir->operands[i ^ 1]),
+ ir_constant::zero(ir, ir->type));
+ }
break;
case ir_binop_div:
--
2.1.0