<p dir="ltr"><br>
On Sep 18, 2015 2:49 AM, "Eduardo Lima Mitev" <<a href="mailto:elima@igalia.com">elima@igalia.com</a>> wrote:<br>
><br>
> When both fadd and fmul instructions have at least one operand that is a<br>
> constant and it is only used once, the total number of instructions can<br>
> be reduced from 3 (1 ffma + 2 load_const) to 2 (1 fmul + 1 fadd); because<br>
> the constants will be propagated as immediate operands of fmul and fadd.</p>
<p dir="ltr">Right... I'm pretty sure that I've written this patch once before. At the time we didn't do this because we thought Matt's constant combining would take care of most of those issues and we could just emit the MAD. However, that pass doesn't always combine things optimally and doesn't exist for vec4.</p>
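<p dir="ltr">For reference, the instruction-count argument in the quoted commit message can be modelled in a few lines of C. This is only a sketch: the function names are hypothetical, and it assumes (as the commit message implies for i965) that a three-source ffma cannot take immediates, while fmul and fadd can each take one single-use constant as an immediate.</p>

```c
#include <stdbool.h>

/* Hypothetical model, not Mesa code.
 * mul_const / add_const: the fmul / fadd has a single-use constant operand. */

/* Fused form: 1 ffma plus one load_const per constant operand, assuming
 * the ffma cannot encode immediates. */
static unsigned
count_fused(bool mul_const, bool add_const)
{
   return 1 + (mul_const ? 1 : 0) + (add_const ? 1 : 0);
}

/* Split form: 1 fmul + 1 fadd, assuming each constant is folded into its
 * instruction as an immediate, so no load_const remains. */
static unsigned
count_split(bool mul_const, bool add_const)
{
   (void) mul_const;
   (void) add_const;
   return 2;
}

/* Skip the fusion only when the split form is strictly cheaper. */
static bool
should_skip_fusion(bool mul_const, bool add_const)
{
   return count_split(mul_const, add_const) < count_fused(mul_const, add_const);
}
```

<p dir="ltr">Under these assumptions the split form only wins when both instructions have a constant operand (3 instructions versus 2), which matches the condition the patch tests. With a single constant the two forms tie at 2 instructions.</p>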
<p dir="ltr">> This patch modifies the opt_peephole_ffma pass to detect this situation and<br>
> bail out of fusing fmul+fadd into ffma.<br>
><br>
> As shown in shader-db results below, it seems to help a good bunch. However,<br>
> there are some caveats:<br>
><br>
> * It seems i965 specific, so I'm not sure if modifying the NIR pass<br>
> directly is desired, as opposed to moving this to the backend.<br>
><br>
> * There is still a high number of HURTs, but these could be reduced by<br>
> making the bail-out conditions more specific.<br>
><br>
> total instructions in shared programs: 1683959 -> 1677447 (-0.39%)<br>
> instructions in affected programs: 604918 -> 598406 (-1.08%)<br>
> helped: 4633<br>
> HURT: 804<br>
> GAINED: 0<br>
> LOST: 0</p>
<p dir="ltr">What does that look like if you split that between vec4 and fs?</p>
<p dir="ltr">> ---<br>
> src/glsl/nir/nir_opt_peephole_ffma.c | 31 +++++++++++++++++++++++++++++++<br>
> 1 file changed, 31 insertions(+)<br>
><br>
> diff --git a/src/glsl/nir/nir_opt_peephole_ffma.c b/src/glsl/nir/nir_opt_peephole_ffma.c<br>
> index 4f0f0da..da47f8f 100644<br>
> --- a/src/glsl/nir/nir_opt_peephole_ffma.c<br>
> +++ b/src/glsl/nir/nir_opt_peephole_ffma.c<br>
> @@ -133,6 +133,28 @@ get_mul_for_src(nir_alu_src *src, int num_components,<br>
> return alu;<br>
> }<br>
><br>
> +/**<br>
> + * Given a list of (at least two) nir_alu_src's, tells if any of them is a<br>
> + * constant value and is used only once.<br>
> + */<br>
> +static bool<br>
> +any_alu_src_is_a_constant(nir_alu_src srcs[])<br>
> +{<br>
> + for (unsigned i = 0; i < 2; i++) {<br>
> + if (srcs[i].src.ssa->parent_instr->type == nir_instr_type_load_const) {<br>
> + nir_load_const_instr *load_const =<br>
> + nir_instr_as_load_const (srcs[i].src.ssa->parent_instr);<br>
> +<br>
> + if (list_length(&load_const->def.uses) == 1 &&<br>
> + list_length(&load_const->def.if_uses) == 0) {<br>
> + return true;<br>
> + }<br>
> + }<br>
> + }<br>
> +<br>
> + return false;<br>
> +}<br>
> +<br>
> static bool<br>
> nir_opt_peephole_ffma_block(nir_block *block, void *void_state)<br>
> {<br>
> @@ -183,6 +205,15 @@ nir_opt_peephole_ffma_block(nir_block *block, void *void_state)<br>
> mul_src[0] = mul->src[0].src.ssa;<br>
> mul_src[1] = mul->src[1].src.ssa;<br>
><br>
> + /* If any of the operands of the fmul and any of the fadd is a constant,<br>
> + * we bypass because it will be more efficient as the constants will be<br>
> + * propagated as operands, potentially saving two load_const instructions.<br>
> + */<br>
> + if (any_alu_src_is_a_constant(mul->src) &&<br>
> + any_alu_src_is_a_constant(add->src)) {<br>
> + continue;<br>
> + }<br>
> +<br>
> if (abs) {<br>
> for (unsigned i = 0; i < 2; i++) {<br>
> nir_alu_instr *abs = nir_alu_instr_create(state->mem_ctx,<br>
> --<br>
> 2.4.6<br>
</p>
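<p dir="ltr">To experiment with the bail-out condition outside of Mesa, the check in the quoted helper can be modelled with plain structs. This is a rough sketch under stated assumptions: struct model_def and the model_* functions are hypothetical stand-ins, not NIR types or API, and the use counts are stored directly instead of being computed with list_length().</p>

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an SSA definition; not a Mesa type. */
struct model_def {
   bool is_load_const;    /* produced by a load_const instruction */
   unsigned num_uses;     /* uses as an instruction source */
   unsigned num_if_uses;  /* uses as an if-condition */
};

/* True for a constant that has exactly one use and no if-uses. */
static bool
model_is_single_use_const(const struct model_def *def)
{
   return def->is_load_const && def->num_uses == 1 && def->num_if_uses == 0;
}

/* True if any of the given sources is a single-use constant. */
static bool
model_any_src_is_single_use_const(const struct model_def *srcs, size_t num_srcs)
{
   for (size_t i = 0; i < num_srcs; i++) {
      if (model_is_single_use_const(&srcs[i]))
         return true;
   }
   return false;
}

/* Mirrors the condition in the patch: bail out of the fusion only when
 * both the fmul and the fadd have a single-use constant operand. */
static bool
model_should_bail(const struct model_def mul_srcs[2],
                  const struct model_def add_srcs[2])
{
   return model_any_src_is_single_use_const(mul_srcs, 2) &&
          model_any_src_is_single_use_const(add_srcs, 2);
}
```

<p dir="ltr">Passing the source count explicitly, rather than hard-coding 2 inside the loop, is a choice made for this sketch only. The patch itself iterates over exactly two sources.</p>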