[Mesa-dev] [PATCH 2/2] i965/nir/opt_peephole_ffma: Bypass fusion if any operand of fadd and fmul is a const
Jason Ekstrand
jason at jlekstrand.net
Thu Oct 22 08:57:36 PDT 2015
On Thu, Oct 22, 2015 at 7:12 AM, Eduardo Lima Mitev <elima at igalia.com> wrote:
> When both fadd and fmul instructions have at least one operand that is a
> constant and it is only used once, the total number of instructions can
> be reduced from 3 (1 ffma + 2 load_const) to 2 (1 fmul + 1 fadd), because
> the constants will be propagated as immediate operands of fmul and fadd.
>
> This patch detects these situations and prevents fusing fmul+fadd into ffma.
>
> Shader-db results on i965 Haswell:
>
> total instructions in shared programs: 6240407 -> 6230467 (-0.16%)
> instructions in affected programs: 1126478 -> 1116538 (-0.88%)
> total loops in shared programs: 1979 -> 1979 (0.00%)
> helped: 7612
> HURT: 843
> GAINED: 4
> LOST: 0
> ---
> .../drivers/dri/i965/brw_nir_opt_peephole_ffma.c | 31 ++++++++++++++++++++++
> 1 file changed, 31 insertions(+)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_nir_opt_peephole_ffma.c b/src/mesa/drivers/dri/i965/brw_nir_opt_peephole_ffma.c
> index a8448e7..0c60528 100644
> --- a/src/mesa/drivers/dri/i965/brw_nir_opt_peephole_ffma.c
> +++ b/src/mesa/drivers/dri/i965/brw_nir_opt_peephole_ffma.c
> @@ -133,6 +133,28 @@ get_mul_for_src(nir_alu_src *src, int num_components,
> return alu;
> }
>
> +/**
> + * Given a list of (at least two) nir_alu_src's, tells if any of them is a
> + * constant value and is used only once.
> + */
> +static bool
> +any_alu_src_is_a_constant(nir_alu_src srcs[])
> +{
> +   for (unsigned i = 0; i < 2; i++) {
> +      if (srcs[i].src.ssa->parent_instr->type == nir_instr_type_load_const) {
> +         nir_load_const_instr *load_const =
> +            nir_instr_as_load_const (srcs[i].src.ssa->parent_instr);
> +
> +         if (list_length(&load_const->def.uses) == 1 &&

I've been meaning to add a list_is_singular helper. Checking for
exactly one entry is much faster than calling list_length, which has
to walk the whole list. If you'd like, go ahead and add such a helper
and use it. Otherwise, list_length is OK here.

> +             list_length(&load_const->def.if_uses) == 0) {

list_empty would be better here.

> +            return true;
> +         }
> +      }
> +   }
> +
> +   return false;
> +}
> +
> static bool
> brw_nir_opt_peephole_ffma_block(nir_block *block, void *void_state)
> {
> @@ -183,6 +205,15 @@ brw_nir_opt_peephole_ffma_block(nir_block *block, void *void_state)
>        mul_src[0] = mul->src[0].src.ssa;
>        mul_src[1] = mul->src[1].src.ssa;
>
> +      /* If any of the operands of the fmul and any of the fadd is a constant,
> +       * we bypass because it will be more efficient as the constants will be
> +       * propagated as operands, potentially saving two load_const instructions.
> +       */
> +      if (any_alu_src_is_a_constant(mul->src) &&
> +          any_alu_src_is_a_constant(add->src)) {
> +         continue;
> +      }
> +
>        if (abs) {
>           for (unsigned i = 0; i < 2; i++) {
>              nir_alu_instr *abs = nir_alu_instr_create(state->mem_ctx,
> --
> 2.5.3
>
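
For reference, here is a minimal sketch of the kind of list_is_singular
helper suggested in the review comments, next to an emptiness check and
a length walk for comparison. It assumes a circular doubly-linked list
with a sentinel head. The sketch_* names and the struct are hypothetical,
self-contained stand-ins for illustration and are not the actual Mesa
list code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in: circular doubly-linked list with a sentinel head. */
struct sketch_list_head {
   struct sketch_list_head *prev;
   struct sketch_list_head *next;
};

/* An empty list is a head that points at itself in both directions. */
static inline void
sketch_list_init(struct sketch_list_head *head)
{
   head->prev = head;
   head->next = head;
}

/* Insert item directly after head. */
static inline void
sketch_list_add(struct sketch_list_head *item, struct sketch_list_head *head)
{
   item->prev = head;
   item->next = head->next;
   head->next->prev = item;
   head->next = item;
}

/* O(1): no entries if the head's next pointer is the head itself. */
static inline bool
sketch_list_empty(const struct sketch_list_head *head)
{
   return head->next == head;
}

/* O(1): exactly one entry if the list is non-empty and the first
 * entry links straight back to the head. */
static inline bool
sketch_list_is_singular(const struct sketch_list_head *head)
{
   return head->next != head && head->next->next == head;
}

/* O(n): walks every entry, which is what the singular check avoids. */
static inline unsigned
sketch_list_length(const struct sketch_list_head *head)
{
   assert(head != NULL);
   unsigned n = 0;
   for (const struct sketch_list_head *it = head->next; it != head;
        it = it->next)
      n++;
   return n;
}
```

With helpers like these, the condition in the patch could read as
"uses is singular and if_uses is empty" without walking either list.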