[Mesa-dev] [RFC] nir/opt_peephole_ffma: Bypass fusion if any operand of fadd and fmul is a const

Jason Ekstrand jason at jlekstrand.net
Sat Sep 19 12:00:21 PDT 2015


On Sep 18, 2015 2:49 AM, "Eduardo Lima Mitev" <elima at igalia.com> wrote:
>
> When both the fadd and fmul instructions have at least one operand that is
> a constant used only once, the total number of instructions can be reduced
> from 3 (1 ffma + 2 load_const) to 2 (1 fmul + 1 fadd), because the
> constants will be propagated as immediate operands of fmul and fadd.

Right... I'm pretty sure that I've written this patch once before.  At the
time we didn't do this because we thought Matt's constant combining would
take care of most of those issues and we could just emit the MAD.  However,
that pass doesn't always combine things optimally and doesn't exist for
vec4.
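
To make the instruction counting in the quoted paragraph concrete, here is a
minimal sketch of the cost comparison. It is a toy model, not Mesa code: the
struct and function names are hypothetical, and it assumes (as the patch
description does for i965) that the fused op needs a load_const per constant
while fmul and fadd can take a single-use constant as an immediate.

```c
#include <stdbool.h>

/* Hypothetical description of an fmul feeding an fadd (a * b + c).
 * Not a NIR type; it only records what the heuristic looks at. */
struct fma_candidate {
   bool mul_has_single_use_const; /* a or b is a constant used only here */
   bool add_has_single_use_const; /* c is a constant used only here */
};

/* Fused form: one ffma, plus one load_const per constant operand
 * (assumption: the fused op cannot encode immediates). */
static unsigned
cost_fused(struct fma_candidate c)
{
   return 1 + (c.mul_has_single_use_const ? 1 : 0)
            + (c.add_has_single_use_const ? 1 : 0);
}

/* Unfused form: fmul + fadd, with each single-use constant folded
 * into its instruction as an immediate (assumption). */
static unsigned
cost_unfused(struct fma_candidate c)
{
   (void)c;
   return 2;
}

/* Skip fusion only when the unfused form is strictly cheaper. */
static bool
should_skip_fusion(struct fma_candidate c)
{
   return cost_unfused(c) < cost_fused(c);
}
```

Under these assumptions, fusion is skipped only when both the fmul and the
fadd have a single-use constant (3 instructions fused versus 2 unfused), which
is the same condition the patch tests. With one constant the two forms tie at
2 instructions, and with none the ffma wins.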

> This patch modifies the opt_peephole_ffma pass to detect this situation
> and bail out of fusing fmul+fadd into ffma.
>
> As shown in the shader-db results below, it seems to help a good bunch.
> However, there are some caveats:
>
> * It seems i965 specific, so I'm not sure if modifying the NIR pass
> directly is desired, as opposed to moving this to the backend.
>
> * There is still a high number of HURTs, but these could be reduced by
> being more specific in the conditions to bail out.
>
> total instructions in shared programs: 1683959 -> 1677447 (-0.39%)
> instructions in affected programs:     604918 -> 598406 (-1.08%)
> helped:                                4633
> HURT:                                  804
> GAINED:                                0
> LOST:                                  0

What does that look like if you split that between vec4 and fs?

> ---
>  src/glsl/nir/nir_opt_peephole_ffma.c | 31 +++++++++++++++++++++++++++++++
>  1 file changed, 31 insertions(+)
>
> diff --git a/src/glsl/nir/nir_opt_peephole_ffma.c b/src/glsl/nir/nir_opt_peephole_ffma.c
> index 4f0f0da..da47f8f 100644
> --- a/src/glsl/nir/nir_opt_peephole_ffma.c
> +++ b/src/glsl/nir/nir_opt_peephole_ffma.c
> @@ -133,6 +133,28 @@ get_mul_for_src(nir_alu_src *src, int num_components,
>     return alu;
>  }
>
> +/**
> + * Given a list of (at least two) nir_alu_src's, tells if any of them is a
> + * constant value and is used only once.
> + */
> +static bool
> +any_alu_src_is_a_constant(nir_alu_src srcs[])
> +{
> +   for (unsigned i = 0; i < 2; i++) {
> +      if (srcs[i].src.ssa->parent_instr->type == nir_instr_type_load_const) {
> +         nir_load_const_instr *load_const =
> +            nir_instr_as_load_const (srcs[i].src.ssa->parent_instr);
> +
> +         if (list_length(&load_const->def.uses) == 1 &&
> +             list_length(&load_const->def.if_uses) == 0) {
> +            return true;
> +         }
> +      }
> +   }
> +
> +   return false;
> +}
> +
>  static bool
>  nir_opt_peephole_ffma_block(nir_block *block, void *void_state)
>  {
> @@ -183,6 +205,15 @@ nir_opt_peephole_ffma_block(nir_block *block, void *void_state)
>        mul_src[0] = mul->src[0].src.ssa;
>        mul_src[1] = mul->src[1].src.ssa;
>
> +      /* If any of the operands of the fmul and any of the fadd is a constant,
> +       * we bypass because it will be more efficient as the constants will be
> +       * propagated as operands, potentially saving two load_const instructions.
> +       */
> +      if (any_alu_src_is_a_constant(mul->src) &&
> +          any_alu_src_is_a_constant(add->src)) {
> +         continue;
> +      }
> +
>        if (abs) {
>           for (unsigned i = 0; i < 2; i++) {
>              nir_alu_instr *abs = nir_alu_instr_create(state->mem_ctx,
> --
> 2.4.6
>