[Mesa-dev] [PATCH 8/9] i965/nir: Run the ffma peephole after the rest of the optimizations

Matt Turner mattst88 at gmail.com
Mon Mar 23 20:34:30 PDT 2015


On Mon, Mar 23, 2015 at 8:13 PM, Jason Ekstrand <jason at jlekstrand.net> wrote:
> The idea here is that fusing multiply-add combinations too early can reduce
> our ability to perform CSE and value-numbering.  Instead, we split ffma
> opcodes up-front, hope CSE cleans up, and then fuse after-the-fact.
> Unless an algebraic pass does something silly where it inserts something
> between the multiply and the add, splitting and re-fusing should never
> cause a problem.  We run the late algebraic optimizations after this so
> that things like compare-with-zero don't hurt our ability to fuse things.
>
> shader-db results for fragment shaders on Haswell:
> total instructions in shared programs: 4390538 -> 4379236 (-0.26%)
> instructions in affected programs:     989359 -> 978057 (-1.14%)
> helped:                                5308
> HURT:                                  97
> GAINED:                                78
> LOST:                                  5
>
> This does, unfortunately, cause some substantial hurt to a shader in Kerbal
> Space Program.  However, the damage is caused by changing a single
> instruction from a ffma to an add.  This, in turn, *decreases* register
> pressure in one part of the program causing it to fail to register allocate
> and spill.  Given the overwhelmingly positive results in other shaders and
> the fact that the NIR for the Kerbal shaders is actually better, this
> should be considered a positive.
> ---
>  src/mesa/drivers/dri/i965/brw_context.c  |  1 +
>  src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 10 ++++++++--
>  2 files changed, 9 insertions(+), 2 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_context.c b/src/mesa/drivers/dri/i965/brw_context.c
> index ed6fdff..7cab1c9 100644
> --- a/src/mesa/drivers/dri/i965/brw_context.c
> +++ b/src/mesa/drivers/dri/i965/brw_context.c
> @@ -558,6 +558,7 @@ brw_initialize_context_constants(struct brw_context *brw)
>
>     static const nir_shader_compiler_options gen6_nir_options = {
>        .native_integers = true,
> +      .lower_ffma = true,

I see what's going on, but this probably deserves a comment.

>     };
>
>     /* We want the GLSL compiler to emit code that uses condition codes */
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> index cdf9b01..b0b86e3 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> @@ -51,8 +51,6 @@ nir_optimize(nir_shader *nir)
>        nir_validate_shader(nir);
>        progress |= nir_opt_algebraic(nir);
>        nir_validate_shader(nir);
> -      progress |= nir_opt_peephole_ffma(nir);
> -      nir_validate_shader(nir);
>        progress |= nir_opt_constant_folding(nir);
>        nir_validate_shader(nir);
>        progress |= nir_opt_remove_phis(nir);
> @@ -131,6 +129,12 @@ fs_visitor::emit_nir_code()
>
>     nir_optimize(nir);
>
> +   if (brw->gen >= 6) {
> +      /* Try and fuse multiply-adds */
> +      nir_opt_peephole_ffma(nir);
> +      nir_validate_shader(nir);
> +   }
> +
>     nir_opt_algebraic_late(nir);
>     nir_validate_shader(nir);
>
> @@ -141,6 +145,8 @@ fs_visitor::emit_nir_code()
>     nir_validate_shader(nir);
>     nir_copy_prop(nir);
>     nir_validate_shader(nir);
> +   nir_opt_dce(nir);
> +   nir_validate_shader(nir);

Just to confirm: this did help something?

>
>     if (unlikely(debug_enabled)) {
>        fprintf(stderr, "NIR (SSA form) for %s shader:\n", stage_name);
> --
> 2.3.3


More information about the mesa-dev mailing list