[Mesa-dev] [PATCH 8/9] i965/nir: Run the ffma peephole after the rest of the optimizations
Matt Turner
mattst88 at gmail.com
Mon Mar 23 20:34:30 PDT 2015
On Mon, Mar 23, 2015 at 8:13 PM, Jason Ekstrand <jason at jlekstrand.net> wrote:
> The idea here is that fusing multiply-add combinations too early can reduce
> our ability to perform CSE and value-numbering. Instead, we split ffma
> opcodes up-front, hope CSE cleans up, and then fuse after-the-fact.
> Unless an algebraic pass does something silly where it inserts something
> between the multiply and the add, splitting and re-fusing should never
> cause a problem. We run the late algebraic optimizations after this so
> that things like compare-with-zero don't hurt our ability to fuse things.
>
> shader-db results for fragment shaders on Haswell:
> total instructions in shared programs: 4390538 -> 4379236 (-0.26%)
> instructions in affected programs: 989359 -> 978057 (-1.14%)
> helped: 5308
> HURT: 97
> GAINED: 78
> LOST: 5
>
> This does, unfortunately, cause some substantial hurt to a shader in Kerbal
> Space Program. However, the damage is caused by changing a single
> instruction from a ffma to an add. This, in turn, *decreases* register
> pressure in one part of the program causing it to fail to register allocate
> and spill. Given the overwhelmingly positive results in other shaders and
> the fact that the NIR for the Kerbal shaders is actually better, this
> should be considered a positive.
> ---
> src/mesa/drivers/dri/i965/brw_context.c | 1 +
> src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 10 ++++++++--
> 2 files changed, 9 insertions(+), 2 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_context.c b/src/mesa/drivers/dri/i965/brw_context.c
> index ed6fdff..7cab1c9 100644
> --- a/src/mesa/drivers/dri/i965/brw_context.c
> +++ b/src/mesa/drivers/dri/i965/brw_context.c
> @@ -558,6 +558,7 @@ brw_initialize_context_constants(struct brw_context *brw)
>
> static const nir_shader_compiler_options gen6_nir_options = {
> .native_integers = true,
> + .lower_ffma = true,
I see what's going on, but this probably deserves a comment.
> };
>
> /* We want the GLSL compiler to emit code that uses condition codes */
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> index cdf9b01..b0b86e3 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> @@ -51,8 +51,6 @@ nir_optimize(nir_shader *nir)
> nir_validate_shader(nir);
> progress |= nir_opt_algebraic(nir);
> nir_validate_shader(nir);
> - progress |= nir_opt_peephole_ffma(nir);
> - nir_validate_shader(nir);
> progress |= nir_opt_constant_folding(nir);
> nir_validate_shader(nir);
> progress |= nir_opt_remove_phis(nir);
> @@ -131,6 +129,12 @@ fs_visitor::emit_nir_code()
>
> nir_optimize(nir);
>
> + if (brw->gen >= 6) {
> + /* Try and fuse multiply-adds */
> + nir_opt_peephole_ffma(nir);
> + nir_validate_shader(nir);
> + }
> +
> nir_opt_algebraic_late(nir);
> nir_validate_shader(nir);
>
> @@ -141,6 +145,8 @@ fs_visitor::emit_nir_code()
> nir_validate_shader(nir);
> nir_copy_prop(nir);
> nir_validate_shader(nir);
> + nir_opt_dce(nir);
> + nir_validate_shader(nir);
Just to confirm: this did help something?
>
> if (unlikely(debug_enabled)) {
> fprintf(stderr, "NIR (SSA form) for %s shader:\n", stage_name);
> --
> 2.3.3
More information about the mesa-dev
mailing list