[Mesa-dev] [PATCH 07/12] intel/compiler: More peephole select

Thu Jun 28 22:25:39 UTC 2018

Hi,

> diff --git a/src/intel/compiler/brw_nir.c b/src/intel/compiler/brw_nir.c
> index 67c062d91f5..6a0d4090fa7 100644
> --- a/src/intel/compiler/brw_nir.c
> +++ b/src/intel/compiler/brw_nir.c
> @@ -557,7 +557,22 @@ brw_nir_optimize(nir_shader *nir, const struct brw_compiler *compiler,
>        OPT(nir_copy_prop);
>        OPT(nir_opt_dce);
>        OPT(nir_opt_cse);
> +
> +      /* Passing 0 to the peephole select pass causes it to convert
> +       * if-statements that contain only move instructions in the branches
> +       * regardless of the count.
> +       *
> +       * Passing 0 to the peephole select pass causes it to convert

Typo: this should read "Passing 1".

> +       * if-statements that contain at most a single ALU instruction (total)
> +       * in both branches.  Before Gen6, some math instructions were
> +       * prohibitively expensive and the results of compare operations need an
> +       * extra resolve step.  For these reasons, this pass is more harmful
> +       * than good on those platforms.
> +       */
>        OPT(nir_opt_peephole_select, 0);
> +      if (compiler->devinfo->gen >= 6)
> +         OPT(nir_opt_peephole_select, 1);

It is not clear to me why we run the pass twice (with 0 and then 1)
instead of either using gen >= 6 to select between 0 and 1, or running
both passes with 1 if gen >= 6 (since 1 covers 0).

I do understand that the second execution can optimize more cases, since
blocks get simplified in the first execution, but I was expecting it to
be sufficient to wait for the next iteration of the main
brw_nir_optimize loop.
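
To make the first alternative concrete, here is a rough sketch of what I
mean by selecting the limit from the generation. The struct and the
helper peephole_select_limit are hypothetical names made up for
illustration, not existing Mesa code; only the gen >= 6 check and the
0/1 values come from the patch.

```c
/* Hypothetical stand-in for the devinfo field the patch checks. */
struct devinfo_sketch {
   int gen;
};

/* Hypothetical helper: choose the limit passed to the peephole select
 * pass.  Per the patch comment, 0 converts if-statements whose branches
 * contain only moves, and 1 also allows a single ALU instruction
 * (total), which is only considered worthwhile on Gen6 and later.
 */
static unsigned
peephole_select_limit(const struct devinfo_sketch *devinfo)
{
   return devinfo->gen >= 6 ? 1 : 0;
}
```

With something like this, the call site would presumably be a single
OPT(nir_opt_peephole_select, ...) with the selected limit, relying on
the next iteration of the loop to pick up the newly simplified blocks.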

Thanks,
Caio