[Mesa-dev] [PATCH 11/12] nir: Add partial redundancy elimination for compares

Caio Marcelo de Oliveira Filho caio.oliveira at intel.com
Fri Jun 29 00:19:57 UTC 2018


Hi,

On Wed, Jun 27, 2018 at 09:46:24PM -0700, Ian Romanick wrote:
> From: Ian Romanick <ian.d.romanick at intel.com>
> 
> This pass attempts to dectect code sequences like
> 
>     if (x < y) {
>         z = y - z;

Typo "z = x - y".

 
> Currently only floating point compares and adds are supported.  Adding
> support for integer will be a challenge due to integer overflow.  There
> are a couple possible solutions, but they may not apply to all
> architectures.

Optional: consider mentioning this in the initial comment block.



> diff --git a/src/compiler/nir/nir_instr_set.c b/src/compiler/nir/nir_instr_set.c
> index 1a491f46ff4..e9371af230a 100644
> --- a/src/compiler/nir/nir_instr_set.c
> +++ b/src/compiler/nir/nir_instr_set.c
> @@ -239,7 +239,8 @@ get_neg_instr(const nir_src *s)
>  {
>     const struct nir_alu_instr *const alu = nir_src_as_alu_instr_const(s);
>  
> -   return alu->op == nir_op_fneg || alu->op == nir_op_ineg ? alu : NULL;
> +   return alu != NULL && (alu->op == nir_op_fneg || alu->op == nir_op_ineg)
> +          ? alu : NULL;
>  }

Squash this chunk into the previous patch that adds this function.


> +static struct block_instructions *
> +push_block(struct block_queue *bq)
> +{
> +   struct block_instructions *bi =
> +      (struct block_instructions *) exec_list_pop_head(&bq->reusable_blocks);
> +
> +   if (bi == NULL) {
> +      bi = calloc(1, sizeof(struct block_instructions));
> +
> +      if (bi == NULL)
> +         return NULL;

Callsites use bi->instructions without checking, so is this check here
useful?


> +static void
> +rewrite_compare_instruction(nir_builder *bld, nir_alu_instr *orig_cmp,
> +                            nir_alu_instr *orig_add, bool zero_on_left)
> +{
> +   void *const mem_ctx = ralloc_parent(orig_cmp);
> +
> +   bld->cursor = nir_before_instr(&orig_cmp->instr);
> +
> +   /* This is somewhat tricky.  The compare instruction may be something like
> +    * (fcmp, a, b) while the add instruction is something like (fadd, fneg(a),
> +    * b).  This is problematic because the SSA value for the fneg(a) may not
> +    * exist yet at the compare instruction.
> +    *
> +    * We fabricate the operands of the new add.  This is done using
> +    * information provided by zero_on_left.  If zero_on_left is true, we know
> +    * the resulting compare instruction is (fcmp, 0.0, (fadd, x, y)).  If the
> +    * original compare instruction was (fcmp, a, b), x = b and y = -a.  If
> +    * zero_on_left is false, the resulting compare instruction is (fcmp,
> +    * (fadd, x, y), 0.0) and x = a and y = -b.
> +    */
> +   nir_ssa_def *const a = nir_ssa_for_alu_src(bld, orig_cmp, 0);
> +   nir_ssa_def *const b = nir_ssa_for_alu_src(bld, orig_cmp, 1);
> +
> +   nir_ssa_def *const fadd = zero_on_left
> +      ? nir_fadd(bld, b, nir_fneg(bld, a))
> +      : nir_fadd(bld, a, nir_fneg(bld, b));
> +
> +   nir_ssa_def *const zero =
> +      nir_imm_floatN_t(bld, 0.0, orig_add->dest.dest.ssa.bit_size);
> +
> +   nir_ssa_def *const cmp = zero_on_left
> +      ? nir_build_alu(bld, orig_cmp->op, zero, fadd, NULL, NULL)
> +      : nir_build_alu(bld, orig_cmp->op, fadd, zero, NULL, NULL);
> +
> +   /* Generating extra moves of the results is the easy way to make sure the
> +    * writemasks match the original instructions.  Later optimization passes
> +    * will clean these up.
> +    */

Why it isn't sufficient to set the write_mask for the instructions
that were just created?



> +               /* The operands of both instructions are, with some liberty,
> +                * commutative.  Check all three permutations.  The third
> +                * permutaiton is a negation of the original operation, so it

Typo "permutation".


> +               } else if (nir_alu_srcs_equal(cmp, alu, 1, 0) &&
> +                          nir_alu_srcs_negative_equal(cmp, alu, 0, 1)) {
> +                  /* This is the case where (A cmp B) matches (B + -A).
> +                   *
> +                   *    A cmp B <=> 0 cmp B + -A
> +                   */

Shouldn't (-A + B) be handled here too? (If so, we have four valid
permutations).


> +                  rewrite_compare_instruction(bld, cmp, alu, true);
> +
> +                  *a = NULL;
> +                  rewrote_compare = true;
> +                  break;
> +               }
> +            }
> +
> +            /* Bail after a compare in the most dominating block is found.
> +             * This is necessary because 'alu' has been remove from the

Typo "removed".



Thanks,
Caio


More information about the mesa-dev mailing list