[Mesa-dev] [PATCH 11/12] nir: Add partial redundancy elimination for compares

Ian Romanick idr at freedesktop.org
Tue Jul 3 00:57:53 UTC 2018


On 06/28/2018 05:19 PM, Caio Marcelo de Oliveira Filho wrote:
> Hi,
> 
> On Wed, Jun 27, 2018 at 09:46:24PM -0700, Ian Romanick wrote:
>> From: Ian Romanick <ian.d.romanick at intel.com>
>>
>> This pass attempts to dectect code sequences like
>>
>>     if (x < y) {
>>         z = y - z;
> 
> Typo "z = x - y".
> 
>  
>> Currently only floating point compares and adds are supported.  Adding
>> support for integer will be a challenge due to integer overflow.  There
>> are a couple possible solutions, but they may not apply to all
>> architectures.
> 
> Optional: consider mentioning this in the initial comment block.
> 
> 
> 
>> diff --git a/src/compiler/nir/nir_instr_set.c b/src/compiler/nir/nir_instr_set.c
>> index 1a491f46ff4..e9371af230a 100644
>> --- a/src/compiler/nir/nir_instr_set.c
>> +++ b/src/compiler/nir/nir_instr_set.c
>> @@ -239,7 +239,8 @@ get_neg_instr(const nir_src *s)
>>  {
>>     const struct nir_alu_instr *const alu = nir_src_as_alu_instr_const(s);
>>  
>> -   return alu->op == nir_op_fneg || alu->op == nir_op_ineg ? alu : NULL;
>> +   return alu != NULL && (alu->op == nir_op_fneg || alu->op == nir_op_ineg)
>> +          ? alu : NULL;
>>  }
> 
> Squash this chunk into the previous patch that adds this function.

Yes... that was the original intention, but it looks like a squashed it
into the wrong patch.

>> +static struct block_instructions *
>> +push_block(struct block_queue *bq)
>> +{
>> +   struct block_instructions *bi =
>> +      (struct block_instructions *) exec_list_pop_head(&bq->reusable_blocks);
>> +
>> +   if (bi == NULL) {
>> +      bi = calloc(1, sizeof(struct block_instructions));
>> +
>> +      if (bi == NULL)
>> +         return NULL;
> 
> Callsites use bi->instructions without checking, so is this check here
> useful?

I think I'm going to fix the callsites.  Adding this was a bit of an
afterthought, and I forgot to go back and fix the callsites.  I don't
want to have to field a Coverity report later. ;)

>> +static void
>> +rewrite_compare_instruction(nir_builder *bld, nir_alu_instr *orig_cmp,
>> +                            nir_alu_instr *orig_add, bool zero_on_left)
>> +{
>> +   void *const mem_ctx = ralloc_parent(orig_cmp);
>> +
>> +   bld->cursor = nir_before_instr(&orig_cmp->instr);
>> +
>> +   /* This is somewhat tricky.  The compare instruction may be something like
>> +    * (fcmp, a, b) while the add instruction is something like (fadd, fneg(a),
>> +    * b).  This is problematic because the SSA value for the fneg(a) may not
>> +    * exist yet at the compare instruction.
>> +    *
>> +    * We fabricate the operands of the new add.  This is done using
>> +    * information provided by zero_on_left.  If zero_on_left is true, we know
>> +    * the resulting compare instruction is (fcmp, 0.0, (fadd, x, y)).  If the
>> +    * original compare instruction was (fcmp, a, b), x = b and y = -a.  If
>> +    * zero_on_left is false, the resulting compare instruction is (fcmp,
>> +    * (fadd, x, y), 0.0) and x = a and y = -b.
>> +    */
>> +   nir_ssa_def *const a = nir_ssa_for_alu_src(bld, orig_cmp, 0);
>> +   nir_ssa_def *const b = nir_ssa_for_alu_src(bld, orig_cmp, 1);
>> +
>> +   nir_ssa_def *const fadd = zero_on_left
>> +      ? nir_fadd(bld, b, nir_fneg(bld, a))
>> +      : nir_fadd(bld, a, nir_fneg(bld, b));
>> +
>> +   nir_ssa_def *const zero =
>> +      nir_imm_floatN_t(bld, 0.0, orig_add->dest.dest.ssa.bit_size);
>> +
>> +   nir_ssa_def *const cmp = zero_on_left
>> +      ? nir_build_alu(bld, orig_cmp->op, zero, fadd, NULL, NULL)
>> +      : nir_build_alu(bld, orig_cmp->op, fadd, zero, NULL, NULL);
>> +
>> +   /* Generating extra moves of the results is the easy way to make sure the
>> +    * writemasks match the original instructions.  Later optimization passes
>> +    * will clean these up.
>> +    */
> 
> Why it isn't sufficient to set the write_mask for the instructions
> that were just created?

Because there may be other users that use more of the result.  The
compare may only use x, but the other components may be used other
places.  I copied this idiom from nir_replace_instr in nir_search.c.  It
took me some time to re-find the source of this, so I'll add mention of
that to the comment.

>> +               /* The operands of both instructions are, with some liberty,
>> +                * commutative.  Check all three permutations.  The third
>> +                * permutaiton is a negation of the original operation, so it
> 
> Typo "permutation".

Fixed locally.

>> +               } else if (nir_alu_srcs_equal(cmp, alu, 1, 0) &&
>> +                          nir_alu_srcs_negative_equal(cmp, alu, 0, 1)) {
>> +                  /* This is the case where (A cmp B) matches (B + -A).
>> +                   *
>> +                   *    A cmp B <=> 0 cmp B + -A
>> +                   */
> 
> Shouldn't (-A + B) be handled here too? (If so, we have four valid
> permutations).

Yeah... I don't know why I didn't do that too.  It looks like on SKL 56
shaders in shader-db have this missing pattern.

>> +                  rewrite_compare_instruction(bld, cmp, alu, true);
>> +
>> +                  *a = NULL;
>> +                  rewrote_compare = true;
>> +                  break;
>> +               }
>> +            }
>> +
>> +            /* Bail after a compare in the most dominating block is found.
>> +             * This is necessary because 'alu' has been remove from the
> 
> Typo "removed".

Fixed locally.

> Thanks,
> Caio


More information about the mesa-dev mailing list