[Mesa-dev] [PATCH 11/12] nir: Add partial redundancy elimination for compares
Caio Marcelo de Oliveira Filho
caio.oliveira at intel.com
Fri Jun 29 00:19:57 UTC 2018
Hi,
On Wed, Jun 27, 2018 at 09:46:24PM -0700, Ian Romanick wrote:
> From: Ian Romanick <ian.d.romanick at intel.com>
>
> This pass attempts to dectect code sequences like
>
> if (x < y) {
> z = y - z;
Typo "z = x - y".
> Currently only floating point compares and adds are supported. Adding
> support for integer will be a challenge due to integer overflow. There
> are a couple possible solutions, but they may not apply to all
> architectures.
Optional: consider mentioning this in the initial comment block.
> diff --git a/src/compiler/nir/nir_instr_set.c b/src/compiler/nir/nir_instr_set.c
> index 1a491f46ff4..e9371af230a 100644
> --- a/src/compiler/nir/nir_instr_set.c
> +++ b/src/compiler/nir/nir_instr_set.c
> @@ -239,7 +239,8 @@ get_neg_instr(const nir_src *s)
> {
> const struct nir_alu_instr *const alu = nir_src_as_alu_instr_const(s);
>
> - return alu->op == nir_op_fneg || alu->op == nir_op_ineg ? alu : NULL;
> + return alu != NULL && (alu->op == nir_op_fneg || alu->op == nir_op_ineg)
> + ? alu : NULL;
> }
Squash this chunk into the previous patch that adds this function.
> +static struct block_instructions *
> +push_block(struct block_queue *bq)
> +{
> + struct block_instructions *bi =
> + (struct block_instructions *) exec_list_pop_head(&bq->reusable_blocks);
> +
> + if (bi == NULL) {
> + bi = calloc(1, sizeof(struct block_instructions));
> +
> + if (bi == NULL)
> + return NULL;
Callsites use bi->instructions without checking, so is this check here
useful?
> +static void
> +rewrite_compare_instruction(nir_builder *bld, nir_alu_instr *orig_cmp,
> + nir_alu_instr *orig_add, bool zero_on_left)
> +{
> + void *const mem_ctx = ralloc_parent(orig_cmp);
> +
> + bld->cursor = nir_before_instr(&orig_cmp->instr);
> +
> + /* This is somewhat tricky. The compare instruction may be something like
> + * (fcmp, a, b) while the add instruction is something like (fadd, fneg(a),
> + * b). This is problematic because the SSA value for the fneg(a) may not
> + * exist yet at the compare instruction.
> + *
> + * We fabricate the operands of the new add. This is done using
> + * information provided by zero_on_left. If zero_on_left is true, we know
> + * the resulting compare instruction is (fcmp, 0.0, (fadd, x, y)). If the
> + * original compare instruction was (fcmp, a, b), x = b and y = -a. If
> + * zero_on_left is false, the resulting compare instruction is (fcmp,
> + * (fadd, x, y), 0.0) and x = a and y = -b.
> + */
> + nir_ssa_def *const a = nir_ssa_for_alu_src(bld, orig_cmp, 0);
> + nir_ssa_def *const b = nir_ssa_for_alu_src(bld, orig_cmp, 1);
> +
> + nir_ssa_def *const fadd = zero_on_left
> + ? nir_fadd(bld, b, nir_fneg(bld, a))
> + : nir_fadd(bld, a, nir_fneg(bld, b));
> +
> + nir_ssa_def *const zero =
> + nir_imm_floatN_t(bld, 0.0, orig_add->dest.dest.ssa.bit_size);
> +
> + nir_ssa_def *const cmp = zero_on_left
> + ? nir_build_alu(bld, orig_cmp->op, zero, fadd, NULL, NULL)
> + : nir_build_alu(bld, orig_cmp->op, fadd, zero, NULL, NULL);
> +
> + /* Generating extra moves of the results is the easy way to make sure the
> + * writemasks match the original instructions. Later optimization passes
> + * will clean these up.
> + */
Why it isn't sufficient to set the write_mask for the instructions
that were just created?
> + /* The operands of both instructions are, with some liberty,
> + * commutative. Check all three permutations. The third
> + * permutaiton is a negation of the original operation, so it
Typo "permutation".
> + } else if (nir_alu_srcs_equal(cmp, alu, 1, 0) &&
> + nir_alu_srcs_negative_equal(cmp, alu, 0, 1)) {
> + /* This is the case where (A cmp B) matches (B + -A).
> + *
> + * A cmp B <=> 0 cmp B + -A
> + */
Shouldn't (-A + B) be handled here too? (If so, we have four valid
permutations).
> + rewrite_compare_instruction(bld, cmp, alu, true);
> +
> + *a = NULL;
> + rewrote_compare = true;
> + break;
> + }
> + }
> +
> + /* Bail after a compare in the most dominating block is found.
> + * This is necessary because 'alu' has been remove from the
Typo "removed".
Thanks,
Caio
More information about the mesa-dev
mailing list