[Mesa-dev] [PATCH 3/3] intel/compiler: add an optimization pass for booleans
Caio Marcelo de Oliveira Filho
caio.oliveira at intel.com
Thu Jul 5 22:47:43 UTC 2018
(I had to stop reading to go home last Tuesday, so here are the
remaining comments.)
On Tue, May 15, 2018 at 01:05:21PM +0200, Iago Toral Quiroga wrote:
> NIR assumes that all booleans are 32-bit but Intel hardware produces
> booleans of the same size as the operands to the CMP instruction, so we
> can actually have 8-bit and 16-bit booleans. To work around this
> mismatch between NIR and the hardware, we emit boolean conversions to
> 32-bit right after emitting the CMP instruction during the NIR->FS
> pass, which makes interfacing with NIR a lot easier, but can leave
> unnecessary boolean conversions in the shader code.
Question: have you explored handling this at the NIR->FS conversion?
I.e. instead of generate the cmp + mov and then look for the cmp + mov
to fix up; when generating a cmp, perform those checks (at nir level)
and then pick the right bitsize.
> +/**
> + * Propagates the bit-size of the destination of a boolean instruction to
> + * all its consumers. If propagate_from_source is True, then the producer
> + * is a conversion MOV from a low bit-size boolean to 32-bit, and in that
> + * case the propagation happens from the source of the instruction instead
> + * of its destination.
> + */
> +static bool
> +propagate_bool_bit_size(fs_inst *inst, bool propagate_from_source)
> +{
> + assert(!propagate_from_source || inst->opcode == BRW_OPCODE_MOV);
> +
> + bool progress = false;
> +
> + const unsigned bit_size = 8 * (propagate_from_source ?
> + type_sz(inst->src[0].type) : type_sz(inst->dst.type));
> +
> + /* Look for any follow-up instructions that sources from the boolean
> + * result of the producer instruction and rewrite them to use the correct
> + * bit-size.
> + */
> + foreach_inst_in_block_starting_from(fs_inst, fixup_inst, inst) {
> + if (!inst_supports_boolean(fixup_inst))
> + continue;
Should we care about other instruction clobbering the contents of
inst->dst, or at this point of the optimization we can count on it not
being?
> + /* If it is a plain boolean conversion to 32-bit, then look for any
> + * follow-up instructions that source from the 32-bit boolean and
> + * rewrite them to source from the output of the CMP (which is the
> + * source of the conversion instruction) directly if possible.
> + */
> + progress = propagate_bool_bit_size(conv_inst, true) || progress;
> + }
> +#if 0
> + else if (inst_supports_boolean(inst) && inst->sources > 1) {
If you end up enabling this section, I suggest move the
inst_supports_boolean() check to the beginning of the for-loop, as an
early return. Makes the condition for the cases we are handling
cleaner.
> + /* For all logical instructions that can take more than one operand
> + * we need to ensure that all of them have matching bit-sizes. If they
> + * don't, it means that the original shader code is operating boolean
> + * expressions with different native bit-sizes and we need to choose
> + * a canonical boolean form for all the operands, which requires to
> + * inject conversions to temporaries. We choose the bit-size of the
> + * destination as the canonical form (which must be a 32-bit boolean
> + * since we only propagate smaller bit-sizes to the destination if we
> + * managed to convert all the operands to the same bit-size) because
> + * that way we don't need to worry about propagating the destination
> + * bit-size down the line.
> + */
To make this comment less heavy, I'd move the assumption about the
destination being 32-bit right above the assert, which is kind of an
explanation of the assertion.
> @@ -6240,6 +6471,7 @@ fs_visitor::optimize()
>
> OPT(opt_drop_redundant_mov_to_flags);
> OPT(remove_extra_rounding_modes);
> + OPT(opt_bool_bit_size);
It could be useful to have a short comment here about the importance
of calling opt_bool_bit_size at this point.
Thanks,
Caio
More information about the mesa-dev
mailing list