[Mesa-dev] [PATCH 5/5] nir: Add nir_lower_alu_scalar.
Eric Anholt
eric at anholt.net
Thu Jan 22 14:08:21 PST 2015
Jason Ekstrand <jason at jlekstrand.net> writes:
> Overall this looks correct. I've got a few nits below and I'd like to take
> a look at it with fresh eyes before giving an R-B as it's complicated
> especially with all of the stuff to handle non-ssa. Not sure if it's
> really worth doing non-ssa now that I see how much more complicated it
> makes things.
> --Jason
>
> On Wed, Jan 21, 2015 at 5:26 PM, Eric Anholt <eric at anholt.net> wrote:
>
>> This is the equivalent of brw_fs_channel_expressions.cpp, which I wanted
>> for vc4.
>> +static void
>> +reduce_op_replace(nir_alu_instr *instr, nir_ssa_def *def, void *mem_ctx)
>> +{
>> + assert(instr->dest.write_mask == 1);
>> +
>> + if (instr->dest.dest.is_ssa) {
>> + nir_src new_src;
>> + new_src.is_ssa = true;
>> + new_src.ssa = def;
>> + nir_ssa_def_rewrite_uses(&instr->dest.dest.ssa, new_src, mem_ctx);
>> + } else {
>> + nir_alu_instr *mov = nir_alu_instr_create(mem_ctx, nir_op_imov);
>> + mov->src[0].src.is_ssa = true;
>> + mov->src[0].src.ssa = def;
>> +
>> + nir_alu_dest_copy(&mov->dest, &instr->dest, mem_ctx);
>> + nir_instr_insert_after(&instr->instr, &mov->instr);
>
>
> I usually do insert_before when possible. It will result in the same list
> (since we remove the instruction we're putting this before) but, since this
> happens as we're iterating over it, insert_before is a bit safer. Maybe
> this is still safe; I'm not sure.
Done.
>> + }
>> +
>> + nir_instr_remove(&instr->instr);
>> +}
>> +
>> +static void
>> +lower_reduction(nir_alu_instr *instr, nir_op chan_op, nir_op merge_op,
>> + void *mem_ctx)
>> +{
>> + unsigned num_components = nir_op_infos[instr->op].input_sizes[0];
>> +
>> + nir_ssa_def *last = NULL;
>> + for (unsigned i = 0; i < num_components; i++) {
>> + nir_alu_instr *chan = nir_alu_instr_create(mem_ctx, chan_op);
>> + nir_alu_ssa_dest_init(chan, 1);
>> + nir_alu_src_copy(&chan->src[0], &instr->src[0], mem_ctx);
>> + chan->src[0].swizzle[0] = chan->src[0].swizzle[i];
>> + if (nir_op_infos[chan_op].num_inputs > 1) {
>>
>
> assert num_inputs == 2 here?
Done.
>> + nir_alu_src_copy(&chan->src[1], &instr->src[1], mem_ctx);
>> + chan->src[1].swizzle[0] = chan->src[1].swizzle[i];
>> + }
>> +
>> + nir_instr_insert_before(&instr->instr, &chan->instr);
>> +
>> + if (i == 0) {
>> + last = &chan->dest.dest.ssa;
>> + } else {
>> + nir_alu_instr *merge = nir_alu_instr_create(mem_ctx, merge_op);
>> + nir_alu_ssa_dest_init(merge, 1);
>> + merge->dest.write_mask = 1;
>> + merge->src[0].src.is_ssa = true;
>> + merge->src[0].src.ssa = last;
>> + merge->src[1].src.is_ssa = true;
>> + merge->src[1].src.ssa = &chan->dest.dest.ssa;
>> + nir_instr_insert_before(&instr->instr, &merge->instr);
>> + last = &merge->dest.dest.ssa;
>>
>
> It might be nice if the tree were better balanced but that looks like way
> too much work, so meh.
Yeah, ew. We'll probably want the general rebalancer to be ported to
NIR, anyway.
>> + if (num_components < 2)
>> + unsupported(&instr->instr);
>> + vec_instr = nir_alu_instr_create(mem_ctx, nir_op_map[num_components
>> - 2]);
>> + nir_alu_ssa_dest_init(vec_instr, num_components);
>> + }
>> +
>> + /* Walk from the end of the channels, so our incremental inserts after
>> the
>> + * original instruction end up in a sensible xyzw order.
>> + */
>> + for (chan = 0; chan < 4; chan++) {
>> + if (!(lower_chans & (1 << chan))) {
>> + if (instr->dest.dest.is_ssa &&
>> + chan < instr->dest.dest.ssa.num_components) {
>> + unsupported(&instr->instr);
>> + }
>> + continue;
>> + }
>> +
>> + nir_alu_instr *lower = nir_alu_instr_create(mem_ctx, instr->op);
>> + for (i = 0; i < num_src; i++) {
>> + /* bcsel and fcsel reuse the same src channel in src[0]. */
>>
>
> Stale comment. This isn't the case for bcsel or fcsel anymore
Fixed.
>> + unsigned src_chan = (nir_op_infos[instr->op].input_sizes[i] == 1
>> ?
>> + 0 : chan);
>>
>
> I think you want input_sizes[i] != 0 here. We could have an instruction
> that takes a vec2 as one component but is otherwise vectorized.
I don't even know what this path could reasonably do with a 2-component
src with a >2 component dest, though. Leave the original 2-component
swizzle in place? This path removes all other vector inputs from ALU
ops, so it seems like a funny theoretical operation to be trying to
handle here.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 818 bytes
Desc: not available
URL: <http://lists.freedesktop.org/archives/mesa-dev/attachments/20150122/16480756/attachment.sig>
More information about the mesa-dev
mailing list