[Mesa-dev] [PATCH 5/5] nir: Add nir_lower_alu_scalar.

Eric Anholt eric at anholt.net
Thu Jan 22 14:08:21 PST 2015


Jason Ekstrand <jason at jlekstrand.net> writes:

> Overall this looks correct.  I've got a few nits below and I'd like to take
> a look at it with fresh eyes before giving an R-B as it's complicated
> especially with all of the stuff to handle non-ssa.  Not sure if it's
> really worth doing non-ssa now that I see how much more complicated it
> makes things.
> --Jason
>
> On Wed, Jan 21, 2015 at 5:26 PM, Eric Anholt <eric at anholt.net> wrote:
>
>> This is the equivalent of brw_fs_channel_expressions.cpp, which I wanted
>> for vc4.

>> +static void
>> +reduce_op_replace(nir_alu_instr *instr, nir_ssa_def *def, void *mem_ctx)
>> +{
>> +   assert(instr->dest.write_mask == 1);
>> +
>> +   if (instr->dest.dest.is_ssa) {
>> +      nir_src new_src;
>> +      new_src.is_ssa = true;
>> +      new_src.ssa = def;
>> +      nir_ssa_def_rewrite_uses(&instr->dest.dest.ssa, new_src, mem_ctx);
>> +   } else {
>> +      nir_alu_instr *mov = nir_alu_instr_create(mem_ctx, nir_op_imov);
>> +      mov->src[0].src.is_ssa = true;
>> +      mov->src[0].src.ssa = def;
>> +
>> +      nir_alu_dest_copy(&mov->dest, &instr->dest, mem_ctx);
>> +      nir_instr_insert_after(&instr->instr, &mov->instr);
>
>
> I usually do insert_before when possible.  It will result in the same list
> (since we remove the instruction we're putting this before) but, since this
> happens as we're iterating over it, insert_before is a bit safer.  Maybe
> this is still safe; I'm not sure.

Done.

>> +   }
>> +
>> +   nir_instr_remove(&instr->instr);
>> +}
>> +
>> +static void
>> +lower_reduction(nir_alu_instr *instr, nir_op chan_op, nir_op merge_op,
>> +                void *mem_ctx)
>> +{
>> +   unsigned num_components = nir_op_infos[instr->op].input_sizes[0];
>> +
>> +   nir_ssa_def *last = NULL;
>> +   for (unsigned i = 0; i < num_components; i++) {
>> +      nir_alu_instr *chan = nir_alu_instr_create(mem_ctx, chan_op);
>> +      nir_alu_ssa_dest_init(chan, 1);
>> +      nir_alu_src_copy(&chan->src[0], &instr->src[0], mem_ctx);
>> +      chan->src[0].swizzle[0] = chan->src[0].swizzle[i];
>> +      if (nir_op_infos[chan_op].num_inputs > 1) {
>>
>
> assert num_inputs == 2 here?

Done.

>> +         nir_alu_src_copy(&chan->src[1], &instr->src[1], mem_ctx);
>> +         chan->src[1].swizzle[0] = chan->src[1].swizzle[i];
>> +      }
>> +
>> +      nir_instr_insert_before(&instr->instr, &chan->instr);
>> +
>> +      if (i == 0) {
>> +         last = &chan->dest.dest.ssa;
>> +      } else {
>> +         nir_alu_instr *merge = nir_alu_instr_create(mem_ctx, merge_op);
>> +         nir_alu_ssa_dest_init(merge, 1);
>> +         merge->dest.write_mask = 1;
>> +         merge->src[0].src.is_ssa = true;
>> +         merge->src[0].src.ssa = last;
>> +         merge->src[1].src.is_ssa = true;
>> +         merge->src[1].src.ssa = &chan->dest.dest.ssa;
>> +         nir_instr_insert_before(&instr->instr, &merge->instr);
>> +         last = &merge->dest.dest.ssa;
>>
>
> It might be nice if the tree were better balanced but that looks like way
> too much work, so meh.

Yeah, ew.  We'll probably want the general rebalancer to be ported to
NIR, anyway.

>> +      if (num_components < 2)
>> +         unsupported(&instr->instr);
>> +      vec_instr = nir_alu_instr_create(mem_ctx, nir_op_map[num_components
>> - 2]);
>> +      nir_alu_ssa_dest_init(vec_instr, num_components);
>> +   }
>> +
>> +   /* Walk from the end of the channels, so our incremental inserts after
>> the
>> +    * original instruction end up in a sensible xyzw order.
>> +    */
>> +   for (chan = 0; chan < 4; chan++) {
>> +      if (!(lower_chans & (1 << chan))) {
>> +         if (instr->dest.dest.is_ssa &&
>> +             chan < instr->dest.dest.ssa.num_components) {
>> +            unsupported(&instr->instr);
>> +         }
>> +         continue;
>> +      }
>> +
>> +      nir_alu_instr *lower = nir_alu_instr_create(mem_ctx, instr->op);
>> +      for (i = 0; i < num_src; i++) {
>> +         /* bcsel and fcsel reuse the same src channel in src[0]. */
>>
>
> Stale comment.  This isn't the case for bcsel or fcsel anymore

Fixed.

>> +         unsigned src_chan = (nir_op_infos[instr->op].input_sizes[i] == 1
>> ?
>> +                              0 : chan);
>>
>
> I think you want input_sizes[i] != 0 here.  We could have an instruction
> that takes a vec2 as one component but is otherwise vectorized.

I don't even know what this path could reasonably do with a 2-component
src with a >2 component dest, though.  Leave the original 2-component
swizzle in place?  This path removes all other vector inputs from ALU
ops, so it seems like a funny theoretical operation to be trying to
handle here.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 818 bytes
Desc: not available
URL: <http://lists.freedesktop.org/archives/mesa-dev/attachments/20150122/16480756/attachment.sig>


More information about the mesa-dev mailing list