[Mesa-dev] [PATCH] nir: Add an ALU op builder kind of like ir_builder.h

Connor Abbott cwabbott0 at gmail.com
Fri Feb 6 17:58:19 PST 2015


Hi Eric,

Sorry for not responding to this in a while. I'm replying to this
comment instead of the new version because it has the relevant
discussion.

On Mon, Feb 2, 2015 at 7:30 PM, Eric Anholt <eric at anholt.net> wrote:
> Connor Abbott <cwabbott0 at gmail.com> writes:
>
>> I mentioned this on IRC, but it would be good to add the ability to
>> append the sequence of instructions before/after an instruction as
>> well as at the beginning or end of a basic block. We would need to
>> store the current basic block, the current instruction, and an enum
>> consisting of "insert before instruction," "insert after instruction,"
>> insert at beginning of block," and "insert at end of block."
>
> I'm skeptical of doing this without a caller.  It should be easy to add
> when someone wants it, right?

Here are some cases where we could use the ALU builder if we extend it:

- in nir_lower_locals_to_regs when we generate the indexing expressions
- in nir_opt_peephole_select when we create a series of select instructions
- in nir_lower_atomics (more indexing expressions)
- in nir_lower_io (more indexing expressions)

In all these places, we have to generate a series of ALU instructions
after before some instruction or at the beginning of a basic block. I
found them just by grepping for nir_alu_instr_create. I expect there
will be more places like these in the future. Of course, this can be a
later cleanup series, so we don't have to add this functionality right
away.

>
>> Also, why isn't there a function for initializing the builder
>> structure?
>
> I dunno, same reason we don't have functions for initializing other
> structures like alu src ops?  Nobody wrote one that was simpler than
> just initializing the fields?

Eww... this isn't like those other cases, where the datastructures
(e.g. nir_alu_src) are directly exposed since the optimization passes
are manipulating them. The builder is an API, and we shouldn't be
exposing its internals to callers, especially if we want to extend it
like I mentioned above. And in this case, we can definitely do much
better than just initializing the fields anyways - we should be able
to specify the basic block or whatever to insert into and have it
figure out the nir_function_impl, nir_shader, and mem_ctx for you.

Anyways, we shouldn't be storing a control flow list in the
nir_builder struct, since all instructions are going to be inserted
into a basic block. The function for inserting an instruction after a
CF list is only an abstraction for callers like GLSL IR -> NIR that
prefer to think in terms of GLSL-style control flow instead of basic
blocks (for them, a CF list corresponds a series of statements in
GLSL) and inserting an instruction after a CF list just inserts it at
the end of the last basic block. So maybe in some cases we do want to
have the builder insert a series of instructions after a CF list, but
then the nir_builder constructor (say,
nir_builder_init_after_cf_list()) should internally translate that to
inserting at the end of the last basic block.

>
>>> +static inline nir_ssa_def *
>>> +nir_build_alu(struct nir_builder *build, nir_op op, nir_ssa_def *src0,
>>> +              nir_ssa_def *src1, nir_ssa_def *src2, nir_ssa_def *src3)
>>> +{
>>> +   nir_alu_instr *instr = nir_alu_instr_create(build->shader, op);
>>> +   if (!instr)
>>> +      return NULL;
>>> +
>>> +   instr->src[0].src = nir_src_for_ssa(src0);
>>> +   if (src1)
>>> +      instr->src[1].src = nir_src_for_ssa(src1);
>>> +   if (src2)
>>> +      instr->src[2].src = nir_src_for_ssa(src2);
>>> +   if (src3)
>>> +      instr->src[3].src = nir_src_for_ssa(src3);
>>> +
>>> +   unsigned num_components = nir_op_infos[op].output_size;
>>> +   if (num_components == 0)
>>> +      num_components = 4;
>>
>> We can and should do better here. We should look at the size
>> (num_components) of the per-component (input_size == 0) inputs and
>> make the output size the size of the largest input. Anything larger
>> would be pointless.
>
> This actually worked out, and it ended up being a little tiny codegen
> improvement.
>
>> - You don't need to use sorted() here, since unlike in nir_opcodes.h
>> you don't care about the order.
>> - You should use "for name, opcode in opcodes.iteritems(opcodes)" and
>> then use the opcode directly below instead of saying opcodes[name].
>
> I'm definitely retaining the sorted() -- code generation in hash table
> order makes things a disaster for trying to look at differences in code
> generation.  But I've adjusted to Ken's more pythonish style
> recommendation.

That's fine.


More information about the mesa-dev mailing list