[Mesa-dev] [RFC] i965: Factor out descriptor building for indirect send messages

Francisco Jerez currojerez at riseup.net
Thu Mar 12 06:13:48 PDT 2015


"Pohjolainen, Topi" <topi.pohjolainen at intel.com> writes:

> On Tue, Mar 10, 2015 at 11:07:26PM +0200, Francisco Jerez wrote:
>> "Pohjolainen, Topi" <topi.pohjolainen at intel.com> writes:
>> 
>> > On Mon, Mar 09, 2015 at 12:43:08PM +0200, Francisco Jerez wrote:
>> >> "Pohjolainen, Topi" <topi.pohjolainen at intel.com> writes:
>> >> 
>> >> > On Sat, Mar 07, 2015 at 04:15:08PM +0200, Francisco Jerez wrote:
>> >> >> Topi Pohjolainen <topi.pohjolainen at intel.com> writes:
>> >> >> 
>> >> >> > The original patch from Curro was based on something that is not
>> >> >> > present in the master yet. This patch tries to mimick the logic on
>> >> >> > top master.
>> >> >> > I wanted to see if could separate the building of the descriptor
>> >> >> > instruction from building of the send instruction. This logic now
>> >> >> > allows the caller to construct any kind of sequence of instructions
>> >> >> > filling in the descriptor before giving it to the send instruction
>> >> >> > builder.
>> >> >> >
>> >> >> > This is only compile tested. Curro, how would you feel about this
>> >> >> > sort of approach? I apologise for muddying the waters again but I
>> >> >> > wasn't entirely comfortable with the logic and wanted to try to
>> >> >> > something else.
>> >> >> >
>> >> >> > I believe patch number five should go nicely on top of this as
>> >> >> > the descriptor instruction could be followed by (or preceeeded by)
>> >> >> > any additional instructions modifying the descriptor register
>> >> >> > before the actual send instruction.
>> >> >> >
>> >> >> 
>> >> >> Topi, the goal I had in mind with PATCH 01 was to refactor a commonly
>> >> >> recurring pattern.  In terms of the functions defined in this patch my
>> >> >> example from yesterday [1] would now look like:
>> >> >> 
>> >> >> |   if (index.file == BRW_IMMEDIATE_VALUE) {
>> >> >> |
>> >> >> |      uint32_t surf_index = index.dw1.ud;
>> >> >> |
>> >> >> |      brw_inst *send = brw_next_insn(p, BRW_OPCODE_SEND);
>> >> >> |      brw_set_dest(p, send, retype(dst, BRW_REGISTER_TYPE_UW));
>> >> >> |      brw_set_src0(p, send, offset);
>> >> >> |      brw_set_sampler_message(p, send,
>> >> >> |                              surf_index,
>> >> >> |                              0, /* LD message ignores sampler unit */
>> >> >> |                              GEN5_SAMPLER_MESSAGE_SAMPLE_LD,
>> >> >> |                              rlen,
>> >> >> |                              mlen,
>> >> >> |                              false, /* no header */
>> >> >> |                              simd_mode,
>> >> >> |                              0);
>> >> >> |
>> >> >> |      brw_mark_surface_used(prog_data, surf_index);
>> >> >> |
>> >> >> |   } else {
>> >> >> |
>> >> >> |      struct brw_reg addr = vec1(retype(brw_address_reg(0), BRW_REGISTER_TYPE_UD));
>> >> >> |
>> >> >> |      brw_push_insn_state(p);
>> >> >> |      brw_set_default_mask_control(p, BRW_MASK_DISABLE);
>> >> >> |      brw_set_default_access_mode(p, BRW_ALIGN_1);
>> >> >> |
>> >> >> |      /* a0.0 = surf_index & 0xff */
>> >> >> |      brw_inst *insn_and = brw_next_insn(p, BRW_OPCODE_AND);
>> >> >> |      brw_inst_set_exec_size(p->brw, insn_and, BRW_EXECUTE_1);
>> >> >> |      brw_set_dest(p, insn_and, addr);
>> >> >> |      brw_set_src0(p, insn_and, vec1(retype(index, BRW_REGISTER_TYPE_UD)));
>> >> >> |      brw_set_src1(p, insn_and, brw_imm_ud(0x0ff));
>> >> >> |
>> >> >> |
>> >> >> |      /* a0.0 |= <descriptor> */
>> >> >> |      brw_inst *descr_inst = brw_build_indirect_message_descr(p, addr, addr);
>> >> >> |      brw_set_sampler_message(p, descr_inst,
>> >> >> |                              0 /* surface */,
>> >> >> |                              0 /* sampler */,
>> >> >> |                              GEN5_SAMPLER_MESSAGE_SAMPLE_LD,
>> >> >> |                              rlen /* rlen */,
>> >> >> |                              mlen /* mlen */,
>> >> >> |                              false /* header */,
>> >> >> |                              simd_mode,
>> >> >> |                              0);
>> >> >> |
>> >> >> |      /* dst = send(offset, a0.0) */
>> >> >> |      brw_send_indirect_message(p, BRW_SFID_SAMPLER, dst, offset, addr);
>> >> >> |
>> >> >> |      brw_pop_insn_state(p);
>> >> >> |
>> >> >> |      /* visitor knows more than we do about the surface limit required,
>> >> >> |       * so has already done marking.
>> >> >> |       */
>> >> >> |   }
>> >> >
>> >> > Which I think could also be written as follows. Or am I missing something
>> >> > again?
>> >> >
>> >> > static brw_inst *
>> >> > brw_build_surface_index_descr(struct brw_compile *p,
>> >> >                               struct brw_reg dst, index)
>> >> > {
>> >> >    brw_set_default_mask_control(p, BRW_MASK_DISABLE);
>> >> >    brw_set_default_access_mode(p, BRW_ALIGN_1);
>> >> >
>> >> >    /* a0.0 = surf_index & 0xff */
>> >> >    brw_inst *insn_and = brw_next_insn(p, BRW_OPCODE_AND);
>> >> >    brw_inst_set_exec_size(p->brw, insn_and, BRW_EXECUTE_1);
>> >> >    brw_set_dest(p, insn_and, addr);
>> >> >    brw_set_src0(p, insn_and, vec1(retype(index, BRW_REGISTER_TYPE_UD)));
>> >> >    brw_set_src1(p, insn_and, brw_imm_ud(0x0ff));
>> >> >
>> >> >    /* a0.0 |= <descriptor> */
>> >> >    brw_inst *descr_inst = brw_build_indirect_message_descr(p, addr, addr);
>> >> > }
>> >> >
>> >> > ...
>> >> >    brw_inst *descr_inst;
>> >> >    if (index.file == BRW_IMMEDIATE_VALUE) {
>> >> >       descr = brw_next_insn(p, BRW_OPCODE_SEND);
>> >> >       brw_set_dest(p, send, retype(dst, BRW_REGISTER_TYPE_UW));
>> >> >       brw_set_src0(p, send, offset);
>> >> >
>> >> >       brw_mark_surface_used(prog_data, surf_index);
>> >> >    } else {
>> >> >       struct brw_reg addr = vec1(retype(brw_address_reg(0),
>> >> >                                  BRW_REGISTER_TYPE_UD));
>> >> >       brw_push_insn_state(p);
>> >> >
>> >> >       brw_build_surface_index_descr(p, addr, index);
>> >> >       /* dst = send(offset, a0.0) */
>> >> >       descr_inst = brw_send_indirect_message(p, BRW_SFID_SAMPLER,
>> >> >                                              dst, offset, addr);
>> >> >       brw_pop_insn_state(p);
>> >> >    }
>> >> >
>> >> >    uint32_t surf_index = index.file == BRW_IMMEDIATE_VALUE ? index.dw1.ud : 0;
>> >> >    brw_set_sampler_message(p, descr_inst,
>> >> >                            surf_index,
>> >> >                            0, /* LD message ignores sampler unit */
>> >> >                            GEN5_SAMPLER_MESSAGE_SAMPLE_LD,
>> >> >                            rlen,
>> >> >                            mlen,
>> >> >                            false, /* no header */
>> >> >                            simd_mode,
>> >> >                            0);
>> >> > ...
>> >> 
>> >> Yeah, something like that.  In any case you need logic to handle both
>> >> the immediate and indirect cases explicitly by every caller, which seems
>> >> like plenty of duplication to me, not sure what you are trying to
>> >> achieve with this.
>> >
>> > Well, to me it seems that cases where address is an immediate and 
>> > non-immediate already have quite fixed paths. I'm uneasy about changing the
>> > final kick-off of the message sending to have two alternatives for these
>> > - in each path only one final branch is really possible but this isn't that
>> > obvious to the reader anymore as one needs to go and analyze the callers:
>> >
>> 
>> No, that's not right, *all* users of brw_send_indirect_message() or
>> brw_send_indirect_surface_message() will eventually have to handle
>> *both* immediate and indirect descriptor cases, and the necessary logic
>> is identical for *all* of them.
>
> Some rational in the commit message would have helped me to understand the
> ultimate goal, same as with generate_tex().
> I'll continue with patch number five again. There are quite few things moving
> around making it difficult to double check.
>
OK, I'll write a short explanation in the commit message.

> Also, this series is not based on master but something else. What is your
> plan on landing this?
>
Right, it was based on my constant cache series, I've just rebased it on
top of master.

>> 
>> > Patch 1:
>> >   - introduces brw_send_indirect_message() considering if address is
>> >     immediate or not.
>> >   - called by generate_tex() against address that is never an immediate
>> 
>> This is only temporary, I have a (not yet sent for review) patch that
>> gets rid of explicit handling of immediate vs. indirect descriptors in
>> the texturing and pull constant code completely.
>> 
>> >   - called by generate_uniform_pull_constant_load_gen7() in a branch which
>> >     explicitly uses an immediate and another which explicitly uses
>> >     non-immediate
>> >
>> I don't think that any immediate-only branch of
>> generate_uniform_pull_constant_load_gen7() is calling
>> brw_send_indirect_message() at this point.
>> 
>> > Patch 5:
>> >   - introduces brw_send_indirect_surface_message() considering if address is
>> >     immediate or not.
>> >   - called by brw_untyped_atomic() and brw_untyped_surface_read() which
>> >     both are only called against immediates
>> >
>> > Patch 11:
>> >   - introduces brw_untyped_surface_write() which is only called against
>> >     immediate
>> >
>> > Patch 12:
>> >   - introduces brw_typed_atomic() which is only called against immediate
>> >   - introduces brw_typed_surface_read() which is only called against immediate
>> >   - introduces brw_typed_surface_write() which is only called against immediate
>> >
>> Typed and untyped surface opcodes are *all* called with both immediate
>> and indirect surfaces.
>> 
>> >
>> >
>> > Then, in general I was hoping to have design pattern of this sort. Caller:
>> >
>> >    create_descriptor();
>> >    add_property_1_to_descriptor();
>> >    add_property_2_to_descriptor();
>> >    ...
>> >    send()
>> >
>> 
>> I guess I find this acceptable usually, but sticking to this model in
>> this case prevents abstraction, because the shared logic is only in the
>> first and last statements of your example.
>
> The property additions are shared as well, property_1 being the brw_OR()
> in brw_send_indirect_message().
>
>> 
>> >
>> > Instead of:
>> >
>> >   create_descriptor_and_send()
>> >   {
>> >      create();
>> >      send();
>> >   }
>> >
>> >   create_descriptor_with_1_and_send()
>> >   {
>> >      create_descriptor_and_send();
>> >      add_property_1();
>> >   }
>> >
>> 
>> If create_descriptor_with_1_and_send() is pseudocode for
>> brw_send_indirect_surface_message(), the only reason why it's there is
>> to emit the AND instruction that masks out invalid bits from the surface
>> index, which could otherwise lead to hangs.  I'm not planning on adding
>> more nesting levels.
>
> By that I meant brw_send_indirect_message() and
> brw_send_indirect_surface_message() being create_descriptor_with_1_and_send().
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 212 bytes
Desc: not available
URL: <http://lists.freedesktop.org/archives/mesa-dev/attachments/20150312/e0ed9d01/attachment-0001.sig>


More information about the mesa-dev mailing list