[Mesa-dev] [RFC] i965: Factor out descriptor building for indirect send messages
Pohjolainen, Topi
topi.pohjolainen at intel.com
Tue Mar 10 13:30:30 PDT 2015
On Mon, Mar 09, 2015 at 12:43:08PM +0200, Francisco Jerez wrote:
> "Pohjolainen, Topi" <topi.pohjolainen at intel.com> writes:
>
> > On Sat, Mar 07, 2015 at 04:15:08PM +0200, Francisco Jerez wrote:
> >> Topi Pohjolainen <topi.pohjolainen at intel.com> writes:
> >>
> >> > The original patch from Curro was based on something that is not
> >> > present in the master yet. This patch tries to mimick the logic on
> >> > top master.
> >> > I wanted to see if could separate the building of the descriptor
> >> > instruction from building of the send instruction. This logic now
> >> > allows the caller to construct any kind of sequence of instructions
> >> > filling in the descriptor before giving it to the send instruction
> >> > builder.
> >> >
> >> > This is only compile tested. Curro, how would you feel about this
> >> > sort of approach? I apologise for muddying the waters again but I
> >> > wasn't entirely comfortable with the logic and wanted to try to
> >> > something else.
> >> >
> >> > I believe patch number five should go nicely on top of this as
> >> > the descriptor instruction could be followed by (or preceeeded by)
> >> > any additional instructions modifying the descriptor register
> >> > before the actual send instruction.
> >> >
> >>
> >> Topi, the goal I had in mind with PATCH 01 was to refactor a commonly
> >> recurring pattern. In terms of the functions defined in this patch my
> >> example from yesterday [1] would now look like:
> >>
> >> | if (index.file == BRW_IMMEDIATE_VALUE) {
> >> |
> >> | uint32_t surf_index = index.dw1.ud;
> >> |
> >> | brw_inst *send = brw_next_insn(p, BRW_OPCODE_SEND);
> >> | brw_set_dest(p, send, retype(dst, BRW_REGISTER_TYPE_UW));
> >> | brw_set_src0(p, send, offset);
> >> | brw_set_sampler_message(p, send,
> >> | surf_index,
> >> | 0, /* LD message ignores sampler unit */
> >> | GEN5_SAMPLER_MESSAGE_SAMPLE_LD,
> >> | rlen,
> >> | mlen,
> >> | false, /* no header */
> >> | simd_mode,
> >> | 0);
> >> |
> >> | brw_mark_surface_used(prog_data, surf_index);
> >> |
> >> | } else {
> >> |
> >> | struct brw_reg addr = vec1(retype(brw_address_reg(0), BRW_REGISTER_TYPE_UD));
> >> |
> >> | brw_push_insn_state(p);
> >> | brw_set_default_mask_control(p, BRW_MASK_DISABLE);
> >> | brw_set_default_access_mode(p, BRW_ALIGN_1);
> >> |
> >> | /* a0.0 = surf_index & 0xff */
> >> | brw_inst *insn_and = brw_next_insn(p, BRW_OPCODE_AND);
> >> | brw_inst_set_exec_size(p->brw, insn_and, BRW_EXECUTE_1);
> >> | brw_set_dest(p, insn_and, addr);
> >> | brw_set_src0(p, insn_and, vec1(retype(index, BRW_REGISTER_TYPE_UD)));
> >> | brw_set_src1(p, insn_and, brw_imm_ud(0x0ff));
> >> |
> >> |
> >> | /* a0.0 |= <descriptor> */
> >> | brw_inst *descr_inst = brw_build_indirect_message_descr(p, addr, addr);
> >> | brw_set_sampler_message(p, descr_inst,
> >> | 0 /* surface */,
> >> | 0 /* sampler */,
> >> | GEN5_SAMPLER_MESSAGE_SAMPLE_LD,
> >> | rlen /* rlen */,
> >> | mlen /* mlen */,
> >> | false /* header */,
> >> | simd_mode,
> >> | 0);
> >> |
> >> | /* dst = send(offset, a0.0) */
> >> | brw_send_indirect_message(p, BRW_SFID_SAMPLER, dst, offset, addr);
> >> |
> >> | brw_pop_insn_state(p);
> >> |
> >> | /* visitor knows more than we do about the surface limit required,
> >> | * so has already done marking.
> >> | */
> >> | }
> >
> > Which I think could also be written as follows. Or am I missing something
> > again?
> >
> > static brw_inst *
> > brw_build_surface_index_descr(struct brw_compile *p,
> > struct brw_reg dst, index)
> > {
> > brw_set_default_mask_control(p, BRW_MASK_DISABLE);
> > brw_set_default_access_mode(p, BRW_ALIGN_1);
> >
> > /* a0.0 = surf_index & 0xff */
> > brw_inst *insn_and = brw_next_insn(p, BRW_OPCODE_AND);
> > brw_inst_set_exec_size(p->brw, insn_and, BRW_EXECUTE_1);
> > brw_set_dest(p, insn_and, addr);
> > brw_set_src0(p, insn_and, vec1(retype(index, BRW_REGISTER_TYPE_UD)));
> > brw_set_src1(p, insn_and, brw_imm_ud(0x0ff));
> >
> > /* a0.0 |= <descriptor> */
> > brw_inst *descr_inst = brw_build_indirect_message_descr(p, addr, addr);
> > }
> >
> > ...
> > brw_inst *descr_inst;
> > if (index.file == BRW_IMMEDIATE_VALUE) {
> > descr = brw_next_insn(p, BRW_OPCODE_SEND);
> > brw_set_dest(p, send, retype(dst, BRW_REGISTER_TYPE_UW));
> > brw_set_src0(p, send, offset);
> >
> > brw_mark_surface_used(prog_data, surf_index);
> > } else {
> > struct brw_reg addr = vec1(retype(brw_address_reg(0),
> > BRW_REGISTER_TYPE_UD));
> > brw_push_insn_state(p);
> >
> > brw_build_surface_index_descr(p, addr, index);
> > /* dst = send(offset, a0.0) */
> > descr_inst = brw_send_indirect_message(p, BRW_SFID_SAMPLER,
> > dst, offset, addr);
> > brw_pop_insn_state(p);
> > }
> >
> > uint32_t surf_index = index.file == BRW_IMMEDIATE_VALUE ? index.dw1.ud : 0;
> > brw_set_sampler_message(p, descr_inst,
> > surf_index,
> > 0, /* LD message ignores sampler unit */
> > GEN5_SAMPLER_MESSAGE_SAMPLE_LD,
> > rlen,
> > mlen,
> > false, /* no header */
> > simd_mode,
> > 0);
> > ...
>
> Yeah, something like that. In any case you need logic to handle both
> the immediate and indirect cases explicitly by every caller, which seems
> like plenty of duplication to me, not sure what you are trying to
> achieve with this.
Well, to me it seems that cases where address is an immediate and
non-immediate already have quite fixed paths. I'm uneasy about changing the
final kick-off of the message sending to have two alternatives for these
- in each path only one final branch is really possible but this isn't that
obvious to the reader anymore as one needs to go and analyze the callers:
Patch 1:
- introduces brw_send_indirect_message() considering if address is
immediate or not.
- called by generate_tex() against address that is never an immediate
- called by generate_uniform_pull_constant_load_gen7() in a branch which
explicitly uses an immediate and another which explicitly uses
non-immediate
Patch 5:
- introduces brw_send_indirect_surface_message() considering if address is
immediate or not.
- called by brw_untyped_atomic() and brw_untyped_surface_read() which
both are only called against immediates
Patch 11:
- introduces brw_untyped_surface_write() which is only called against
immediate
Patch 12:
- introduces brw_typed_atomic() which is only called against immediate
- introduces brw_typed_surface_read() which is only called against immediate
- introduces brw_typed_surface_write() which is only called against immediate
Then, in general I was hoping to have design pattern of this sort. Caller:
create_descriptor();
add_property_1_to_descriptor();
add_property_2_to_descriptor();
...
send()
Instead of:
create_descriptor_and_send()
{
create();
send();
}
create_descriptor_with_1_and_send()
{
create_descriptor_and_send();
add_property_1();
}
create_descriptor_with_1_and_2_and_send()
{
create_descriptor_with_1_and_send()
add_property_2();
}
create_descriptor_with_1_and_2_and_3_and_send()
{
create_descriptor_with_2_and_send()
add_property_3();
}
More information about the mesa-dev
mailing list