[Mesa-dev] [PATCH 01/13] i965: Factor out logic to build a send message instruction with indirect descriptor.

Francisco Jerez currojerez at riseup.net
Fri Mar 6 03:57:35 PST 2015


"Pohjolainen, Topi" <topi.pohjolainen at intel.com> writes:

> On Fri, Mar 06, 2015 at 10:37:06AM +0200, Pohjolainen, Topi wrote:
>> On Fri, Feb 27, 2015 at 05:34:44PM +0200, Francisco Jerez wrote:
>> >[..]
>> > +/**
>> > + * Send message to shared unit \p sfid with a possibly indirect descriptor \p
>> > + * desc.  If the descriptor is not an immediate it will be transparently
>> > + * loaded to an address register using an OR instruction that will be returned
>> > + * to the caller so additional descriptor bits can be specified with the usual
>> > + * brw_set_*_message() helper functions.
>> > + */
>
> Right, you exploit this in patch number five.

I exploit it in the generator code for all typed and untyped surface
opcodes, and was considering to take more advantage from it in the
texturing and pull constant load opcode -- it will likely reduce the
amount of code required to emit such messages to less than one third.

> I think at least this comment is misleading as it doesn't say anything
> about the returned instruction in case the given descriptor is an
> immediate.
>
I've edited the comment locally, hopefully it's more obvious now:

| /**
|  * Send message to shared unit \p sfid with a possibly indirect descriptor \p
|  * desc.  If \p desc is not an immediate it will be transparently loaded to an
|  * address register using an OR instruction.  The returned instruction can be
|  * passed as argument to the usual brw_set_*_message() functions in order to
|  * specify any additional descriptor bits -- If \p desc is an immediate this
|  * will be the SEND instruction itself, otherwise it will be the OR
|  * instruction.
|  */

> All in all I'm not too happy about the return value having such
> differing semantics depending on the given descriptor type.
>

The point is that all the semantics that callers of this function care
about is being able to set descriptor control bits on the returned
instruction, they don't care about the actual SEND instruction itself.
Consider fs_generator::generate_varying_pull_constant_load_gen7:

|   if (index.file == BRW_IMMEDIATE_VALUE) {
|
|      uint32_t surf_index = index.dw1.ud;
|
|      brw_inst *send = brw_next_insn(p, BRW_OPCODE_SEND);
|      brw_set_dest(p, send, retype(dst, BRW_REGISTER_TYPE_UW));
|      brw_set_src0(p, send, offset);
|      brw_set_sampler_message(p, send,
|                              surf_index,
|                              0, /* LD message ignores sampler unit */
|                              GEN5_SAMPLER_MESSAGE_SAMPLE_LD,
|                              rlen,
|                              mlen,
|                              false, /* no header */
|                              simd_mode,
|                              0);
|
|      brw_mark_surface_used(prog_data, surf_index);
|
|   } else {
|
|      struct brw_reg addr = vec1(retype(brw_address_reg(0), BRW_REGISTER_TYPE_UD));
|
|      brw_push_insn_state(p);
|      brw_set_default_mask_control(p, BRW_MASK_DISABLE);
|      brw_set_default_access_mode(p, BRW_ALIGN_1);
|
|      /* a0.0 = surf_index & 0xff */
|      brw_inst *insn_and = brw_next_insn(p, BRW_OPCODE_AND);
|      brw_inst_set_exec_size(p->brw, insn_and, BRW_EXECUTE_1);
|      brw_set_dest(p, insn_and, addr);
|      brw_set_src0(p, insn_and, vec1(retype(index, BRW_REGISTER_TYPE_UD)));
|      brw_set_src1(p, insn_and, brw_imm_ud(0x0ff));
|
|
|      /* a0.0 |= <descriptor> */
|      brw_inst *insn_or = brw_next_insn(p, BRW_OPCODE_OR);
|      brw_set_sampler_message(p, insn_or,
|                              0 /* surface */,
|                              0 /* sampler */,
|                              GEN5_SAMPLER_MESSAGE_SAMPLE_LD,
|                              rlen /* rlen */,
|                              mlen /* mlen */,
|                              false /* header */,
|                              simd_mode,
|                              0);
|      brw_inst_set_exec_size(p->brw, insn_or, BRW_EXECUTE_1);
|      brw_inst_set_src1_reg_type(p->brw, insn_or, BRW_REGISTER_TYPE_UD);
|      brw_set_src0(p, insn_or, addr);
|      brw_set_dest(p, insn_or, addr);
|
|
|      /* dst = send(offset, a0.0) */
|      brw_inst *insn_send = brw_next_insn(p, BRW_OPCODE_SEND);
|      brw_set_dest(p, insn_send, retype(dst, BRW_REGISTER_TYPE_UW));
|      brw_set_src0(p, insn_send, offset);
|      brw_set_indirect_send_descriptor(p, insn_send, BRW_SFID_SAMPLER, addr);
|
|      brw_pop_insn_state(p);
|
|      /* visitor knows more than we do about the surface limit required,
|       * so has already done marking.
|       */
|   }

This allows us to simplify the whole snippet into:

|   brw_inst *insn = brw_send_indirect_surface_message(
|      p, BRW_SFID_SAMPLER, dst, offset, index, mlen, rlen, false);
|   brw_set_sampler_message(p, insn,
|                           0 /* surface */,
|                           0 /* sampler */,
|                           GEN5_SAMPLER_MESSAGE_SAMPLE_LD,
|                           0 /* rlen */,
|                           0 /* mlen */,
|                           false /* header */,
|                           simd_mode,
|                           0);
|
|   if (index.file == BRW_IMMEDIATE_VALUE)
|      brw_mark_surface_used(prog_data, index.dw1.ud);

The saving in LOC is huge if you multiply by the number of message sends
with indirect descriptor (2 texturing messages, 3 pull constant
messages, 6 untyped and typed surface messages).

>> >[..]
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 212 bytes
Desc: not available
URL: <http://lists.freedesktop.org/archives/mesa-dev/attachments/20150306/97a51a93/attachment.sig>


More information about the mesa-dev mailing list