[Mesa-dev] [PATCH 05/13] i965: Fix the untyped surface opcodes to deal with indirect surface access.
Kenneth Graunke
kenneth at whitecape.org
Sat Mar 7 01:06:03 PST 2015
On Friday, March 06, 2015 03:20:26 PM Pohjolainen, Topi wrote:
> On Fri, Mar 06, 2015 at 02:46:51PM +0200, Francisco Jerez wrote:
> > "Pohjolainen, Topi" <topi.pohjolainen at intel.com> writes:
> >
> > > On Fri, Mar 06, 2015 at 02:29:15PM +0200, Francisco Jerez wrote:
> > >> "Pohjolainen, Topi" <topi.pohjolainen at intel.com> writes:
> > >>
> > >> > On Fri, Feb 27, 2015 at 05:34:48PM +0200, Francisco Jerez wrote:
> > >> >> Change brw_untyped_atomic() and brw_untyped_surface_read() to take the
> > >> >> surface index as a register instead of a constant and to use
> > >> >> brw_send_indirect_message() to emit the indirect variant of send with
> > >> >> a dynamically calculated message descriptor. This will be required to
> > >> >> support variable indexing of image arrays for
> > >> >> ARB_shader_image_load_store.
> > >> >> ---
> > >> >> src/mesa/drivers/dri/i965/brw_eu.h | 10 +-
> > >> >> src/mesa/drivers/dri/i965/brw_eu_emit.c | 158 +++++++++++++----------
> > >> >> src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 4 +-
> > >> >> src/mesa/drivers/dri/i965/brw_vec4_generator.cpp | 4 +-
> > >> >> 4 files changed, 96 insertions(+), 80 deletions(-)
> > >> >>
> > >> >> diff --git a/src/mesa/drivers/dri/i965/brw_eu.h b/src/mesa/drivers/dri/i965/brw_eu.h
> > >> >> index 87a9f3f..9cc9123 100644
> > >> >> --- a/src/mesa/drivers/dri/i965/brw_eu.h
> > >> >> +++ b/src/mesa/drivers/dri/i965/brw_eu.h
> > >> >> @@ -398,18 +398,18 @@ void brw_CMP(struct brw_compile *p,
> > >> >>
> > >> >> void
> > >> >> brw_untyped_atomic(struct brw_compile *p,
> > >> >> - struct brw_reg dest,
> > >> >> + struct brw_reg dst,
> > >> >> struct brw_reg payload,
> > >> >> + struct brw_reg surface,
> > >> >> unsigned atomic_op,
> > >> >> - unsigned bind_table_index,
> > >> >> unsigned msg_length,
> > >> >> bool response_expected);
> > >> >>
> > >> >> void
> > >> >> brw_untyped_surface_read(struct brw_compile *p,
> > >> >> - struct brw_reg dest,
> > >> >> - struct brw_reg mrf,
> > >> >> - unsigned bind_table_index,
> > >> >> + struct brw_reg dst,
> > >> >> + struct brw_reg payload,
> > >> >> + struct brw_reg surface,
> > >> >> unsigned msg_length,
> > >> >> unsigned num_channels);
> > >> >>
> > >> >> diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c b/src/mesa/drivers/dri/i965/brw_eu_emit.c
> > >> >> index 0b655d4..34695bf 100644
> > >> >> --- a/src/mesa/drivers/dri/i965/brw_eu_emit.c
> > >> >> +++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c
> > >> >> @@ -2518,6 +2518,48 @@ brw_send_indirect_message(struct brw_compile *p,
> > >> >> return setup;
> > >> >> }
> > >> >>
> > >> >> +static struct brw_inst *
> > >> >> +brw_send_indirect_surface_message(struct brw_compile *p,
> > >> >> + unsigned sfid,
> > >> >> + struct brw_reg dst,
> > >> >> + struct brw_reg payload,
> > >> >> + struct brw_reg surface,
> > >> >> + unsigned message_len,
> > >> >> + unsigned response_len,
> > >> >> + bool header_present)
> > >> >> +{
> > >> >> + const struct brw_context *brw = p->brw;
> > >> >> + struct brw_inst *insn;
> > >> >> +
> > >> >> + if (surface.file != BRW_IMMEDIATE_VALUE) {
> > >> >> + struct brw_reg addr = retype(brw_address_reg(0), BRW_REGISTER_TYPE_UD);
> > >> >> +
> > >> >> + brw_push_insn_state(p);
> > >> >> + brw_set_default_access_mode(p, BRW_ALIGN_1);
> > >> >> + brw_set_default_mask_control(p, BRW_MASK_DISABLE);
> > >> >> + brw_set_default_predicate_control(p, BRW_PREDICATE_NONE);
> > >> >> +
> > >> >> + /* Mask out invalid bits from the surface index to avoid hangs e.g. when
> > >> >> + * some surface array is accessed out of bounds.
> > >> >> + */
> > >> >> + insn = brw_AND(p, addr,
> > >> >> + suboffset(vec1(retype(surface, BRW_REGISTER_TYPE_UD)),
> > >> >> + BRW_GET_SWZ(surface.dw1.bits.swizzle, 0)),
> > >> >> + brw_imm_ud(0xff));
> > >> >> +
> > >> >> + brw_pop_insn_state(p);
> > >> >> +
> > >> >> + surface = addr;
> > >> >> + }
> > >> >> +
> > >> >> + insn = brw_send_indirect_message(p, sfid, dst, payload, surface);
> > >> >> + brw_inst_set_mlen(brw, insn, message_len);
> > >> >> + brw_inst_set_rlen(brw, insn, response_len);
> > >> >> + brw_inst_set_header_present(brw, insn, header_present);
> > >> >
> > >> > I'll continue the discussion we started with patch number one here if you
> > >> > don't mind. What I find confusing is that in case 'surface' is not an
> > >> > immediate then these three calls modify the OR-instruction. Otherwise they
> > >> > modify the send. Or am I missing something?
> > >>
> > >> Yeah, that's the whole point of the OR instruction, indirect message
> > >> sends no longer have an immediate source so all these control bits have
> > >> to be specified somewhere else. The caller doesn't care whether the
> > >> returned instruction is a SEND or some other opcode as long as it has
> > >> room for the control fields.
> > >
> > > This I understand, what I miss is the effect of setting mlen/rlen/header
> > > to an OR-instruction.
> >
> > Those are fields of the message descriptor that is usually part of the
> > immedate field of the SEND instruction. For indirect message sends they
> > have to be loaded to an address register together with the remaining
> > descriptor control bits, which is what the OR instruction does.
>
> Right, thanks for taking time explaining it. I think your logic make sense
> then - I think it is justified to assume that the reader knows how the
> send-mechanism itself works (which I didn't before you explaining it). If
> you like you could add short explanation to brw_send_indirect_message()
> telling how it works. That would make it easier for others to understand the
> rational between returning the send instruction itself and the descriptor.
This actually really confused me when Chris first sent in this code.
The key realization that helped me understand it was: for normal SEND
instructions, mlen, rlen, and other descriptor bits are all stored in
bits 127:96 - the same bits used for a 32-bit immediate value.
So all of our normal functions to set descriptor bits work - when used
on an OR, they simply help construct the src1 immediate value which the
OR then puts in the address register.
Normal sends just interpret their own src1 immediate value as the
descriptor.
After that, it actually made a lot of sense.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 819 bytes
Desc: This is a digitally signed message part.
URL: <http://lists.freedesktop.org/archives/mesa-dev/attachments/20150307/348fc7e0/attachment.sig>
More information about the mesa-dev
mailing list