[Mesa-dev] i965 implementation of the ARB_shader_image_load_store built-ins. (v3)

Tue May 19 05:42:13 PDT 2015

Jason Ekstrand <jason at jlekstrand.net> writes:

> On Mon, May 18, 2015 at 10:34 AM, Francisco Jerez <currojerez at riseup.net> wrote:
>>[...]
>> I've given this idea a shot.  Can you have a look at the
>> image-load-store-lower branch of my tree [1]?  It's just a quick and
>> dirty proof of concept, so don't bother to review it carefully, just let
>> me know if you agree with the general design before I spend more time on
>> it.
>>
>> [1] http://cgit.freedesktop.org/~currojerez/mesa/log/?h=image-load-store-lower
>
> I took a look at it.  I think patch 3 "Add pass to lower opcodes with
> unsupported SIMD width." is more-or-less exactly what I'm talking
> about.  What I don't understand is the stuff about split payloads.
> While I think we *might* be able to split a payload it seems dangerous
> and like something we shouldn't be doing.

Dangerous how?  Can you elaborate?

> This is where the "logical" opcodes I mentioned come into play.  I
> think there has been some miscommunication there; perhaps I didn't
> explain myself very well.  Allow me to be more explicit; I'll use
> image loads for my example.
>
>  1) We would add an opcode SHADER_IMAGE_LOAD_LOGICAL (or some other
> name) that takes 4 arguments: image, address, format, and dims just
> like the emit_image_load helper.
>  2) Instead of calling the helper, the visitor would just emit a
> SHADER_IMAGE_LOAD_LOGICAL instruction with those arguments.
>  3) We then run the splitting pass which can easily split the new load
> instruction since no payloads are involved.
>  4) We then have a lowering pass which knows how to turn
> SHADER_IMAGE_LOAD_LOGICAL into an actual load including the payload,
> pixel mask, and whatever other fiddly bits there are.
>
> Steps (1) and (2) may not be quite right (you'll have to help me out
> here).  We may want to keep emit_image_load so that it can do format
> conversion and emit an untyped logical instruction.  However, in any
> case, the logical instruction does not have any payload sources if we
> can at all help it.
>
> Does that make more sense?  Is there something I'm missing?

I don't think that a high-level "image load" opcode would be of much use
in the back-end IR.  The hardware can only do a number of untyped and
typed surface operations, and we probably want to represent them as such.

My _SPLIT opcodes are roughly the same as the _LOGICAL opcodes you
describe -- as far as the visitor and optimization passes are concerned,
both behave as a normal opcode taking an address, surface, dimensions
and size as separate arguments.  The main difference is that the
lowering to a send-message-style opcode (your step 4) is fully
deterministic, because the layout of the message payload is inferred
from the source_is_payload(i) and regs_read(i) instruction queries.
This has two obvious advantages:

1/ The same lowering logic can be reused for *all* send-message opcodes
   making use of this infrastructure, so there is no need to implement
   ad-hoc lowering logic for each message, which seemed like the
   greatest annoyance of your proposal.

2/ It could ease the transition to Gen9 split send messages, as we
   could just change the one lowering pass to emit instructions with
   two partially assembled payload sources and let the hardware do the
   rest, in a way transparent to the visitor code making use of this
   infrastructure.
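
To illustrate point 1/, here is a minimal sketch of what such a generic
lowering could look like.  It is a toy model, not the code in the
branch: toy_inst, toy_copy and lower_payload are made-up names, and the
two member functions only mimic the source_is_payload(i) and
regs_read(i) queries mentioned above.  It assumes payload sources are
packed back to back in source order, which may not match what the real
pass does.

```cpp
#include <cassert>
#include <vector>

// Toy stand-in for a back-end instruction; NOT the real i965 fs_inst.
struct toy_inst {
   std::vector<int> src;          // first register of each source
   std::vector<bool> in_payload;  // does source i go into the payload?
   std::vector<int> size;         // registers read from source i

   // Hypothetical equivalents of the two queries mentioned above.
   bool source_is_payload(unsigned i) const { return in_payload[i]; }
   int regs_read(unsigned i) const { return size[i]; }
};

// One register-sized copy into the message payload.
struct toy_copy {
   int dst;
   int src;
};

// Generic lowering: pack every payload source back to back starting at
// register 'base', in source order.  Returns the copies to emit and
// stores the resulting message length (in registers) in 'mlen'.
static std::vector<toy_copy>
lower_payload(const toy_inst &inst, int base, int &mlen)
{
   assert(inst.src.size() == inst.in_payload.size() &&
          inst.src.size() == inst.size.size());

   std::vector<toy_copy> copies;
   int offset = 0;

   for (unsigned i = 0; i < inst.src.size(); i++) {
      if (!inst.source_is_payload(i))
         continue;  // e.g. a surface index: not part of the payload

      for (int j = 0; j < inst.regs_read(i); j++)
         copies.push_back({base + offset + j, inst.src[i] + j});

      offset += inst.regs_read(i);
   }

   mlen = offset;
   return copies;
}
```

Nothing in lower_payload depends on which message is being sent, so in
this model the same function would serve every opcode that describes
its sources through the two queries.  For an instruction whose first
and third sources are payload (2 and 4 registers respectively) it
yields six copies and a message length of 6.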

By doing this I can also easily avoid defining the array_reg stuff
others seemed to disagree with for some reason, although personally I
consider this more an obfuscation than an advantage (sigh).

> --Jason

Thanks.