[Mesa-dev] i965 implementation of the ARB_shader_image_load_store built-ins. (v3)

Tue May 19 10:06:12 PDT 2015

On Tue, May 19, 2015 at 9:25 AM, Francisco Jerez <currojerez at riseup.net> wrote:
> Jason Ekstrand <jason at jlekstrand.net> writes:
>
>> On Tue, May 19, 2015 at 5:42 AM, Francisco Jerez <currojerez at riseup.net> wrote:
>>> Jason Ekstrand <jason at jlekstrand.net> writes:
>>>
>>>> On Mon, May 18, 2015 at 10:34 AM, Francisco Jerez <currojerez at riseup.net> wrote:
>>>>>[...]
>>>>> I've given this idea a shot.  Can you have a look at the
>>>>> image-load-store-lower branch of my tree [1]?  It's just a quick and
>>>>> dirty proof of concept, so don't bother to review it carefully, just let
>>>>> me know if you agree with the general design before I spend more time on
>>>>> it.
>>>>>
>>>>> [1] http://cgit.freedesktop.org/~currojerez/mesa/log/?h=image-load-store-lower
>>>>
>>>> I took a look at it.  I think patch 3 "Add pass to lower opcodes with
>>>> unsupported SIMD width." is more-or-less exactly what I'm talking
>>>> about.  What I don't understand is the stuff about split payloads.
>>>> While I think we *might* be able to split a payload it seems dangerous
>>>> and like something we shouldn't be doing.
>>>
>>> Dangerous how?  Can you elaborate?
>>
>> It is not always the case that leaving the header alone and splitting
>> the other sources gives you the payload you want for SIMD8.
>> More in a moment.
>>
>>>> This is where the "logical" opcodes I mentioned come into play.  I
>>>> think there has been some miscommunication there; perhaps I didn't
>>>> explain myself very well.  Allow me to be more explicit; I'll use
>>>> image loads for my example.
>>>>
>>>>  1) We would add an opcode SHADER_IMAGE_LOAD_LOGICAL (or some other
>>>> name) that takes 4 arguments: image, address, format, and dims just
>>>> like the emit_image_load helper.
>>>>  2) Instead of calling the helper, the visitor would just emit a
>>>> SHADER_IMAGE_LOAD_LOGICAL instruction with those arguments.
>>>>  3) We then run the splitting pass which can easily split the new load
>>>> instruction since no payloads are involved.
>>>>  4) We then have a lowering pass which knows how to turn
>>>> SHADER_IMAGE_LOAD_LOGICAL into an actual load including the payload,
>>>> pixel mask, and whatever other fiddly bits there are.
>>>>
>>>> Steps (1) and (2) may not be quite right (you'll have to help me out
>>>> here).  We may want to keep emit_image_load so that it can do format
>>>> conversion and emit an untyped logical instruction.  However, in any
>>>> case, the logical instruction does not have any payload sources if we
>>>> can at all help it.
>>>>
>>>> Does that make more sense?  Is there something I'm missing?
>>>
>>> I don't think that a high-level "image load" opcode would be of much use
>>> in the back-end IR; the hardware can only do a number of untyped and typed
>>> surface operations, and we probably want to represent them as such.
>>>
>>> My _SPLIT opcodes are roughly the same as the _LOGICAL opcodes you
>>> describe -- as far as the visitor and optimization passes are concerned,
>>> they both behave as a normal opcode taking an address, surface,
>>> dimensions and size as separate arguments.  The main difference is that
>>> the lowering to a send-message-style opcode (your step 4) is fully
>>> deterministic, as the layout of the message payload is inferred from the
>>> source_is_payload(i) and regs_read(i) instruction queries.  This has two
>>> obvious advantages:
>>>
>>> 1/ The same lowering logic can be reused for *all* send-message opcodes
>>>    making use of this infrastructure, so there is no need to implement
>>>    ad-hoc lowering logic for each message, which seemed like the
>>>    greatest annoyance of your proposal.
>>
>> The fact that you can do that for untyped reads/writes is great.  It
>> means we should only need one lowering function for them.
>> Unfortunately, other messages such as FB writes aren't going to be
>> quite so simple.  I'm not sure what texturing will look like but I'll
>> hazard a guess that they won't be as trivial either.
>>
> No, FB writes and texturing both fit under the same framework just fine.
> I'll port them if people consider it useful.

I'd like to see them ported before I'll be fully convinced that it's
"just fine".
--Jason
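
The four-step flow quoted above (emit a payload-free logical instruction,
split it to a supported SIMD width, then lower it to a message) can be
sketched as a toy model. This is not Mesa code: LogicalLoad, SendMessage,
split_to_width, lower and compile_load are hypothetical names, the header
string is a made-up stand-in for a real message header, and the address is
simplified to one integer per channel.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class LogicalLoad:
    """Hypothetical logical image load: separate sources, no payload."""
    image: str
    address: Tuple[int, ...]  # simplification: one address value per channel
    fmt: str
    dims: int

    @property
    def width(self) -> int:
        # Execution width is taken to be the number of channels.
        return len(self.address)


@dataclass(frozen=True)
class SendMessage:
    """Hypothetical lowered form: a header plus a flat payload."""
    header: str
    payload: Tuple[int, ...]
    width: int


def split_to_width(inst: LogicalLoad, max_width: int) -> List[LogicalLoad]:
    """Step 3: split into chunks no wider than max_width.

    Only per-channel data is sliced; no payload exists yet, so there is
    no header to preserve or rebuild at this stage.
    """
    return [
        LogicalLoad(inst.image, inst.address[i:i + max_width],
                    inst.fmt, inst.dims)
        for i in range(0, inst.width, max_width)
    ]


def lower(inst: LogicalLoad) -> SendMessage:
    """Step 4: build header and payload only after splitting."""
    header = f"{inst.image}:{inst.fmt}:dims={inst.dims}"
    return SendMessage(header, inst.address, inst.width)


def compile_load(inst: LogicalLoad, max_width: int = 8) -> List[SendMessage]:
    """Run the split pass, then the lowering pass."""
    return [lower(part) for part in split_to_width(inst, max_width)]


# A SIMD16 load on hardware assumed to support only SIMD8 for this message.
simd16 = LogicalLoad("img0", tuple(range(16)), "r32ui", 2)
messages = compile_load(simd16)
```

In this sketch the header is generated once per lowered half, which is the
property the logical-opcode approach relies on: splitting never has to guess
how an already-built payload should be divided.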