[Mesa-dev] SSBO + access qualifiers

Mon Dec 14 20:57:46 PST 2015

On Mon, Dec 14, 2015 at 11:44 PM, Jason Ekstrand <jason at jlekstrand.net> wrote:
> On Mon, Dec 14, 2015 at 8:03 PM, Ilia Mirkin <imirkin at alum.mit.edu> wrote:
>> On Mon, Dec 14, 2015 at 10:50 PM, Jason Ekstrand <jason at jlekstrand.net> wrote:
>>> On Mon, Dec 14, 2015 at 4:10 PM, Ilia Mirkin <imirkin at alum.mit.edu> wrote:
>>>> Hello,
>>>>
>>>> I was going to add support for the various volatile/etc annotations in
>>>> my gallium impl (which is fairly far along), but it seems like those
>>>> are currently being dropped on the floor by lower_ubo_references, and
>>>> NIR has no support for them either. Both in GLSL IR and NIR, the
>>>> variable is what holds the access annotation (volatile, etc). However
>>>> the ssbo intrinsics are all about a particular binding and offset,
>>>> without any reference to the variable.
>>>>
>>>> What's the right way of handling this? (And do any tests exist that
>>>> would be sensitive to getting such things wrong?)
>>>
>>> First off, why is it that way?  Well, because most of the IRs in mesa
>>> don't have a memory model capable of handling these sorts of things.
>>> Those that do (LLVM is the only one I'm aware of) can't handle the GL
>>> packing rules.  The result is that we basically have to lower
>>> everything away to byte-offset load/store intrinsics.
>>>
>>> What do we do about it?  My inclination would be to either add two new
>>> intrinsics for load/store_ssbo_volatile or to add a new constant
>>> boolean "volatile" parameter to the current intrinsics.  If a
>>> load/store happens on a volatile things, you get the volatile version,
>>> otherwise you get the plane version.  Then backends can know that they
>>> are free to reorder, combine, etc. non-volatile load/store operations
>>> as per their memory model and the provided barriers.  If they
>>> encounter a volatile load/store, they can't do anything with it and
>>> just have to do the memory op.
>>>
>>> Is that reasonable?
>>
>> Well, I don't really care much about the NIR-side of it, but I think
>> at the very least coherent and volatile need to be exposed. I don't
>> much care about restrict, that's a bit too fancy for me, but no harm
>> in passing it through. TBH I don't get the point of
>> readonly/writeonly. FWIW this is what the nvidia hw is capable of:
>> http://docs.nvidia.com/cuda/parallel-thread-execution/#cache-operators
>> (I know it's PTX, but the actual hw is basically the same.)
>>
>> I think that volatile = cv, coherent = cg, and everything else = ca/cs
>> based on some clever heuristic (aka I'll always pick one of those
>> until a situation where that doesn't work well arises).
>
> Right.  I don't actually know what we can do with our HW.  I was
> mostly thinking about what optimization passes can or cannot do with a
> variable.  It sounds like, whatever we do, you want it passed through
> somehow.  How about we just add a little well-defined bitfield as an
> extra argument to the SSBO load/store intrinsics?  The fact that it
> gets applied to a variable or a struct or a member or whatever is kind
> of a moot point to the backend compiler.

An extra argument with a mask or enum is precisely what I was
expecting to find when looking at the lower_ubo_references pass when I
was rather surprised to see that the whole thing had been overlooked.

Cheers,

  -ilia