[Mesa-dev] [RFC PATCH 03/26] glsl: introduce new base types for bindless samplers/images

Wed Apr 19 09:09:49 UTC 2017

On 19.04.2017 10:10, Samuel Pitoiset wrote:
>
>
> On 04/19/2017 08:40 AM, Nicolai Hähnle wrote:
>> On 19.04.2017 00:01, Ilia Mirkin wrote:
>>> On Tue, Apr 18, 2017 at 5:26 PM, Nicolai Hähnle <nhaehnle at gmail.com>
>>> wrote:
>>>> On 18.04.2017 21:49, Dave Airlie wrote:
>>>>>
>>>>> On 19 April 2017 at 05:30, Samuel Pitoiset <samuel.pitoiset at gmail.com>
>>>>> wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 04/18/2017 08:14 PM, Nicolai Hähnle wrote:
>>>>>>>
>>>>>>>
>>>>>>> On 11.04.2017 18:48, Samuel Pitoiset wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> Bindless sampler/image types are really different from the existing
>>>>>>>> sampler/image types. They are considered 64-bit unsigned integers,
>>>>>>>> they can be declared as temporary, shader inputs/outputs and are
>>>>>>>> non-opaque types.
>>>>>>>>
>>>>>>>> For these reasons, it looks more convenient to introduce new
>>>>>>>> internal base types to the GLSL compiler, called
>>>>>>>> GLSL_TYPE_BINDLESS_SAMPLER and respectively
>>>>>>>> GLSL_TYPE_BINDLESS_IMAGE.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Sorry for taking so long to get to this series, but could you
>>>>>>> explain
>>>>>>> the
>>>>>>> rationale here a bit more?
>>>>>>
>>>>>>
>>>>>>
>>>>>> No worries, all feedbacks are always welcome, even late. :)
>>>>>>
>>>>>> So, the bindless sampler/image types introduced in
>>>>>> ARB_bindless_texture
>>>>>> are
>>>>>> really different from the "standard" ones.
>>>>>>
>>>>>> They are considered as 64-bit unsigned integers (not 32-bit) and
>>>>>> they are
>>>>>> non-opaque types. The latter means they can be declared as temporary
>>>>>> variables, as shader inputs/outputs, inside an interface block, etc.
>>>>
>>>>
>>>> I don't know that that's the best way to think about it at the GLSL
>>>> level.
>>>> Mostly they're still opaque types, it's just that they can be
>>>> explicitly
>>>> converted to/from 64-bit ints (or uvec2... not having an interaction
>>>> with
>>>> GL_ARB_gpu_shader_int64 seems like a spec oversight).
>>>>
>>>>
>>>>>> That said, the current sampler/image types are opaque (cf,
>>>>>> glsl_type::is_opaque()) and it seemed quite impossible to change the
>>>>>> glsl_type helpers to fit my needs.
>>>>
>>>>
>>>> I see no is_opaque, maybe you mean contains_opaque? I agree that it's
>>>> annoying that the restrictions expressed in ast_to_hir.cpp need to be
>>>> modified. What other helpers are an issue?
>>>>
>>>>
>>>>>> I tried many different solutions before figuring this one which seems
>>>>>> better
>>>>>> for some reasons:
>>>>>>
>>>>>> - easy to make bindless sampler/image types 64-bit unsigned int
>>>>>> - easy to make bindless sampler/image types non-opaque
>>>>>> - should avoid breakage because the base type is different
>>>>>> - reduce the amount of changes in most places in the compiler
>>>>
>>>>
>>>> I still don't see a *positive* justification for having the two
>>>> different
>>>> type families. Where do you actually need to *distinguish* between
>>>> bindless
>>>> and bound samplers in the compiler? The spec really doesn't make that
>>>> distinction at all, except in one single place:
>>>>
>>>>    "These modifiers control whether default-block uniforms of
>>>>     the corresponding types may have their values set via both
>>>>     UniformHandle* and Uniform1i (bindless_sampler and
>>>>     bindless_image) or only via Uniform1i (bound_sampler and
>>>>     bound_image)."
>>>>
>>>> It seems like the only place to distinguish between the two is actually
>>>> *outside* the compiler, in the uniforms API. And that's not even
>>>> really have
>>>> about the *type* of the variable; it's about the variable's memory
>>>> layout.
>>>> So it really behaves more similarly to the layout qualifiers (offset,
>>>> alignment) you have in SSBO and UBO blocks rather than a property of
>>>> a type.
>>>>
>>>> So as I see it, this code is messing with the type system for no good
>>>> reason, and doing so in a fundamentally incoherent way that goes
>>>> against all
>>>> that is good and sane in compiler design. And I think it shows.
>>>> Consider the
>>>> following example shader:
>>>>
>>>>    layout (bound_sampler) uniform sampler2D tex_bound;
>>>>    layout (bindless_sampler) uniform sampler2D tex_bindless;
>>>>
>>>>    vec4 f(sampler2D s, ...)
>>>>    {
>>>>       return texture2D(s, ...);
>>>>    }
>>>>
>>>>    void main()
>>>>    {
>>>>       vec4 a = f(tex_bound, ...);
>>>>       vec4 b = f(tex_bindless, ...);
>>>>    }
>>>>
>>>> What's the type of the parameter `s' of function `f'? Will this
>>>> compile, and
>>>> if so, how?
>>>>
>>>> I guess the answer is: as is, the type is sampler2DBindless, which
>>>> makes no
>>>> sense whatsoever if the function f appears in a shader which *doesn't*
>>>> enable ARB_bindless_texture (which is allowed!). But having the type
>>>> of f
>>>> depend on whether the extension is enabled makes even *less* sense.
>>>>
>>>> And does it compile? Well, I don't recall seeing code that would
>>>> automatically do a type conversion -- you probably either end up
>>>> generating
>>>> an incorrect compiler error, or you end up with IR that assigns a
>>>> value of
>>>> bindless-type to a variable of non-bindless-type, or vice versa.
>>>>
>>>> None of this gives me confidence from a compiler design perspective.
>>>> What's
>>>> left are some vague implementation concerns that I don't understand
>>>> yet,
>>>> admittedly in part because I haven't tried to implement the
>>>> alternative :-)
>>>>
>>>>
>>>>>> At the Gallium level, the changes are really small. Basically, if the
>>>>>> type
>>>>>> is a bindless sampler, the ir_dereference variable is visited and
>>>>>> it can
>>>>>> be
>>>>>> considered as PROGRAM_UNIFORM or PROGRAM_TEMPORARY (for shader
>>>>>> inputs/outputs).
>>>>>>
>>>>>> Hopefully, you are convinced now. :)
>>>>>
>>>>>
>>>>> When I did my initial implementation I went with merged types, I don't
>>>>> think
>>>>> it was a good idea either. Apart from making old non-bindless
>>>>> samples take
>>>>> up
>>>>> twice as much space, there's a lot or corner cases.
>>>>
>>>>
>>>> Non-bindless samplers taking up twice as much space for the
>>>> accounting of
>>>> the default uniform block is something the spec explicitly calls out as
>>>> being okay, though.
>>>>
>>>> Even if we decide that it isn't, it's really something that belongs
>>>> into
>>>> struct gl_uniform_storage during linking, because it's clearly a
>>>> property
>>>> of... uniform storage layout.
>>>>
>>>>
>>>>> Unless we hit some problem in the future where they need to be the
>>>>> same
>>>>> type I'd say this is preferable.
>>>>
>>>>
>>>> I hope the code sample above is convincing evidence that using
>>>> different
>>>> types for the same thing just makes no sense.
>>>
>>> That's one quirk of this approach - you have to create a combinatorial
>>> explosion of user (and built-in, as you've seen) functions for each
>>> sampler they accept. So if the function takes 5 sampler arguments, you
>>> have to generate 32 copies. However I suspect such functions are rare.
>>> Not sure if Samuel did this, but if he didn't, that's a necessary
>>> component of this approach.
>>>
>>> Having it all be one type definitely creates a lot of complications
>>> which you're not seeing here as they're all avoided. Neither approach
>>> is perfect, but I think at the cost of the extra types you end up with
>>> a much more understandable compiler. Here are some odd cases to deal
>>> with for the single type approach:
>>>
>>> 1. local variables: so doing something like
>>>
>>>     sampler2D foo = bar;
>>>
>>> is not legal without bindless. So imagine you have a function that
>>> uses a sampler local variable,
>>>
>>>     vec4 f(sampler2D s, ...)
>>>     {
>>>        sampler2D tex = foo ? s : t;
>>>        return texture2D(tex, ...);
>>>     }
>>>
>>> This function can implicitly not be called with a bound sampler.
>>> Expressing this in a way that the compiler can understand seems like
>>> it couldn't be done without the types being different, or some deeper
>>> analysis being done on the uses of the function's arguments. With the
>>> type explosion approach, this magically works out since there's no
>>> ambiguity about whether a particular sampler is bindless or not.
>>
>> I'm not sure what you're saying here. I suspect we're actually not in
>> agreement about what the spec allows. My understanding is that calling
>> that function f with a sampler taken from a bound_sampler uniform is
>> perfectly legal, i.e. the following code should compile and work as
>> expected:
>>
>>    layout (bound_sampler) uniform sampler2D tex;
>>
>>    ...
>>
>>    f(tex, ...);
>
> How this can work? Passing a uniform into out or inout parameter doesn't
> seem to be allowed. At least, the following code doesn't even compile
> with NV.
>
> #version 450
> #extension GL_ARB_bindless_texture: require
>
> layout (bound_sampler) uniform sampler2D tex;
>
> void f(inout sampler2D p) {}
>
> void main()
> {
>     f(tex);
> }

Well, uniforms are read-only variables, so you cannot assign to them 
either directly or by passing them to an (in)out parameter. So yes, this 
code shouldn't compile. It has nothing to do with bindless vs. bound.

In Ilia's example, p was a plain (in) parameter.

>> When the compiler sees a local variable assignment
>>
>>    sampler2D foo = bar;
>>
>> the legality can be decided locally. It is always okay when the
>> bindless extension is enabled, and it's never okay without. Again, the
>> "bound_sampler" is not a property of the sampler value itself, it is
>> "merely" a property of how the sampler is stored (memory layout).
>
> How about interface blocks, structures, inputs/outputs?

What about them specifically? You can never assign to inputs; again, 
this is unrelated to bindless vs. bound. When ARB_bindless_texture is 
enabled, you can *always* assign to out sampler variables, regardless of 
where the right-hand side of the assignment comes from.

For example, the following code should be valid in a vertex shader:

   layout (bound_sampler) uniform sampler2D u_sampler;
   out sampler2D out_sampler;

   void main()
   {
     out_sampler = u_sampler;
   }

> Yeah, we could
> also decide locally (I'm sure), but having something like:
>
> if (sampler && bindless)
>   allowed
> else
>   disallowed
>
> in many places in the compiler looks quite ugly.

As compared to the ugliness of type duplication everywhere?

>> At least I haven't seen anything in the spec to contradict that.
>>
>> Side note: As far as I can tell, you're even allowed to do:
>>
>>    layout (bound_sampler) uniform sampler2D tex;
>>
>>    ...
>>
>>    uvec2 handle = uvec2(tex);
>
> I think this is correct as well, at least it compiles with NV. However,
> it makes just no sense to me. The spec says:
>
> "The error INVALID_OPERATION is generated by UniformHandleui64{v}ARB if
> the sampler or image uniform being updated has the "bound_sampler" or
> "bound_image" layout qualifier.""
>
> So, you can only assign a 32-bit uniform here, converting to a pair of
> 32-bit will make the .y component 0.

The way I understand it, converting to uvec2 should give a pointer to 
the descriptor (at least for the GCN backend).

So bound_samplers are implemented as simply old-style samplers living in 
the good old descriptor table. When they are converted to uvec2, we need 
to emit code that returns a pointer into that descriptor table. That's a 
fully 64-bit value.

Cheers,
Nicolai

>> Personally, I think that makes a lot of sense at least for GCN:
>> texture handles are simply pointers to descriptors. Unfortunately, it
>> seems the spec isn't very explicit about that, and it's also not very
>> explicit about the extent of validity of the returned handle. Though I
>> think you can kind of infer that the handle is not expected to be
>> valid outside the shader invocation, since only handles returned by
>> Get{Texture,TextureSampler}HandleARB are said to be "stable".
>>
>>
>>> Furthermore, even without the separate function issue, the fact that
>>> local variables are bindless-only creates a host of issues for
>>> figuring out whether an expression is legal. Perhaps this is doable
>>> with a deep scan of any expression assigning to a local variable. But
>>> things with temporary expression like texture2D(foo ? s : t, ...)
>>> could still be annoying. (All solvable, I'm sure.)
>>>
>>> 2. uniform storage: this is a weaker argument, but I feel like it
>>> needs to be said anyways -- the uniform layout code is subtle and
>>> quick to anger. Even if it is modified appropriately, seems unlikely
>>> for there to be an easily achievable way that non-bindless GPUs are
>>> allowed to not have to waste uniform space. IIRC these uniforms are
>>> allocated real uniform storage (despite never being used), which will
>>> use up precious uniform space for DX9 and earlier GPUs. However if
>>> this were the only thing, it'd be too weak. More of a "pile on top"
>>> kind of thing.
>>>
>>> FWIW I'm in favor of the separate type approach, assuming that we all
>>> agree that the user function explosion is acceptable. IMHO it is
>>> though. [And I had been discussing this approach and the alternative
>>> of doing a single type with Samuel for a while before this patchset
>>> was sent out.]
>>
>> Yeah, from your email I get the impression that there's a genuine
>> conflict of readings of the spec. If you're right and I'm indeed
>> reading the spec wrong, I'd be inclined to agree with you. What you're
>> describing in point 1 really *are* type-level concerns. I just think
>> that I'm right about the spec and you're not ;-)
>>
>> FWIW, a big part of what makes me quite certain is that the spec
>> *replaces* section 4.1.7 (Samplers) in its entirety rather than adding
>> to it.
>
> Well, if we have questions I can ask the spec authors again (they seem
> to be quite responsive). I think we might need some clarification...
>
> Cheers.
>
>>
>> Cheers,
>> Nicolai
>>
>>>
>>> Cheers,
>>>
>>>   -ilia
>>>
>>
>>

-- 
Lerne, wie die Welt wirklich ist,
Aber vergiss niemals, wie sie sein sollte.