[Mesa-dev] [PATCH v5 25/70] glsl: Add std430 related member functions to glsl_type class

Wed Sep 16 01:52:32 PDT 2015

On 16/09/15 10:39, Ilia Mirkin wrote:
> On Wed, Sep 16, 2015 at 3:45 AM, Samuel Iglesias Gonsálvez
> <siglesias at igalia.com> wrote:
>>
>>
>> On 16/09/15 09:11, Ilia Mirkin wrote:
>>> On Wed, Sep 16, 2015 at 1:14 AM, Samuel Iglesias Gonsálvez
>>> <siglesias at igalia.com> wrote:
>>>>
>>>>
>>>> On 15/09/15 21:03, Jordan Justen wrote:
>>>>> On 2015-09-10 22:48:55, Samuel Iglesias Gonsálvez wrote:
>>>>>> On 10/09/15 20:13, Jordan Justen wrote:
>>>>>>> On 2015-09-10 06:35:41, Iago Toral Quiroga wrote:
>>>>>>>> From: Samuel Iglesias Gonsalvez <siglesias at igalia.com>
>>>>>>>>
>>>>>>>> They are used to calculate size, base alignment and array stride values
>>>>>>>> for a glsl_type following std430 rules.
>>>>>>>>
>>>>>>>> Signed-off-by: Samuel Iglesias Gonsalvez <siglesias at igalia.com>
>>>>>>>> ---
>>>>>>>>  src/glsl/glsl_types.cpp | 209 ++++++++++++++++++++++++++++++++++++++++++++++++
>>>>>>>>  src/glsl/glsl_types.h   |  19 +++++
>>>>>>>>  2 files changed, 228 insertions(+)
>>>>>>>>
>>>>>>>> diff --git a/src/glsl/glsl_types.cpp b/src/glsl/glsl_types.cpp
>>>>>>>> index 755618a..d97991a 100644
>>>>>>>> --- a/src/glsl/glsl_types.cpp
>>>>>>>> +++ b/src/glsl/glsl_types.cpp
>>>>>>>> @@ -1357,6 +1357,215 @@ glsl_type::std140_size(bool row_major) const
>>>>>>>>     return -1;
>>>>>>>>  }
>>>>>>>>
>>>>>>>> +unsigned
>>>>>>>> +glsl_type::std430_base_alignment(bool row_major) const
>>>>>>>> +{
>>>>>>>> +
>>>>>>>> +   unsigned N = is_double() ? 8 : 4;
>>>>>>>> +
>>>>>>>> +   /* (1) If the member is a scalar consuming <N> basic machine units, the
>>>>>>>> +    *     base alignment is <N>.
>>>>>>>> +    *
>>>>>>>> +    * (2) If the member is a two- or four-component vector with components
>>>>>>>> +    *     consuming <N> basic machine units, the base alignment is 2<N> or
>>>>>>>> +    *     4<N>, respectively.
>>>>>>>> +    *
>>>>>>>> +    * (3) If the member is a three-component vector with components consuming
>>>>>>>> +    *     <N> basic machine units, the base alignment is 4<N>.
>>>>>>>> +    */
>>>>>>>> +   if (this->is_scalar() || this->is_vector()) {
>>>>>>>> +      switch (this->vector_elements) {
>>>>>>>> +      case 1:
>>>>>>>> +         return N;
>>>>>>>> +      case 2:
>>>>>>>> +         return 2 * N;
>>>>>>>> +      case 3:
>>>>>>>> +      case 4:
>>>>>>>> +         return 4 * N;
>>>>>>>> +      }
>>>>>>>> +   }
>>>>>>>> +
>>>>>>>> +   /* OpenGL 4.30 spec, section 7.6.2.2 "Standard Uniform Block Layout":
>>>>>>>> +    *
>>>>>>>> +    * "When using the "std430" storage layout, shader storage
>>>>>>>> +    * blocks will be laid out in buffer storage identically to uniform and
>>>>>>>> +    * shader storage blocks using the "std140" layout, except that the base
>>>>>>>> +    * alignment of arrays of scalars and vectors in rule (4) and of structures
>>>>>>>
>>>>>>> Looking at the 4.3 spec (and 4.5), it actually adds "and stride"
>>>>>>> following "base alignment". The extension spec *does not* have the
>>>>>>> "and stride" text.
>>>>>>>
>>>>>>
>>>>>> OK. If you agree, I will keep OpenGL 4.3 (and later) spec wording in all
>>>>>> the places where this snippet is pasted.
>>>>>>
>>>>>>> This seems to be an inconsistency between the extension spec and the
>>>>>>> actual spec, but the OpenGL spec form would produce more tightly
>>>>>>> packed arrays.
>>>>>>>
>>>>>>> Maybe we want to confirm what another implementation does?
>>>>>>
>>>>>> Both NVIDIA and ATI proprietary drivers don't round up the stride of
>>>>>> arrays of vectors to a multiple of a vec4 size, i.e., they are following
>>>>>> the OpenGL spec. For example: for an array of vec2, they are returning
>>>>>> an stride value of 8, not 16 as in std140.
>>>>>
>>>>> Well, my concern was that the 'and stride' part might mean that vec3
>>>>> array stride should be 12 rather than 16. But, I tested NVidia, and
>>>>> they seem to use a stride of 16 for a vec3 array. So, I think your
>>>>> interpretation is correct.
>>>>>
>>>>> I still say we could still use an update to idr's ubo-lolz branch to
>>>>> handle ssbo and std430, but this would also involve extending shader
>>>>> runner to better support ssbo.
>>>>>
>>>>
>>>> I have already done that work. I have a ubo-lolz modified branch [0]
>>>> with an initial support of SSBOs and std430.
>>>>
>>>> About ssbo support for shader_runner, I have sent a couple of patches to
>>>> piglit [1] and I plan to send a new version of them today with a generic
>>>> approach (so it is not only for SSBOs but for other interface types
>>>> defined in ARB_program_interface_query extension).
>>>>
>>>> FWIW, I executed [0] with no errors during 15 minutes.
>>>
>>> As way of validation, have you tried running your modified script
>>> against any other drivers? They may well have bugs in them as well,
>>> but it should be possible to determine if the bug is in the script or
>>> the other impl, should they not match up.
>>>
>>>   -ilia
>>>
>>
>> I tested it on NVIDIA proprietary driver version 352.21. It has an issue
>> when we query shader storage block members when they are arrays of
>> structs and the index is different than zero -> it doesn't find them as
>> active. For example:
>>
>> struct B {
>>         vec4 a[2];
>> }
>>
>> layout(std430) buffer Block {
>>         B[2] s;
>>         vec3 v;
>> };
>>
>> NVIDIA marked v, s[0].a[0] and s[0].a[1] as active but s[1].a[0] and
>> s[0].a[1] as inactive even when they are referenced in main().
>>

... but s[1].a[0] and s[*1*].a[1] as inactive ...

>> I have not checked yet what ARB_program_interface_query spec says
>> regarding to that to see if it is a bug in our driver or in NVIDIA's.
>>
>> However, I modified by hand the failed autogenerated tests removing the
>> queries to those "inactive" variables. As we are interested in offsets,
>> arrays strides and so on, polling index zero of all the top level array
>> of struct members is enough. With that modifications, those tests pass.
>>
>> I plan to try to adapt my script to NVIDIA behavior so I can validate it
>> properly. However, these preliminary results indicate that I don't have
>> made any big mistake.
> 
> From the spec:
> 
>       * For an active variable declared as an array of basic types, a single
>         entry will be generated, with its name string formed by concatenating
>         the name of the array and the string "[0]".
> 
>       * For an active shader storage block member declared as an array, an
>         entry will be generated only for the first array element, regardless
>         of its type.  For arrays of aggregate types, the enumeration rules are
>         applied recursively for the single enumerated array element.
> 
> I find this stuff very difficult to interpret properly, TBH, but I
> think the above may be relevant.
> 
>   -ilia
> 

OK, I will take a look at it as it mentions explicitly active shader
storage block members declared as an array. It is possible that we have
a bug in our code.

Thanks,

Sam