[Mesa-dev] Mesa (master): st/mesa: fix incorrect size of UBO declarations
Brian Paul
brianp at vmware.com
Wed Jul 2 16:05:23 PDT 2014
On 07/02/2014 04:31 PM, Ian Romanick wrote:
> On 07/02/2014 07:29 AM, Brian Paul wrote:
>> On 07/02/2014 06:49 AM, Brian Paul wrote:
>>> On 07/02/2014 06:36 AM, Brian Paul wrote:
>>>> On 07/01/2014 08:43 PM, Michel Dänzer wrote:
>>>>> On 02.07.2014 00:43, Brian Paul wrote:
>>>>>> Module: Mesa
>>>>>> Branch: master
>>>>>> Commit: f4b0ab7afd83c811329211eae8167c9bf238870c
>>>>>> URL:
>>>>>> https://urldefense.proofpoint.com/v1/url?u=http://cgit.freedesktop.org/mesa/mesa/commit/?id%3Df4b0ab7afd83c811329211eae8167c9bf238870c&k=oIvRg1%2BdGAgOoM1BIlLLqw%3D%3D%0A&r=lGQMzzTgII0I7jefp2FHq7WtZ%2BTLs8wadB%2BiIj9xpBY%3D%0A&m=X5dPXUSLXb%2B60RW5%2FHCgcIkPXz083MgfZAbiRJFVOxc%3D%0A&s=c9766b9e3797ccad30888c534fae5891f9a7313418a228c81c0528230412cf1c
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> Author: Brian Paul <brianp at vmware.com>
>>>>>> Date: Tue Jul 1 08:17:09 2014 -0600
>>>>>>
>>>>>> st/mesa: fix incorrect size of UBO declarations
>>>>>>
>>>>>> UniformBufferSize is in bytes so we need to divide by 16 to get the
>>>>>> number of constant buffer slots. Also, the ureg_DECL_constant2D()
>>>>>> function takes first..last parameters so we need to subtract one
>>>>>> for the last value.
>>>>>
>>>>> This change broke the GLSL uniform_buffer fs/vs/gs-struct-pad piglit
>>>>> tests with radeonsi:
>>>>>
>>>>> ../../../../src/gallium/auxiliary/gallivm/lp_bld_tgsi.c:306:lp_build_emit_fetch:
>>>>>
>>>>>
>>>>> Assertion `reg->Register.Index <=
>>>>> bld_base->info->file_max[reg->Register.File]' failed.
>>>>>
>>>>> AFAICT reg->Register.Index is 2, and the new code calls
>>>>> ureg_DECL_constant2D() with last=1. This is the TGSI code:
>>>>>
>>>>> FRAG
>>>>> PROPERTY FS_COLOR0_WRITES_ALL_CBUFS 1
>>>>> DCL OUT[0], COLOR
>>>>> DCL CONST[1][0..1]
>>>>> DCL TEMP[0], LOCAL
>>>>> IMM[0] UINT32 {0, 16, 32, 0}
>>>>> 0: MOV TEMP[0].x, CONST[1][0].xxxx
>>>>> 1: MOV TEMP[0].y, CONST[1][1].xxxx
>>>>> 2: MOV TEMP[0].z, CONST[1][2].xxxx
>>>>> 3: MOV TEMP[0].w, CONST[1][2].xxxx
>>>>> 4: MOV OUT[0], TEMP[0]
>>>>> 5: END
>>>>>
>>>>> which results from this GLSL code:
>>>>>
>>>>> #version 140
>>>>>
>>>>> struct S1 {
>>>>> float r;
>>>>> };
>>>>>
>>>>> struct S2 {
>>>>> float g;
>>>>> float b;
>>>>> float a;
>>>>> };
>>>>>
>>>>> struct S {
>>>>> S1 s1;
>>>>> S2 s2;
>>>>> };
>>>>>
>>>>> uniform ubo1 {
>>>>> S s;
>>>>> };
>>>>>
>>>>> void main()
>>>>> {
>>>>> gl_FragColor = vec4(s.s1.r, s.s2.g, s.s2.b, s.s2.a);
>>>>> }
>>>>>
>>>>>
>>>>> Something doesn't seem to add up, but I'm not sure where exactly the
>>>>> problem is or how to fix it.
>>>>
>>>>
>>>> I think there's at least a one GLSL compiler bug exposed here.
>>>>
>>>> From the code fragments above, the z and w components of TEMP[0] are
>>>> getting the same value when they should not.
>>>>
>>>> I reverted Mesa to before my changes and modified the piglit test to use
>>>> different values for the blue/alpha channels and it fails (nvidia
>>>> passes). We were just getting lucky before my changes.
>>>>
>>>> I suspect the code which is laying out the floats in the UBO is doing
>>>> something wrong, and the size of the UBO is miscalculated as a result.
>>>>
>>>> I'll dig a bit more but I think someone more familiar with the compiler
>>>> will need to help.
>>>
>>> The initial IR looks OK:
>>>
>>>
>>> (loop ...
>>>
>>> (assign (xyzw) (var_ref gl_Position) (array_ref (var_ref
>>> vertex_to_gs) (var_ref i) ) )
>>> (declare (temporary ) vec4 vec_ctor)
>>> (assign (x) (var_ref vec_ctor) (record_ref (record_ref
>>> (var_ref s) s1) r) )
>>> (assign (y) (var_ref vec_ctor) (record_ref (record_ref
>>> (var_ref s) s2) g) )
>>> (assign (z) (var_ref vec_ctor) (record_ref (record_ref
>>> (var_ref s) s2) b) )
>>> (assign (w) (var_ref vec_ctor) (record_ref (record_ref
>>> (var_ref s) s2) a) )
>>> (assign (xyzw) (var_ref v) (var_ref vec_ctor) )
>>> (call EmitVertex ())
>>>
>>> ... endloop)
>>>
>>>
>>> But later, after linking, the Z and W components of the vector
>>> constructor seem to get merged. Note: there's only 3 ubo_load
>>> instructions now instead of four:
>>>
>>> ...
>>> (assign (xyzw) (var_ref gl_Position) (array_ref (var_ref
>>> vertex_to_gs) (constant int (0)) ) )
>>> (declare (temporary ) vec4 vec_ctor)
>>> (declare () float cse)
>>> (assign (x) (var_ref cse) (expression float ubo_load (constant
>>> uint (0)) (constant uint (0)) ) )
>>> (assign (x) (var_ref vec_ctor) (var_ref cse) )
>>> (declare () float cse at 3)
>>> (assign (x) (var_ref cse at 3) (expression float ubo_load
>>> (constant uint (0)) (constant uint (16)) ) )
>>> (assign (y) (var_ref vec_ctor) (var_ref cse at 3) )
>>> (declare () float cse at 4)
>>> (assign (x) (var_ref cse at 4) (expression float ubo_load
>>> (constant uint (0)) (constant uint (32)) ) )
>>> (assign (z) (var_ref vec_ctor) (var_ref cse at 4) )
>>> (assign (w) (var_ref vec_ctor) (var_ref cse at 4) )
>>> (assign (xyzw) (var_ref v) (var_ref vec_ctor) )
>>> (assign (xyzw) (var_ref v at 5) (var_ref v) )
>>> (assign (xyzw) (var_ref gl_Position at 6) (var_ref gl_Position) )
>>> (emit-vertex (constant int (0)) )
>>>
>>> I'm not sure I can debug this much further right now.
>>
>> I think the problem is in lower_ubo_reference_visitor::handle_rvalue()
>> in lower_ubo_reference.cpp
>>
>> If I disable the "const_offset = glsl_align(const_offset,
>> max_field_align);" near line 231 things seem to work.
>>
>> Something is wrong in the logic that does struct/field alignment in that
>> function.
>
> Oof. There are a number of bugs in this area. I've been looking at one
> for the last few days, and I've just about got it resolved. Basically.
>
> uniform U {
> layout(row_major) mat3x4 m;
> vec4 v;
> } u;
>
> doesn't layout u.m as row-major. This only occurs if there is an
> instance name. :(
>
> I can dig into this bug once I'm done with the matrix bug.
>
>> Eric wrote the original code, maybe he can take a look?
>>
>> I've also been reading the GL_ARB_uniform_buffer_object spec to
>> understand the rules of layout "std140". For a structure it says "the
>> base alignmentof the structure is <N>, where <N> is the largest base
>> alignment value of any of its members, and rounded up to the base
>> alignment of a vec4."
>>
>> So it sounds like all structs should start at 16-byte boundaries. So
>> for the piglit test in question, I'd expect the UBO block to be 32
>> bytes. But both Mesa and nvidia compute the UBO size to be 16 bytes.
Actually, I had a mistake there. Mesa reports 32 while nvidia reports
16 bytes.
>
> Hmm... do both drivers give 32 for
>
> struct S1 { float f; };
> struct S2 { vec4 v; };
>
> uniform U {
> S1 s1;
> S2 s2;
> };
Mesa: 32
nvidia: 32
>
> Also, the ES3.1 and GL4.4 specs have some changes to the std140 rules to
> make them more clear. Do those specs agree with the extension spec?
I don't have time to dig into that right now.
-Brian
More information about the mesa-dev
mailing list