[Mesa-dev] Mesa (master): st/mesa: fix incorrect size of UBO declarations

Wed Jul 2 15:31:59 PDT 2014

On 07/02/2014 07:29 AM, Brian Paul wrote:
> On 07/02/2014 06:49 AM, Brian Paul wrote:
>> On 07/02/2014 06:36 AM, Brian Paul wrote:
>>> On 07/01/2014 08:43 PM, Michel Dänzer wrote:
>>>> On 02.07.2014 00:43, Brian Paul wrote:
>>>>> Module: Mesa
>>>>> Branch: master
>>>>> Commit: f4b0ab7afd83c811329211eae8167c9bf238870c
>>>>> URL:
>>>>> https://urldefense.proofpoint.com/v1/url?u=http://cgit.freedesktop.org/mesa/mesa/commit/?id%3Df4b0ab7afd83c811329211eae8167c9bf238870c&k=oIvRg1%2BdGAgOoM1BIlLLqw%3D%3D%0A&r=lGQMzzTgII0I7jefp2FHq7WtZ%2BTLs8wadB%2BiIj9xpBY%3D%0A&m=X5dPXUSLXb%2B60RW5%2FHCgcIkPXz083MgfZAbiRJFVOxc%3D%0A&s=c9766b9e3797ccad30888c534fae5891f9a7313418a228c81c0528230412cf1c
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> Author: Brian Paul <brianp at vmware.com>
>>>>> Date:   Tue Jul  1 08:17:09 2014 -0600
>>>>>
>>>>> st/mesa: fix incorrect size of UBO declarations
>>>>>
>>>>> UniformBufferSize is in bytes so we need to divide by 16 to get the
>>>>> number of constant buffer slots.  Also, the ureg_DECL_constant2D()
>>>>> function takes first..last parameters so we need to subtract one
>>>>> for the last value.
>>>>
>>>> This change broke the GLSL uniform_buffer fs/vs/gs-struct-pad piglit
>>>> tests with radeonsi:
>>>>
>>>> ../../../../src/gallium/auxiliary/gallivm/lp_bld_tgsi.c:306:lp_build_emit_fetch:
>>>>
>>>>
>>>> Assertion `reg->Register.Index <=
>>>> bld_base->info->file_max[reg->Register.File]' failed.
>>>>
>>>> AFAICT reg->Register.Index is 2, and the new code calls
>>>> ureg_DECL_constant2D() with last=1. This is the TGSI code:
>>>>
>>>> FRAG
>>>> PROPERTY FS_COLOR0_WRITES_ALL_CBUFS 1
>>>> DCL OUT[0], COLOR
>>>> DCL CONST[1][0..1]
>>>> DCL TEMP[0], LOCAL
>>>> IMM[0] UINT32 {0, 16, 32, 0}
>>>>    0: MOV TEMP[0].x, CONST[1][0].xxxx
>>>>    1: MOV TEMP[0].y, CONST[1][1].xxxx
>>>>    2: MOV TEMP[0].z, CONST[1][2].xxxx
>>>>    3: MOV TEMP[0].w, CONST[1][2].xxxx
>>>>    4: MOV OUT[0], TEMP[0]
>>>>    5: END
>>>>
>>>> which results from this GLSL code:
>>>>
>>>> #version 140
>>>>
>>>> struct S1 {
>>>>          float r;
>>>> };
>>>>
>>>> struct S2 {
>>>>          float g;
>>>>          float b;
>>>>          float a;
>>>> };
>>>>
>>>> struct S {
>>>>         S1 s1;
>>>>         S2 s2;
>>>> };
>>>>
>>>> uniform ubo1 {
>>>>          S s;
>>>> };
>>>>
>>>> void main()
>>>> {
>>>>          gl_FragColor = vec4(s.s1.r, s.s2.g, s.s2.b, s.s2.a);
>>>> }
>>>>
>>>>
>>>> Something doesn't seem to add up, but I'm not sure where exactly the
>>>> problem is or how to fix it.
>>>
>>>
>>> I think there's at least a one GLSL compiler bug exposed here.
>>>
>>>  From the code fragments above, the z and w components of TEMP[0] are
>>> getting the same value when they should not.
>>>
>>> I reverted Mesa to before my changes and modified the piglit test to use
>>> different values for the blue/alpha channels and it fails (nvidia
>>> passes).  We were just getting lucky before my changes.
>>>
>>> I suspect the code which is laying out the floats in the UBO is doing
>>> something wrong, and the size of the UBO is miscalculated as a result.
>>>
>>> I'll dig a bit more but I think someone more familiar with the compiler
>>> will need to help.
>>
>> The initial IR looks OK:
>>
>>
>> (loop ...
>>
>>          (assign  (xyzw) (var_ref gl_Position)  (array_ref (var_ref
>> vertex_to_gs) (var_ref i) ) )
>>          (declare (temporary ) vec4 vec_ctor)
>>          (assign  (x) (var_ref vec_ctor)  (record_ref (record_ref
>> (var_ref s)  s1)  r) )
>>          (assign  (y) (var_ref vec_ctor)  (record_ref (record_ref
>> (var_ref s)  s2)  g) )
>>          (assign  (z) (var_ref vec_ctor)  (record_ref (record_ref
>> (var_ref s)  s2)  b) )
>>          (assign  (w) (var_ref vec_ctor)  (record_ref (record_ref
>> (var_ref s)  s2)  a) )
>>          (assign  (xyzw) (var_ref v)  (var_ref vec_ctor) )
>>          (call EmitVertex  ())
>>
>> ... endloop)
>>
>>
>> But later, after linking, the Z and W components of the vector
>> constructor seem to get merged.  Note: there's only 3 ubo_load
>> instructions now instead of four:
>>
>> ...
>>        (assign  (xyzw) (var_ref gl_Position)  (array_ref (var_ref
>> vertex_to_gs) (constant int (0)) ) )
>>        (declare (temporary ) vec4 vec_ctor)
>>        (declare () float cse)
>>        (assign  (x) (var_ref cse)  (expression float ubo_load (constant
>> uint (0)) (constant uint (0)) ) )
>>        (assign  (x) (var_ref vec_ctor)  (var_ref cse) )
>>        (declare () float cse at 3)
>>        (assign  (x) (var_ref cse at 3)  (expression float ubo_load
>> (constant uint (0)) (constant uint (16)) ) )
>>        (assign  (y) (var_ref vec_ctor)  (var_ref cse at 3) )
>>        (declare () float cse at 4)
>>        (assign  (x) (var_ref cse at 4)  (expression float ubo_load
>> (constant uint (0)) (constant uint (32)) ) )
>>        (assign  (z) (var_ref vec_ctor)  (var_ref cse at 4) )
>>        (assign  (w) (var_ref vec_ctor)  (var_ref cse at 4) )
>>        (assign  (xyzw) (var_ref v)  (var_ref vec_ctor) )
>>        (assign  (xyzw) (var_ref v at 5)  (var_ref v) )
>>        (assign  (xyzw) (var_ref gl_Position at 6)  (var_ref gl_Position) )
>>        (emit-vertex (constant int (0)) )
>>
>> I'm not sure I can debug this much further right now.
> 
> I think the problem is in lower_ubo_reference_visitor::handle_rvalue()
> in lower_ubo_reference.cpp
> 
> If I disable the "const_offset = glsl_align(const_offset,
> max_field_align);" near line 231 things seem to work.
> 
> Something is wrong in the logic that does struct/field alignment in that
> function.

Oof.  There are a number of bugs in this area.  I've been looking at one
for the last few days, and I've just about got it resolved.  Basically.

    uniform U {
        layout(row_major) mat3x4 m;
        vec4 v;
    } u;

doesn't layout u.m as row-major.  This only occurs if there is an
instance name. :(

I can dig into this bug once I'm done with the matrix bug.

> Eric wrote the original code, maybe he can take a look?
> 
> I've also been reading the GL_ARB_uniform_buffer_object spec to
> understand the rules of layout "std140".  For a structure it says "the
> base alignmentof the structure is <N>, where <N> is the largest base
> alignment value of any of its members, and rounded up to the base
> alignment of a vec4."
> 
> So it sounds like all structs should start at 16-byte boundaries.  So
> for the piglit test in question, I'd expect the UBO block to be 32
> bytes.  But both Mesa and nvidia compute the UBO size to be 16 bytes.

Hmm... do both drivers give 32 for

    struct S1 { float f; };
    struct S2 { vec4 v; };

    uniform U {
        S1 s1;
        S2 s2;
    };

Also, the ES3.1 and GL4.4 specs have some changes to the std140 rules to
make them more clear.  Do those specs agree with the extension spec?

> -Brian
> 
> _______________________________________________
> mesa-dev mailing list
> mesa-dev at lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev