[Mesa-dev] GLES 2.0 varying algorithm implementation

Thu Feb 23 12:12:21 PST 2012

This set of patches implements the minimal packing algorithm required by the GLES 2.0 specification
(see http://www.khronos.org/registry/gles/specs/2.0/GLSL_ES_Specification_1.0.17.pdf p111,
thanks to Paul Berry for pointing this out).

Currently any scalar or vector varying occupies a full vec4 slot in the output buffer. This means that
drivers can only honor their MAX_VARYING_FLOATS if every varying is a vec4. The following
piglit test shows this:
http://lists.freedesktop.org/archives/piglit/2012-February/001850.html
For instance, if the hardware reports support for 32 varying floats (8 vec4 slots), shaders won't compile
if there are more than 8 varyings, even if they are all floats.
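
To make the arithmetic concrete, here is a small sketch in C. It is not taken from the patches: the
function names are hypothetical and the limit of 32 is just the value used in the example above. It
contrasts the current "one vec4 slot per varying" accounting with a plain component count, which is
the best any packing scheme could achieve.

```c
#include <stdbool.h>

/* Assumed limit, taken from the example above: 32 floats = 8 vec4 slots. */
enum { MAX_VARYING_FLOATS = 32 };

/* Current behaviour: every scalar or vector varying costs one vec4 slot. */
static bool fits_unpacked(unsigned num_varyings)
{
    return num_varyings <= MAX_VARYING_FLOATS / 4;
}

/* Upper bound for any packing scheme: only the component count matters.
 * components[i] is the number of floats in varying i. */
static bool fits_by_component_count(const unsigned *components, unsigned n)
{
    unsigned total = 0;

    for (unsigned i = 0; i < n; i++)
        total += components[i];
    return total <= MAX_VARYING_FLOATS;
}
```

With nine float varyings, fits_unpacked(9) is false although only 9 of the 32 advertised floats are
used, while fits_by_component_count() accepts the same set.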

The minimal packing algorithm required by the spec has several interesting properties:
- If there are enough vec4 slots to store the varyings, it won't attempt to pack anything (as is the case currently).
- Array accesses are left "untouched", meaning there is no swizzling hack to implement on the driver side
(this also means that the size of a varying array is limited to MAX_VARYING_FLOATS/4).
- If there are only vec4, vec2 and scalar varyings (not in arrays), MAX_VARYING_FLOATS can always be
honored.
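
The last property can be illustrated with a simplified sketch in C. This is not the code from the
patches and not the full procedure from the spec: arrays and matrices are left out, scalars are only
counted against the free components instead of being placed column by column, and pack_varyings and
MAX_ROWS are hypothetical names. Rows stand for vec4 slots.

```c
#include <stdbool.h>
#include <string.h>

#define MAX_ROWS 64 /* arbitrary bound for this sketch */

/* Returns true if the non-array varyings fit in num_rows vec4 slots.
 * components[i] is the size (1 to 4) of varying i. */
static bool pack_varyings(const unsigned *components, unsigned n,
                          unsigned num_rows)
{
    unsigned used[MAX_ROWS];  /* components consumed in each row */
    unsigned count[5] = {0};  /* number of varyings of each size */
    unsigned next_row = 0;    /* first row that is still empty */

    if (num_rows > MAX_ROWS)
        return false;
    memset(used, 0, sizeof(used));

    for (unsigned i = 0; i < n; i++) {
        if (components[i] < 1 || components[i] > 4)
            return false;
        count[components[i]]++;
    }

    /* vec4 then vec3: one fresh row each, aligned to the first column. */
    for (unsigned size = 4; size >= 3; size--) {
        for (unsigned i = 0; i < count[size]; i++) {
            if (next_row == num_rows)
                return false;
            used[next_row++] = size;
        }
    }

    /* vec2: fresh rows first, then the z/w half of the highest-numbered
     * row that still has two free components. */
    for (unsigned i = 0; i < count[2]; i++) {
        if (next_row < num_rows) {
            used[next_row++] = 2;
            continue;
        }
        unsigned r = num_rows;
        while (r > 0 && used[r - 1] > 2)
            r--;
        if (r == 0)
            return false;
        used[r - 1] += 2;
    }

    /* A scalar fits in any free component, so only the count matters. */
    unsigned free_components = 0;
    for (unsigned r = 0; r < num_rows; r++)
        free_components += 4 - used[r];
    return count[1] <= free_components;
}
```

In this model, 4 vec4 + 6 vec2 + 4 floats (32 components) fit in 8 rows, whereas 8 vec3 + 1 vec2
(26 components) do not, because a row holding a vec3 has no room left for a vec2. That is why the
guarantee above only covers vec4, vec2 and scalar varyings.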

Of course more sophisticated algorithms can be implemented. However, reshaping arrays or overlapping registers
(vec3/vec3/vec2 inside 2 vec4s) can be costly for SIMD/vector architectures (and the general packing problem is
NP-hard). It might be possible to do optimal packing for scalar architectures, once the TGSI opcodes support it.

Thanks for reviewing.

Regards,
Vincent