[Mesa-dev] [PATCH v2 11/15] glsl/linker: dvec3/dvec4 may consume twice input vertex attributes

Thu May 12 22:42:23 UTC 2016

On Thursday, May 12, 2016 8:28:17 PM PDT Antia Puentes wrote:
> From: "Juan A. Suarez Romero" <jasuarez at igalia.com>
> 
> From the GL 4.5 core spec, section 11.1.1 (Vertex Attributes):
> 
> "A program with more than the value of MAX_VERTEX_ATTRIBS
> active attribute variables may fail to link, unless
> device-dependent optimizations are able to make the program
> fit within available hardware resources. For the purposes
> of this test, attribute variables of the type dvec3, dvec4,
> dmat2x3, dmat2x4, dmat3, dmat3x4, dmat4x3, and dmat4 may
> count as consuming twice as many attributes as equivalent
> single-precision types. While these types use the same number
> of generic attributes as their single-precision equivalents,
> implementations are permitted to consume two single-precision
> vectors of internal storage for each three- or four-component
> double-precision vector."
> 
> This commit adds a flag that allows drivers to specify whether dvec3, dvec4,
> dmat2x3, dmat2x4, dmat3, dmat3x4, dmat4x3 and dmat4 count as consuming
> twice as many attributes as equivalent single-precision types (default
> value being false).
> ---
>  src/compiler/glsl/linker.cpp | 72 +++++++++++++++++++++++++++++++-------------
>  src/mesa/main/context.c      |  2 ++
>  src/mesa/main/mtypes.h       | 13 ++++++++
>  3 files changed, 66 insertions(+), 21 deletions(-)
> 
> diff --git a/src/compiler/glsl/linker.cpp b/src/compiler/glsl/linker.cpp
> index 0268b74..ffec007 100644
> --- a/src/compiler/glsl/linker.cpp
> +++ b/src/compiler/glsl/linker.cpp
> @@ -2434,6 +2434,37 @@ resize_tes_inputs(struct gl_context *ctx,
>  }
>  
>  /**
> + * From the GL 4.5 core spec, section 11.1.1 (Vertex Attributes):
> + *
> + * "A program with more than the value of MAX_VERTEX_ATTRIBS
> + *  active attribute variables may fail to link, unless
> + *  device-dependent optimizations are able to make the program
> + *  fit within available hardware resources. For the purposes
> + *  of this test, attribute variables of the type dvec3, dvec4,
> + *  dmat2x3, dmat2x4, dmat3, dmat3x4, dmat4x3, and dmat4 may
> + *  count as consuming twice as many attributes as equivalent
> + *  single-precision types. While these types use the same number
> + *  of generic attributes as their single-precision equivalents,
> + *  implementations are permitted to consume two single-precision
> + *  vectors of internal storage for each three- or four-component
> + *  double-precision vector."
> + *
> + * Returns true if three- or four-component double-precision vector consumes
> + * two single-precision vectors of internal storage
> + */
> +
> +static inline bool
> +attribute_consumes_two_locations(struct gl_constants *constants,
> +                                 ir_variable *var)
> +{
> +   if (var->type->without_array()->is_dual_slot_double() &&
> +       constants->FP64Vector34Consumes2Locations)
> +      return true;
> +   else
> +      return false;
> +}
> +
> +/**
>   * Find a contiguous set of available bits in a bitmask.
>   *
>   * \param used_mask     Bits representing used (1) and unused (0) locations
> @@ -2725,27 +2756,7 @@ assign_attribute_or_color_locations(gl_shader_program *prog,
>  
>  	    used_locations |= (use_mask << attr);
>  
> -            /* From the GL 4.5 core spec, section 11.1.1 (Vertex Attributes):
> -             *
> -             * "A program with more than the value of MAX_VERTEX_ATTRIBS
> -             *  active attribute variables may fail to link, unless
> -             *  device-dependent optimizations are able to make the program
> -             *  fit within available hardware resources. For the purposes
> -             *  of this test, attribute variables of the type dvec3, dvec4,
> -             *  dmat2x3, dmat2x4, dmat3, dmat3x4, dmat4x3, and dmat4 may
> -             *  count as consuming twice as many attributes as equivalent
> -             *  single-precision types. While these types use the same number
> -             *  of generic attributes as their single-precision equivalents,
> -             *  implementations are permitted to consume two single-precision
> -             *  vectors of internal storage for each three- or four-component
> -             *  double-precision vector."
> -             *
> -             * Mark this attribute slot as taking up twice as much space
> -             * so we can count it properly against limits.  According to
> -             * issue (3) of the GL_ARB_vertex_attrib_64bit behavior, this
> -             * is optional behavior, but it seems preferable.
> -             */
> -            if (var->type->without_array()->is_dual_slot_double())
> +            if (attribute_consumes_two_locations(constants, var))
>                 double_storage_locations |= (use_mask << attr);
>  	 }
>  
> @@ -2818,6 +2829,25 @@ assign_attribute_or_color_locations(gl_shader_program *prog,
>        to_assign[i].var->data.location = generic_base + location;
>        to_assign[i].var->data.is_unmatched_generic_inout = 0;
>        used_locations |= (use_mask << location);
> +
> +      if (attribute_consumes_two_locations(constants, to_assign[i].var))
> +         double_storage_locations |= (use_mask << location);
> +   }
> +
> +   /* Now that we have all the locations, take into account that dvec3/4 can
> +    * require twice the space of single-precision vectors. Check if we run out
> +    * of attribute slots.
> +    */
> +   if (target_index == MESA_SHADER_VERTEX) {
> +      unsigned total_attribs_size =
> +         _mesa_bitcount(used_locations & ((1 << max_index) - 1)) +
> +         _mesa_bitcount(double_storage_locations);
> +      if (total_attribs_size > max_index) {
> +	 linker_error(prog,
> +		      "attempt to use %d vertex attribute slots only %d available ",
> +		      total_attribs_size, max_index);
> +	 return false;

I'm a bit confused - it looks like we already do this check slightly
earlier in the function.  Why do we need to do it again (or later?)?

I also agree with Dave - it looks like Gallium drivers are already
double counting everything.  And i965 wants to double count things.
So...there's probably not a ton of point in adding a flag.
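
For reference, the arithmetic in the quoted check can be sketched on its
own.  This is only an illustration under stated assumptions: bitcount()
is a stand-in for _mesa_bitcount(), and count_attrib_slots() is a
hypothetical helper that does not exist in Mesa.  It mirrors the quoted
hunk, so the dual-slot mask is not clamped to max_index.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for _mesa_bitcount(): number of set bits in v.
static unsigned
bitcount(uint32_t v)
{
   unsigned n = 0;
   for (; v != 0; v &= v - 1)   // clear the lowest set bit each iteration
      n++;
   return n;
}

// Hypothetical helper (not in Mesa): slots counted against the limit.
// Each used location below max_index costs one slot; each location that
// is also marked in double_storage_locations costs one more.
static unsigned
count_attrib_slots(uint32_t used_locations,
                   uint32_t double_storage_locations,
                   unsigned max_index)
{
   assert(max_index > 0 && max_index <= 32);
   // Avoid shifting by the full width when max_index is 32.
   const uint32_t limit_mask =
      (max_index == 32) ? ~0u : ((1u << max_index) - 1);
   return bitcount(used_locations & limit_mask) +
          bitcount(double_storage_locations);
}
```

With 16 available slots, 16 used locations of which one is a dvec4
count as 17, which is what would trigger the linker error in the hunk.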