[Mesa-dev] [PATCH v2 11/15] glsl/linker: dvec3/dvec4 may consume twice input vertex attributes
Antia Puentes
apuentes at igalia.com
Thu May 12 18:28:17 UTC 2016
From: "Juan A. Suarez Romero" <jasuarez at igalia.com>
From the GL 4.5 core spec, section 11.1.1 (Vertex Attributes):
"A program with more than the value of MAX_VERTEX_ATTRIBS
active attribute variables may fail to link, unless
device-dependent optimizations are able to make the program
fit within available hardware resources. For the purposes
of this test, attribute variables of the type dvec3, dvec4,
dmat2x3, dmat2x4, dmat3, dmat3x4, dmat4x3, and dmat4 may
count as consuming twice as many attributes as equivalent
single-precision types. While these types use the same number
of generic attributes as their single-precision equivalents,
implementations are permitted to consume two single-precision
vectors of internal storage for each three- or four-component
double-precision vector."
This commit adds a flag that lets a driver specify whether dvec3, dvec4,
dmat2x3, dmat2x4, dmat3, dmat3x4, dmat4x3 and dmat4 count as consuming
twice as many attributes as the equivalent single-precision types. The
flag defaults to false.
---
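Note for reviewers: the limit check added in linker.cpp can be sketched
outside of Mesa as below. This is only an illustration, not the code in
the patch: count_bits() and total_attrib_slots() are hypothetical names,
count_bits() stands in for the linker's bit-counting helper, and the
driver flag is passed as a plain bool instead of being read from
gl_constants.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the linker's bit-counting helper. */
static unsigned
count_bits(uint32_t v)
{
   unsigned n = 0;
   while (v) {
      v &= v - 1;   /* clear the lowest set bit */
      n++;
   }
   return n;
}

/*
 * Number of attribute slots charged against the limit.
 *
 * used_locations:           one bit per generic attribute location in use
 * double_storage_locations: subset of those bits held by dvec3/dvec4-like
 *                           types
 * max_index:                the attribute limit (assumed to be <= 32 here)
 * fp64_consumes_two:        models the driver flag; when false, doubles are
 *                           charged like single-precision types
 */
static unsigned
total_attrib_slots(uint32_t used_locations,
                   uint32_t double_storage_locations,
                   unsigned max_index,
                   bool fp64_consumes_two)
{
   assert(max_index <= 32);

   /* Avoid shifting by the full width of the type when max_index == 32. */
   const uint32_t limit_mask =
      (max_index == 32) ? UINT32_MAX : ((UINT32_C(1) << max_index) - 1);

   unsigned total = count_bits(used_locations & limit_mask);
   if (fp64_consumes_two)
      total += count_bits(double_storage_locations);

   return total;
}
```

With a limit of 16 and nine dvec4 attributes at locations 0..8 (mask
0x1FF for both arguments), the sketch charges 18 slots when the flag is
set, so such a program would be rejected, and 9 slots when it is not.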
src/compiler/glsl/linker.cpp | 72 +++++++++++++++++++++++++++++++-------------
src/mesa/main/context.c | 2 ++
src/mesa/main/mtypes.h | 13 ++++++++
3 files changed, 66 insertions(+), 21 deletions(-)
diff --git a/src/compiler/glsl/linker.cpp b/src/compiler/glsl/linker.cpp
index 0268b74..ffec007 100644
--- a/src/compiler/glsl/linker.cpp
+++ b/src/compiler/glsl/linker.cpp
@@ -2434,6 +2434,37 @@ resize_tes_inputs(struct gl_context *ctx,
}
/**
+ * From the GL 4.5 core spec, section 11.1.1 (Vertex Attributes):
+ *
+ * "A program with more than the value of MAX_VERTEX_ATTRIBS
+ * active attribute variables may fail to link, unless
+ * device-dependent optimizations are able to make the program
+ * fit within available hardware resources. For the purposes
+ * of this test, attribute variables of the type dvec3, dvec4,
+ * dmat2x3, dmat2x4, dmat3, dmat3x4, dmat4x3, and dmat4 may
+ * count as consuming twice as many attributes as equivalent
+ * single-precision types. While these types use the same number
+ * of generic attributes as their single-precision equivalents,
+ * implementations are permitted to consume two single-precision
+ * vectors of internal storage for each three- or four-component
+ * double-precision vector."
+ *
+ * Returns true if a three- or four-component double-precision vector consumes
+ * two single-precision vectors of internal storage.
+ */
+
+static inline bool
+attribute_consumes_two_locations(struct gl_constants *constants,
+ ir_variable *var)
+{
+ if (var->type->without_array()->is_dual_slot_double() &&
+ constants->FP64Vector34Consumes2Locations)
+ return true;
+ else
+ return false;
+}
+
+/**
* Find a contiguous set of available bits in a bitmask.
*
* \param used_mask Bits representing used (1) and unused (0) locations
@@ -2725,27 +2756,7 @@ assign_attribute_or_color_locations(gl_shader_program *prog,
used_locations |= (use_mask << attr);
- /* From the GL 4.5 core spec, section 11.1.1 (Vertex Attributes):
- *
- * "A program with more than the value of MAX_VERTEX_ATTRIBS
- * active attribute variables may fail to link, unless
- * device-dependent optimizations are able to make the program
- * fit within available hardware resources. For the purposes
- * of this test, attribute variables of the type dvec3, dvec4,
- * dmat2x3, dmat2x4, dmat3, dmat3x4, dmat4x3, and dmat4 may
- * count as consuming twice as many attributes as equivalent
- * single-precision types. While these types use the same number
- * of generic attributes as their single-precision equivalents,
- * implementations are permitted to consume two single-precision
- * vectors of internal storage for each three- or four-component
- * double-precision vector."
- *
- * Mark this attribute slot as taking up twice as much space
- * so we can count it properly against limits. According to
- * issue (3) of the GL_ARB_vertex_attrib_64bit behavior, this
- * is optional behavior, but it seems preferable.
- */
- if (var->type->without_array()->is_dual_slot_double())
+ if (attribute_consumes_two_locations(constants, var))
double_storage_locations |= (use_mask << attr);
}
@@ -2818,6 +2829,25 @@ assign_attribute_or_color_locations(gl_shader_program *prog,
to_assign[i].var->data.location = generic_base + location;
to_assign[i].var->data.is_unmatched_generic_inout = 0;
used_locations |= (use_mask << location);
+
+ if (attribute_consumes_two_locations(constants, to_assign[i].var))
+ double_storage_locations |= (use_mask << location);
+ }
+
+ /* Now that we have all the locations, take into account that dvec3/4 can
+ * require twice the space of single-precision vectors. Check whether we
+ * have run out of attribute slots.
+ */
+ if (target_index == MESA_SHADER_VERTEX) {
+ unsigned total_attribs_size =
+ _mesa_bitcount(used_locations & ((1 << max_index) - 1)) +
+ _mesa_bitcount(double_storage_locations);
+ if (total_attribs_size > max_index) {
+ linker_error(prog,
+ "attempt to use %u vertex attribute slots, but only %u are available",
+ total_attribs_size, max_index);
+ return false;
+ }
}
return true;
diff --git a/src/mesa/main/context.c b/src/mesa/main/context.c
index 6af02d1..eccbf24 100644
--- a/src/mesa/main/context.c
+++ b/src/mesa/main/context.c
@@ -620,6 +620,8 @@ _mesa_init_constants(struct gl_constants *consts, gl_api api)
*/
consts->VertexID_is_zero_based = false;
+ consts->FP64Vector34Consumes2Locations = false;
+
/* GL_ARB_draw_buffers */
consts->MaxDrawBuffers = MAX_DRAW_BUFFERS;
diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index d8adf5c..9d1503c 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -3504,6 +3504,19 @@ struct gl_constants
bool VertexID_is_zero_based;
/**
+ * GL_ARB_vertex_attrib_64bit:
+ *
+ * "... attribute variables of the type dvec3, dvec4, dmat2x3, dmat2x4,
+ * dmat3, dmat3x4, dmat4x3, and dmat4 may count as consuming twice as many
+ * attributes as equivalent single-precision types. While these types use
+ * the same number of generic attributes as their single-precision
+ * equivalents, implementations are permitted to consume two
+ * single-precision vectors of internal storage for each three- or
+ * four-component double-precision vector."
+ */
+ bool FP64Vector34Consumes2Locations;
+
+ /**
* If the driver supports real 32-bit integers, what integer value should be
* used for boolean true in uniform uploads? (Usually 1 or ~0.)
*/
--
2.7.4