[Mesa-dev] [PATCH 2/2] glsl: Fix counting of varyings.
Jose Fonseca
jfonseca at vmware.com
Tue Jun 23 07:36:25 PDT 2015
On 22/06/15 17:14, Ian Romanick wrote:
> On 06/19/2015 06:08 AM, Jose Fonseca wrote:
>> When input and output varyings started to be counted separately (commit
>> 42305fb5) the is_varying_var function wasn't updated to return true for
>> output varyings or input varyings for stages other than the fragment
>> shader), effectively making the varying limit to never be checked.
>
> Without SSO, counting the varying inputs used by, say, the fragment
> shader, should be sufficient. With SSO, it's more difficult.
>
>> With this change, color, texture coord, and generic varyings are not
>> counted, but others are ignored. It is assumed the hardware will handle
>> special varyings internally (ie, optimistic rather than pessimistic), to
>> avoid causing regressions where things were working somehow.
>>
>> This fixes `glsl-max-varyings --exceed-limits` with softpipe/llvmpipe,
>> which was asserting because we were getting varyings beyond
>> VARYING_SLOT_MAX in st_glsl_to_tgsi.cpp.
>>
>> It also prevents the assertion failure with
>> https://bugs.freedesktop.org/show_bug.cgi?id=90539 but the tests still
>> fails due to the link error.
>>
>> This change also adds a few assertions to catch this sort of errors
>> earlier, and potentially prevent buffer overflows in the future (no
>> buffer overflow was detected here though).
>>
>> However, this change causes several tests to regress:
>>
>> spec/glsl-1.10/execution/varying-packing/simple ivec3 array
>> spec/glsl-1.10/execution/varying-packing/simple ivec3 separate
>> spec/glsl-1.10/execution/varying-packing/simple uvec3 array
>> spec/glsl-1.10/execution/varying-packing/simple uvec3 separate
>
> Wait... so the ivec3 and uvec3 tests fail, but the vec3 test passes?
Correct. This is partial diff from vec3 vs ivec3's GLSL:
: GLSL source for vertex shader 1:
-: #version 110
-varying vec3 var000[42];
-varying float var001;
-varying float var002;
+: #version 130
+flat out ivec3 var000[42];
+out float var001;
+out float var002;
uniform int i;
And it looks like the varying packer refuses the pack together variables
of different types.
Not sure this is a bug in the test, or a limitation in the varying
packing pass. Either way, it's a bug that was being hidden and needs to
be addressed.
>> spec/arb_gpu_shader_fp64/varying-packing/simple dmat3 array
>> spec/glsl-1.50/execution/geometry/max-input-components
>> spec/glsl-1.50/execution/variable-indexing/gs-input-array-vec4-index-rd
>> spec/glsl-1.50/execution/variable-indexing/vs-output-array-vec4-index-wr-before-gs
>>
>> But this all seem to be issues either in the way we count varyings
>> (e.g., geometry inputs get counted multiple times) or in the tests
>> themselves, or limitations in the varying packer, and deserve attention
>> on their own right.
>
> Do you have a feeling for which tests are which sorts of problems?
Only a rough idea:
- The "varying-packing/simple" failures look all similar in nature to
what I described above, ie., int, uint, or doubles not being packed with
floats.
- the geometry related ones are because the code to count GS varyings
over-estimates the varyings (it counts the varyings for the whole
primitive, not just a single vertex)
but I workaround this for now in my change, by returning 0 for GS
(ie, no change for GS).
>
> I'd like to run this through GLES3 conformance before it gets pushed.
> I'm not too worried about the geometry shader issues, but the ivec /
> uvec tests seem more problematic.
Sure.
>
>> ---
>> src/glsl/link_varyings.cpp | 70 ++++++++++++++++++++++++------
>> src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 2 +
>> 2 files changed, 58 insertions(+), 14 deletions(-)
>>
>> diff --git a/src/glsl/link_varyings.cpp b/src/glsl/link_varyings.cpp
>> index 278a778..7649720 100644
>> --- a/src/glsl/link_varyings.cpp
>> +++ b/src/glsl/link_varyings.cpp
>> @@ -190,6 +190,8 @@ cross_validate_outputs_to_inputs(struct gl_shader_program *prog,
>> */
>> const unsigned idx = var->data.location - VARYING_SLOT_VAR0;
>>
>> + assert(idx < MAX_VARYING);
>> +
>> if (explicit_locations[idx] != NULL) {
>> linker_error(prog,
>> "%s shader has multiple outputs explicitly "
>> @@ -1031,25 +1033,63 @@ varying_matches::match_comparator(const void *x_generic, const void *y_generic)
>> /**
>> * Is the given variable a varying variable to be counted against the
>> * limit in ctx->Const.MaxVarying?
>> - * This includes variables such as texcoords, colors and generic
>> - * varyings, but excludes variables such as gl_FrontFacing and gl_FragCoord.
>> + *
>> + * OpenGL specification states:
>
> Please use the canonical format.
>
> * Section A.B (Foo Bar) of the OpenGL X.Y Whichever Profile spec
> * says:
>
> That enables later readers to more easily find the text in the spec.
> Also, the language changes from time to time.
>
>> + *
>> + * Each output variable component used as either a vertex shader output or
>> + * fragment shader input counts against this limit, except for the components
>> + * of gl_Position. A program containing only a vertex and fragment shader
>
> This bit about gl_Position is tricky... I believe this language has
> changed more than once in the spec. It's also the reason the varying
> limit has changed from 64 components to 60 components. I don't think
> that affects this patch... it's just a thing I thought was worth
> pointing out.
>
>> + * that accesses more than this limit's worth of components of outputs may
>> + * fail to link, unless device-dependent optimizations are able to make the
>> + * program fit within available hardware resources.
>> + *
>> */
>> static bool
>> var_counts_against_varying_limit(gl_shader_stage stage, const ir_variable *var)
>> {
>> - /* Only fragment shaders will take a varying variable as an input */
>> - if (stage == MESA_SHADER_FRAGMENT &&
>> - var->data.mode == ir_var_shader_in) {
>> - switch (var->data.location) {
>> - case VARYING_SLOT_POS:
>> - case VARYING_SLOT_FACE:
>> - case VARYING_SLOT_PNTC:
>> - return false;
>> - default:
>> - return true;
>> - }
>> + assert(var->data.mode == ir_var_shader_in || var->data.mode == ir_var_shader_out);
>> +
>> + /* FIXME: It looks like we're currently counting each input multiple times
>> + * on geometry shaders. See piglit
>> + * spec/glsl-1.50/execution/geometry/max-input-components */
>
> The closing */ of a multi-line comment should go on its own line.
>
>> + if (stage == MESA_SHADER_GEOMETRY) {
>> + return false;
>> + }
>> +
>> + switch (var->data.location) {
>> + case VARYING_SLOT_POS:
>> + return false;
>> + case VARYING_SLOT_COL0:
>> + case VARYING_SLOT_COL1:
>> + case VARYING_SLOT_FOGC:
>> + case VARYING_SLOT_TEX0:
>> + case VARYING_SLOT_TEX1:
>> + case VARYING_SLOT_TEX2:
>> + case VARYING_SLOT_TEX3:
>> + case VARYING_SLOT_TEX4:
>> + case VARYING_SLOT_TEX5:
>> + case VARYING_SLOT_TEX6:
>> + case VARYING_SLOT_TEX7:
>> + return true;
>> + case VARYING_SLOT_PSIZ:
>> + case VARYING_SLOT_BFC0:
>> + case VARYING_SLOT_BFC1:
>> + case VARYING_SLOT_EDGE:
>> + case VARYING_SLOT_CLIP_VERTEX:
>> + case VARYING_SLOT_CLIP_DIST0:
>> + case VARYING_SLOT_CLIP_DIST1:
>> + case VARYING_SLOT_PRIMITIVE_ID:
>> + case VARYING_SLOT_LAYER:
>> + case VARYING_SLOT_VIEWPORT:
>> + case VARYING_SLOT_FACE:
>> + case VARYING_SLOT_PNTC:
>
> I'm really unsure about this list. There are a couple of these that I
> would swear are supposed to be counted... and I think some of them
> depend on the stage. For example, the GL_ARB_fragment_layer_viewport
> spec says:
>
> "If a fragment shader contains a static access to gl_ViewportIndex,
> it will count against the implementation defined limit for the
> maximum number of inputs to the fragment stage."
I agree -- this needs further investigation -- hence the "XXX" I just
didn't want to go from one extreme to the other on this change, with
fear of causing apps that were working fine to wrongly fail.
Also, the spec snippet I quoted says, "unless device-dependent
optimizations are able to make the program fit within available hardware
resources".
So, IIUC, any implementation is free to let shaders go through as long
the hardware can cope with it.
What we could do that is do a linker_warning when the varyings can be
handled by Mesa drivers, but could cause other implementations to err.
>> + /* XXX: This is hardware dependent, but we assume that all drivers have
>> + * dedicated slots to dealwith this */
>
> Same comment about */ placement. Also s/dealwith/deal with/.
>
>> + return false;
>> + default:
>> + assert(var->data.location >= VARYING_SLOT_VAR0);
>> + return true;
>> }
>> - return false;
>> }
>>
>>
>> @@ -1163,6 +1203,7 @@ populate_consumer_input_sets(void *mem_ctx, exec_list *ir,
>> * consumer_inputs_with_locations[var->data.location], not any
>> * following entries for the array/structure.
>> */
>> + assert(input_var->data.location < VARYING_SLOT_MAX);
>> consumer_inputs_with_locations[input_var->data.location] =
>> input_var;
>> } else if (input_var->get_interface_type() != NULL) {
>> @@ -1198,6 +1239,7 @@ get_matching_input(void *mem_ctx,
>> ir_variable *input_var;
>>
>> if (output_var->data.explicit_location) {
>> + assert(output_var->data.location < VARYING_SLOT_MAX);
>> input_var = consumer_inputs_with_locations[output_var->data.location];
>> } else if (output_var->get_interface_type() != NULL) {
>> char *const iface_field_name =
>> diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
>> index 03834b6..73758af 100644
>> --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
>> +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
>> @@ -2268,6 +2268,7 @@ glsl_to_tgsi_visitor::visit(ir_dereference_variable *ir)
>> * user-defined varyings.
>> */
>> assert(var->data.location != -1);
>> + assert(var->data.location < VARYING_SLOT_MAX);
>>
>> if (is_inout_array(shader->Stage, var, &is_2d)) {
>> struct array_decl *decl = &input_arrays[num_input_arrays];
>> @@ -2294,6 +2295,7 @@ glsl_to_tgsi_visitor::visit(ir_dereference_variable *ir)
>> break;
>> case ir_var_shader_out:
>> assert(var->data.location != -1);
>> + assert(var->data.location < VARYING_SLOT_MAX);
>>
>> if (is_inout_array(shader->Stage, var, &is_2d)) {
>> struct array_decl *decl = &output_arrays[num_output_arrays];
>>
>
Thanks for the review. I'll make the updates and push to a private branch.
I'm afraid I don't have the time or the expertise to dive into these
GLSL issues. But I believe it is important both to give early notice to
apps and protect drivers when hardware limits are exceeded, so I think
this patch is a step on the right direction. Even if it raises more
questions than answers.
For the record, my (practical) goal with this, is just get to a point
where llvmpipe doesn't crash or assert with piglit tests.
Jose
More information about the mesa-dev
mailing list