[Mesa-dev] SandyBridge not handling GL_TRIANGLE_STRIP_ADJACENCY with repeating vertex indices correctly

Iago Toral Quiroga itoral at igalia.com
Thu Aug 14 05:28:31 PDT 2014


On Tue, 2014-07-29 at 10:12 +0200, Iago Toral Quiroga wrote:
> Hi,
> 
> running the piglit tests on my implementation of geometry shaders for
> Sandy Bridge produces a GPU hang for the following test:
> 
> ./glsl-1.50-geometry-primitive-id-restart GL_TRIANGLE_STRIP_ADJACENCY ffs
> 
> That test checks primitive restarts but the hang seems to be unrelated
> to that, since it happens also when primitive restart is not enabled.
> The problem, which only affects GL_TRIANGLE_STRIP_ADJACENCY and no other
> primitive type (with or without adjacency), is in this loop that the
> test uses to set up the indices for the vertices:
> 
> elements = glMapBuffer(GL_ELEMENT_ARRAY_BUFFER, GL_READ_WRITE);
> num_elements = 0;
> for (i = 1; i <= LONGEST_INPUT_SEQUENCE; i++) {
>    for (j = 0; j < i; j++) {
>       /* Every element that isn't the primitive
>        * restart index can just be element 0, since
>        * we don't care about the actual vertex data.
>        */
>       elements[num_elements++] = 0;
>    }
>    elements[num_elements++] = prim_restart_index;
> }
> glUnmapBuffer(GL_ELEMENT_ARRAY_BUFFER);
> 
> Setting all elements to the same index (0 in this case) is the one thing
> that causes the hang for GL_TRIANGLE_STRIP_ADJACENCY. A simple change
> like this removes the hang:
> -      elements[num_elements++] = 0;
> +      elements[num_elements++] = j != prim_restart_index ? j : j + 1;
> 
> Skimming through the docs I have not seen any references to this being a
> known problem. In fact, I don't see any references to
> GL_TRIANGLE_STRIP_ADJACENCY being special in any way and it seems that
> this is not a problem in IvyBridge, since the test runs correctly there.
> 
> Does this sound like a hardware bug specific to SandyBridge's handling
> of GL_TRIANGLE_STRIP_ADJACENCY or is there something else I should check
> before arriving at that conclusion?
> 
> If it is a hardware bug, I guess we want a workaround for it, at least
> to prevent the hang. I am not sure what the best option would be here.
> I think the only option for the driver would be to scan the list of
> indices provided when this primitive type is used and, when we hit this
> scenario (I'd have to test how many repeating indices we need for it to
> hang), error out and not execute the drawing command. Any other
> suggestions?

This is what I found so far:

1. the problem is specific to glDrawElements. glDrawArrays works well
even if all the vertices used have the same coordinates. To me this
suggests that the problem should not be in our implementation of GS,
since using glDrawArrays or glDrawElements is handled elsewhere and
should be transparent to the implementation of the GS stage.

2. The problem does not happen in all situations, only when we repeat
values in the indices we use with glDrawElements. In particular, I found
that the pattern that leads to the hang seems to be:
  - There are only 8 indices and all of them are the same.
  - There are more than 8 indices and there is at least one subset of 9
consecutive indices where at least 8 indices are the same (they do not
need to be consecutive within the group of 9).

3. The problem is specific to GL_TRIANGLE_STRIP_ADJACENCY. It does not
hang for any other primitive. In fact, other primitives work well and
produce the expected results. I have not seen specific requirements for
this primitive type in the docs that could justify something like this.
Even GL_TRIANGLE_STRIP_ADJACENCY seems to work well except when there
are repeating vertices with that specific pattern in glDrawElements.

4. The problem seems to be independent of the code we generate in the GS
stage, although this should not be surprising considering 1).
Particularly, the hang persists even in the case of an empty main()
function in the geometry shader (where we generate trivial code that of
course works for any other primitive type).
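
To make the pattern from point 2 concrete, here is a minimal sketch of a
check for it. It is not driver code: the function and macro names are
hypothetical, the thresholds (8 repeats, windows of 9) are simply the
values observed above, and I am assuming that "only 8 indices" means
exactly 8 and that the indices are compared as raw values, ignoring
primitive restart.

```c
#include <stdbool.h>
#include <stddef.h>

/* Thresholds taken from the observations in point 2 (empirical, not
 * from the hardware docs). */
#define HANG_WINDOW  9
#define HANG_REPEATS 8

/* Return true if some value occurs at least min_repeats times in
 * win[0..n). Quadratic, but n is at most 9 here. */
static bool
window_has_repeats(const unsigned *win, size_t n, size_t min_repeats)
{
   for (size_t i = 0; i < n; i++) {
      size_t same = 0;
      for (size_t j = 0; j < n; j++) {
         if (win[j] == win[i])
            same++;
      }
      if (same >= min_repeats)
         return true;
   }
   return false;
}

/* Hypothetical helper: does this index list match the pattern that
 * was seen to hang with GL_TRIANGLE_STRIP_ADJACENCY? */
bool
matches_hang_pattern(const unsigned *indices, size_t count)
{
   /* Case 1: exactly 8 indices, all of them the same. */
   if (count == HANG_REPEATS)
      return window_has_repeats(indices, count, HANG_REPEATS);

   /* Case 2: more than 8 indices, and some run of 9 consecutive
    * indices contains at least 8 equal values. */
   for (size_t start = 0; start + HANG_WINDOW <= count; start++) {
      if (window_has_repeats(indices + start, HANG_WINDOW, HANG_REPEATS))
         return true;
   }
   return false;
}
```

A check along these lines could be run over the element buffer contents
before issuing the draw, either in a test or as the basis for a
workaround, but whether these thresholds are the complete condition is
still an open question.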

Based on this my conclusion is that this is very likely a hardware
issue. That, or some very obscure problem in the implementation of the
index buffer in gen6 that I have not seen and that only affects
GL_TRIANGLE_STRIP_ADJACENCY for some reason.

At this point I'd like to hear suggestions for things we could try next
to confirm whether this is a hardware problem or a driver problem, or,
if we agree that this is enough evidence that this must be a hardware
problem, how we can limit its impact, starting, probably, by rewriting
the piglit test so that we don't alter its purpose but avoid the hang on
gen6. We should also discuss if there is a way to work around this
problem so that at least developers running into it (as unlikely as that
may be) don't hang their systems.

I am going on holiday starting tomorrow and will have limited Internet
access for the most part, but Samuel (in the CC) will be available next
week to try any suggestions you may have.

Iago


