[Mesa-dev] [PATCH] r600g: Fix UMAD on Cayman
Vadim Girlin
vadimgirlin at gmail.com
Wed Apr 10 14:42:18 PDT 2013
On 04/10/2013 01:53 PM, Marek Olšák wrote:
> glsl-fs-loop-nested passes here.
>
> nstack is 3 and adding 4 to it doesn't help.
Ok, thanks.
Also I wrote a simple test app that should reproduce the issue if it's
really related to diverging control flow with nested loops, and might
give more information about what's going wrong.
The source is in the attachment and needs to be compiled with -lGL
-lglut -lGLEW. The app renders four points and computes some value for
each point in the loops similar to the transform feedback order test,
but it doesn't use tfb. It should render four green or red squares
depending on correctness of the result.
Here is the correct output produced for me on evergreen:
thread 0 (0, 0): expected = 16608, observed = 16608, OK
thread 1 (1, 0): expected = 27873, observed = 27873, OK
thread 2 (0, 1): expected = 16608, observed = 16608, OK
thread 3 (1, 1): expected = 27877, observed = 27877, OK
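
The expected values come from running the same nested loops on the CPU.
A minimal standalone sketch of that computation, mirroring compute_ref()
in the attached source (the names reference_value and
print_reference_table are only for illustration):

```c
#include <stdio.h>

/* CPU version of the vertex shader's nested loops, mirroring
 * compute_ref() from the attachment. Each step of the control flow
 * either sets a flag bit or bumps a small counter in b. */
int reference_value(int x, int y)
{
    int b = 32;               /* shader started */
    int i = 0, j = 0;

    for (;;) {
        b |= 64;              /* entered outer loop body */
        if (i >= x) {
            b |= 128;         /* took outer break */
            break;
        }
        b += 1;               /* counts outer iterations */
        for (;;) {
            b |= 1024;        /* entered inner loop body */
            if (j >= y) {
                b |= 2048;    /* took inner break */
                break;
            }
            b += 4;           /* counts inner iterations */
            ++j;
        }
        b |= 8192;            /* left inner loop */
        ++i;
    }
    b |= 16384;               /* left outer loop */
    return b;
}

/* Print the reference value for every point of the 2x2 grid. */
void print_reference_table(void)
{
    int x, y;
    for (y = 0; y < 2; ++y)
        for (x = 0; x < 2; ++x)
            printf("(%d, %d): expected = %d\n", x, y, reference_value(x, y));
}
```

Calling print_reference_table() should list the same four expected values
as above, in the same thread order.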
Please post the output if it fails on cayman.
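
If an observed value differs, its bits indicate which path the thread took
through the loops. The following is a rough sketch of a decoder, assuming
the counters stay small enough not to spill into the flag bits (true for
the 2x2 grid used here); cf_trace and decode_result are hypothetical
names, not part of the attachment:

```c
/* Decoded view of the value written by the test shader. */
struct cf_trace {
    int outer_iters;  /* number of "b += 1" steps (bits 0-1) */
    int inner_iters;  /* number of "b += 4" steps (bits 2-4) */
    int flags;        /* remaining flag bits: 32, 64, 128, 1024, ... */
};

/* Split a result into its counters and flag bits.
 * Assumption: at most 3 outer and 7 inner iterations, so the counters
 * never carry into bit 5 (the "shader started" flag, value 32). */
struct cf_trace decode_result(int b)
{
    struct cf_trace t;
    t.outer_iters = b & 3;
    t.inner_iters = (b >> 2) & 7;
    t.flags = b & ~31;
    return t;
}
```

For example, a thread that should never enter the outer loop body past the
first check (x = 0) is expected to decode to zero for both counters, so a
non-zero counter there would point at the thread staying active after the
break.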
Vadim
>
> Marek
>
>
> On Wed, Apr 10, 2013 at 8:46 AM, Vadim Girlin <vadimgirlin at gmail.com> wrote:
>
>> On 04/10/2013 03:58 AM, Marek Olšák wrote:
>>
>>> Hi Vadim,
>>>
>>> your patch does not fix the test.
>>>
>>
>> Hmm, I'm out of ideas then. Thanks for testing.
>>
>> I've checked the shader dump a few times but I don't see anything obviously
>> wrong there, and the same code (except the minor ALU grouping changes due
>> to the VLIW4/VLIW5 difference) works fine for me on evergreen.
>>
>> According to Martin's observations, it looks as if the threads that
>> shouldn't execute the loop body were incorrectly left in the active state.
>> LOOP_BREAK should put them into the inactive-break state, but something
>> goes wrong. Do the other piglit tests with nested loops (e.g.
>> glsl-fs-loop-nested) work on cayman? Though possibly there are no other
>> tests with the diverging loops as in this case.
>>
>> I'll try to write a simpler test with the diverging loops to see if the
>> issue is really caused by the incorrect control flow handling, and to
>> figure out the exact instruction that results in the incorrect active state.
>>
>> Also it's probably worth checking if the stack size is correct for that
>> shader (latest mesa should print the nstack value in the shader disassembly
>> header, I think it should be 3 for that shader) and maybe try adding some
>> constant, e.g. 4 to the bc->nstack in the r600_bytecode_build just to be
>> sure that we reserve enough of stack space, though I don't think stack size
>> is the cause of this issue.
>>
>> Vadim
>>
>>
>>
>>> Marek
>>>
>>>
>>> On Tue, Apr 9, 2013 at 11:30 PM, Vadim Girlin <vadimgirlin at gmail.com>
>>> wrote:
>>>
>>> On 04/09/2013 10:58 AM, Martin Andersson wrote:
>>>>
>>>> On Tue, Apr 9, 2013 at 3:18 AM, Marek Olšák <maraeo at gmail.com> wrote:
>>>>>
>>>>>> Pushed, thanks. The transform feedback test still doesn't pass, but at
>>>>>> least the hardlocks are gone.
>>>>>>
>>>>>>
>>>>> Thanks, I have looked into the other issue as well
>>>>> http://lists.freedesktop.org/archives/mesa-dev/2013-March/036941.html
>>>>>
>>>>>
>>>>> The problem arises when there are nested loops. If I rework the code so
>>>>> there are no nested loops, the issue disappears. At least one pixel also
>>>>> needs to enter the outer loop. The pixels that should enter the outer
>>>>> loop behave correctly. It is the pixels that should not enter the outer
>>>>> loop that misbehave. It does not matter if they also fail the test for
>>>>> the inner loop; they will still execute the instructions inside. That
>>>>> leads to the strange results for that test.
>>>>>
>>>>>
>>>> Please test the attached patch.
>>>>
>>>> Vadim
>>>>
>>>>
>>>>> The strangeness is easier to see if NUM_POINTS in
>>>>> ext_transform_feedback/order.c is set to smaller values, like 3, 6 and 9.
>>>>> Disable the code that fails the test and print starting_x,
>>>>> shift_reg_final and iteration_count.
>>>>>
>>>>> Marek, since you implemented transform feedback for r600, do you think
>>>>> the issue is with the transform feedback code, the shader compiler, or
>>>>> something else?
>>>>>
>>>>> //Martin
>>>>> _______________________________________________
>>>>> mesa-dev mailing list
>>>>> mesa-dev at lists.freedesktop.org
>>>>> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>
>>
>
-------------- next part --------------
#include <stdio.h>
#include <stdlib.h>
#include <GL/glew.h>
#include <GL/glut.h>
/* Vertex shader: runs the nested loops and records the control flow in b. */
const char *vss =
    "#version 130\n"
    "in int x, y, ref;"
    "flat out int b, fref;"
    "void main() {"
    "  b = 0;"
    "  int i = 0, j = 0;"
    "  b |= 32;"
    "  while (true) {"
    "    b |= 64;"
    "    if (i >= x) {"
    "      b |= 128;"
    "      break;"
//  "      b |= 256;"
    "    }"
    "    b += 1;"
    "    while (true) {"
    "      b |= 1024;"
    "      if (j >= y) {"
    "        b |= 2048;"
    "        break;"
//  "        b |= 4096;"
    "      }"
    "      b += 4;"
    "      ++j;"
    "    }"
    "    b |= 8192;"
    "    ++i;"
    "  }"
    "  b |= 16384;"
    "  fref = ref;"
    "  gl_Position = vec4(0.2 * x, 0.2 * y, 0.0, 1.0);"
    "}";
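/* Fragment shader: green if b matches the reference, red otherwise;
 * the raw value of b goes to the second (integer) color attachment. */
const char *fss =
    "#version 130\n"
    "flat in int b, fref;"
    "out vec4 color;"
    "out int rref;"
    "void main() {"
    "  bool c = (b == fref);"
    "  color = vec4(c ? 0.0 : 1.0, c ? 1.0 : 0.0, 0.0, 1.0);"
    "  rref = b;"
    "}";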
int wsize = 400;
int num_x = 2;
int num_y = 2;
int num_points;
int num_attr = 3;
int *vb;
GLuint prog;
GLuint fbo;
GLuint rb[2];
/* Print the compile status and abort with the info log on failure. */
void check_shader(GLuint sh) {
    GLint p;
    glGetShaderiv(sh, GL_COMPILE_STATUS, &p);
    printf("shader compilation status: %s\n", p == GL_TRUE ? "OK" : "FAIL");
    if (p == GL_FALSE) {
        char buf[512];
        glGetShaderInfoLog(sh, sizeof(buf), NULL, buf);
        printf("info log:\n%s\n", buf);
        abort();
    }
}
/* CPU reference for the value the vertex shader computes in b. */
int compute_ref(int x, int y) {
    int b = 0;
    int i = 0, j = 0;
    b |= 32;
    while (1) {
        b |= 64;
        if (i >= x) {
            b |= 128;
            break;
        }
        b += 1;
        while (1) {
            b |= 1024;
            if (j >= y) {
                b |= 2048;
                break;
            }
            b += 4;
            ++j;
        }
        b |= 8192;
        ++i;
    }
    b |= 16384;
    return b;
}
void init() {
    GLuint sh;
    int x, y;
    glClearColor(0.0f, 0.0f, 0.0f, 1.0f);
    prog = glCreateProgram();
    printf("creating vs ...\n");
    sh = glCreateShader(GL_VERTEX_SHADER);
    glShaderSource(sh, 1, &vss, NULL);
    glCompileShader(sh);
    check_shader(sh);
    glAttachShader(prog, sh);
    printf("creating fs ...\n");
    sh = glCreateShader(GL_FRAGMENT_SHADER);
    glShaderSource(sh, 1, &fss, NULL);
    glCompileShader(sh);
    check_shader(sh);
    glAttachShader(prog, sh);
    glBindFragDataLocation(prog, 0, "color");
    glBindFragDataLocation(prog, 1, "rref");
    glLinkProgram(prog);
    glUseProgram(prog);

    /* One vertex per grid point, with attributes (x, y, ref). */
    num_points = num_x * num_y;
    vb = malloc(num_attr * num_points * sizeof(GLint));
    if (!vb) {
        fprintf(stderr, "out of memory\n");
        abort();
    }
    for (y = 0; y < num_y; ++y) {
        for (x = 0; x < num_x; ++x) {
            /* Row-major point index: each row holds num_x points. */
            int o = (x + y * num_x) * num_attr;
            int ref = compute_ref(x, y);
            vb[o] = x;
            vb[o + 1] = y;
            vb[o + 2] = ref;
            printf("thread #%d (%d;%d) : ref = %d\n", o / num_attr, x, y, ref);
        }
    }
    GLint x_index = glGetAttribLocation(prog, "x");
    GLint y_index = glGetAttribLocation(prog, "y");
    GLint ref_index = glGetAttribLocation(prog, "ref");
    glVertexAttribIPointer(x_index, 1, GL_INT, num_attr * sizeof(GLint), vb);
    glVertexAttribIPointer(y_index, 1, GL_INT, num_attr * sizeof(GLint), vb + 1);
    glVertexAttribIPointer(ref_index, 1, GL_INT, num_attr * sizeof(GLint), vb + 2);
    glEnableVertexAttribArray(x_index);
    glEnableVertexAttribArray(y_index);
    glEnableVertexAttribArray(ref_index);
    glPointSize(16.0);

    /* FBO with an RGBA8 color buffer and an R32I buffer for the raw value. */
    glGenFramebuffers(1, &fbo);
    glBindFramebuffer(GL_FRAMEBUFFER, fbo);
    glGenRenderbuffers(2, rb);
    glBindRenderbuffer(GL_RENDERBUFFER, rb[0]);
    glRenderbufferStorage(GL_RENDERBUFFER, GL_RGBA8, wsize, wsize);
    glBindRenderbuffer(GL_RENDERBUFFER, rb[1]);
    glRenderbufferStorage(GL_RENDERBUFFER, GL_R32I, wsize, wsize);
    glFramebufferRenderbuffer(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_RENDERBUFFER, rb[0]);
    glFramebufferRenderbuffer(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT1, GL_RENDERBUFFER, rb[1]);
}
void check_gl_error() {
    if (glGetError() != GL_NO_ERROR) {
        printf("GL error\n");
        abort();
    }
}
/* Read back the value written for each point and compare it with ref. */
void check_points() {
    int x, y;
    GLint d;
    printf("results:\n");
    for (y = 0; y < num_y; ++y) {
        for (x = 0; x < num_x; ++x) {
            int px = (1.0f + 0.2f * x) * wsize / 2 - 1;
            int py = (1.0f + 0.2f * y) * wsize / 2 - 1;
            /* Same row-major index as used when filling vb in init(). */
            int t = x + y * num_x;
            int ref = vb[t * num_attr + 2];
            glReadPixels(px, py, 1, 1, GL_RED_INTEGER, GL_INT, &d);
            printf(" thread %d (%d, %d): expected = %d, observed = %d, %s\n",
                   t, x, y, ref, d, ref == d ? "OK" : "FAIL");
        }
    }
}
void disp() {
    /* Render the points into both attachments of the FBO. */
    glBindFramebuffer(GL_FRAMEBUFFER, fbo);
    check_gl_error();
    GLenum dbs[2] = { GL_COLOR_ATTACHMENT0, GL_COLOR_ATTACHMENT1 };
    glDrawBuffers(2, dbs);
    check_gl_error();
    glClear(GL_COLOR_BUFFER_BIT);
    check_gl_error();
    glDrawArrays(GL_POINTS, 0, num_points);
    check_gl_error();

    /* Blit the color result to the window. */
    glReadBuffer(GL_COLOR_ATTACHMENT0);
    glBindFramebuffer(GL_DRAW_FRAMEBUFFER, 0);
    check_gl_error();
    glDrawBuffer(GL_FRONT);
    check_gl_error();
    glBlitFramebuffer(0, 0, wsize, wsize, 0, 0, wsize, wsize, GL_COLOR_BUFFER_BIT, GL_NEAREST);
    check_gl_error();
    glutSwapBuffers();
    check_gl_error();

    /* Read the raw integer values back and compare. */
    glReadBuffer(GL_COLOR_ATTACHMENT1);
    check_gl_error();
    check_points();
    check_gl_error();
}
int main(int argc, char **argv) {
    glutInit(&argc, argv);
    glutInitDisplayMode(GLUT_RGB | GLUT_ALPHA);
    glutInitWindowSize(wsize, wsize);
    glutCreateWindow("cftest");
    glutDisplayFunc(disp);
    GLenum err = glewInit();
    if (GLEW_OK != err) {
        fprintf(stderr, "GLEW init error: %s\n", glewGetErrorString(err));
        return 1; /* GL entry points are unusable without GLEW */
    }
    init();
    glutMainLoop();
    return 0;
}