[Nouveau] Translating tests/trivial/compute.c gallium tests to opencl (input / help wanted)

Hans de Goede hdegoede at redhat.com
Tue Dec 22 03:37:29 PST 2015


Hi All,

I've been working on translating the tests/trivial/compute.c tests
to opencl (for the buffer setup and kernel launch, I'm keeping the compute
kernels in tgsi as an intermediate step).

I've got the test_input_global() test working, see:

https://fedorapeople.org/~jwrdegoede/compute-opencl-tgsi.c

Next I wanted to convert the test_system_values() test, and there
I've gotten stuck. It uses a PIPE_BUFFER which it binds read-write as a
resource, which OpenCL does not allow. The closest thing in OpenCL is a
constant parameter. At first I tried binding the buffer as a 2d image,
but that leads to the gpu hanging; this is the code activated with
"#define USE_IMAGE_FOR_BUF 1" in the above file.

So in a second attempt I've hacked clover to bind cl_mem objects passed
in as constants read-write. This is the code used when USE_IMAGE_FOR_BUF
is not defined, and it works.

In a third attempt, since the hack is not workable in the long run,
I've tried to rewrite the test to use a global buffer (I'm not sure this
is the best approach, other ideas are welcome), leading to this diff:

--- a/src/gallium/tests/trivial/compute.c
+++ b/src/gallium/tests/trivial/compute.c
@@ -431,7 +431,6 @@ static void launch_grid(struct context *ctx, const uint *block_layout,
  static void test_system_values(struct context *ctx)
  {
          const char *src = "COMP\n"
-                "DCL RES[0], BUFFER, RAW, WR\n"
                  "DCL SV[0], BLOCK_ID[0]\n"
                  "DCL SV[1], BLOCK_SIZE[0]\n"
                  "DCL SV[2], GRID_SIZE[0]\n"
@@ -452,13 +451,15 @@ static void test_system_values(struct context *ctx)
                  "  UADD TEMP[0].xy, TEMP[0].xyxy, TEMP[0].zwzw\n"
                  "  UADD TEMP[0].x, TEMP[0].xxxx, TEMP[0].yyyy\n"
                  "  UMUL TEMP[0].x, TEMP[0], IMM[0]\n"
-                "  STORE RES[0].xyzw, TEMP[0], SV[0]\n"
+                "  LOAD TEMP[1].x, RINPUT, IMM[2]\n"
+                "  UADD TEMP[0].x, TEMP[0], TEMP[1]\n"
+                "  STORE RGLOBAL.xyzw, TEMP[0], SV[0]\n"
                  "  UADD TEMP[0].x, TEMP[0], IMM[1]\n"
-                "  STORE RES[0].xyzw, TEMP[0], SV[1]\n"
+                "  STORE RGLOBAL.xyzw, TEMP[0], SV[1]\n"
                  "  UADD TEMP[0].x, TEMP[0], IMM[1]\n"
-                "  STORE RES[0].xyzw, TEMP[0], SV[2]\n"
+                "  STORE RGLOBAL.xyzw, TEMP[0], SV[2]\n"
                  "  UADD TEMP[0].x, TEMP[0], IMM[1]\n"
-                "  STORE RES[0].xyzw, TEMP[0], SV[3]\n"
+                "  STORE RGLOBAL.xyzw, TEMP[0], SV[3]\n"
                  "  RET\n"
                  "ENDSUB\n";
          void init(void *p, int s, int x, int y) {
@@ -485,16 +486,18 @@ static void test_system_values(struct context *ctx)
                          break;
                  }
          }
+        uint32_t input;

          printf("- %s\n", __func__);

-        init_prog(ctx, 0, 0, 0, src, NULL);
+        init_prog(ctx, 0, 0, 4, src, NULL);
          init_tex(ctx, 0, PIPE_BUFFER, true, PIPE_FORMAT_R32_FLOAT,
                   76800, 0, init);
-        init_compute_resources(ctx, (int []) { 0, -1 });
-        launch_grid(ctx, (uint []){4, 3, 5}, (uint []){5, 4, 1}, 0, NULL);
+        init_globals(ctx, (int []){ 0, -1 },
+                     (uint32_t *[]){ &input });
+        launch_grid(ctx, (uint []){4, 3, 5}, (uint []){5, 4, 1}, 0, &input);
          check_tex(ctx, 0, expect, NULL);
-        destroy_compute_resources(ctx);
+        destroy_globals(ctx);
          destroy_tex(ctx);
          destroy_prog(ctx);
  }

This also behaves weirdly; the output is:
- test_system_values
(1, 0)[0]: got 0x1/0.000000, expected 0x0/0.000000
(2, 0)[0]: got 0x1/0.000000, expected 0x0/0.000000
(3, 0)[0]: got 0x1/0.000000, expected 0x0/0.000000
(5, 0)[0]: got 0x1/0.000000, expected 0x3/0.000000
(6, 0)[0]: got 0x1/0.000000, expected 0x5/0.000000
(9, 0)[0]: got 0x1/0.000000, expected 0x4/0.000000
(13, 0)[0]: got 0x1/0.000000, expected 0x0/0.000000
(14, 0)[0]: got 0x1/0.000000, expected 0x0/0.000000
(15, 0)[0]: got 0x1/0.000000, expected 0x0/0.000000
(17, 0)[0]: got 0x1/0.000000, expected 0x0/0.000000
(18, 0)[0]: got 0x1/0.000000, expected 0x0/0.000000
(19, 0)[0]: got 0x1/0.000000, expected 0x0/0.000000
(21, 0)[0]: got 0x1/0.000000, expected 0x3/0.000000
(22, 0)[0]: got 0x1/0.000000, expected 0x5/0.000000
(25, 0)[0]: got 0x1/0.000000, expected 0x4/0.000000
(29, 0)[0]: got 0x1/0.000000, expected 0x0/0.000000
(30, 0)[0]: got 0x1/0.000000, expected 0x0/0.000000
(31, 0)[0]: got 0x1/0.000000, expected 0x0/0.000000
(33, 0)[0]: got 0x1/0.000000, expected 0x0/0.000000
(34, 0)[0]: got 0x1/0.000000, expected 0x0/0.000000
(19200, 1): FAIL (10920)

So it seems that this tgsi op:

+ " STORE RGLOBAL.xyzw, TEMP[0], SV[1]\n"

stores the right value in RGLOBAL.x but not in
RGLOBAL.yzw for some reason?

I've tried changing this to:

+ " STORE RGLOBAL.xyzw, TEMP[0], SV[1].xxxx\n"

Which of course is wrong, but it does lead to different
errors, so it really is writing 4 floats (we already
knew that, otherwise we would expect 0xdeadbeaf values),
but for some reason the values it is storing in RGLOBAL.yzw
are different from the ones the old code stored in
RES[0].yzw?

Regards,

Hans
