[Beignet] clEnqueueNDRangeKernel and kernel completion

Edward Ching edward.k.ching at gmail.com
Tue Jun 11 22:00:32 PDT 2013


Thank you so much for your responses. Here's a simplified test case I used
to check whether clFinish blocks correctly until a previously submitted
kernel has finished executing. I'm not sure if attachments can be sent to
this mailing list, so I'm cutting and pasting the code here (2 files:
~/beignet/utests/compiler_cl_finish.cpp and
~/beignet/kernels/test_cl_finish.cl).

/Ed

1) compiler_cl_finish.cpp

#include "utest_helper.hpp"
#include <sys/time.h>

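// Read the current wall-clock time into t
#define T_GET(t)        gettimeofday(&(t), NULL)
// Elapsed seconds from t1 to t2; subtracting the seconds first keeps the
// microsecond part from being lost in a large epoch-sized double
#define T_LAPSE(t1, t2) \
  (((t2).tv_sec - (t1).tv_sec) + ((t2).tv_usec - (t1).tv_usec) * 0.000001)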

static void compiler_cl_finish(void)
{
  const size_t n = 16*1024*1024;
  struct timeval t1, t2;
  double t_fin, t_map_w_fin, t_map_wo_fin;

  // Setup kernel and buffers
  OCL_CREATE_KERNEL("test_cl_finish");
  OCL_CREATE_BUFFER(buf[0], 0, n * sizeof(int), NULL);
  OCL_CREATE_BUFFER(buf[1], 0, n * sizeof(int), NULL);

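  // Set the work sizes and the kernel arguments
  locals[0]  = 64;
  globals[0] = 32 * locals[0];
  // The kernel takes int arguments, but n and globals[0] are size_t,
  // so pass int copies rather than the first sizeof(int) bytes of a size_t
  const int n_arg = (int) n;
  const int num_threads = (int) globals[0];
  OCL_SET_ARG(0, sizeof(cl_mem), &buf[0]);
  OCL_SET_ARG(1, sizeof(cl_mem), &buf[1]);
  OCL_SET_ARG(2, sizeof(int), &n_arg);
  OCL_SET_ARG(3, sizeof(int), &num_threads);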

  // 1st time map after clFinish
  OCL_NDRANGE(1);
  T_GET(t1);
  OCL_FINISH();
  T_GET(t2);
  t_fin = T_LAPSE(t1, t2);

  T_GET(t1);
  OCL_MAP_BUFFER(0);
  T_GET(t2);
  t_map_w_fin = T_LAPSE(t1, t2);
  // Unmap here so that each map in this test is paired with an unmap
  OCL_UNMAP_BUFFER(0);

  // 2nd time map without clFinish
  OCL_NDRANGE(1);
  T_GET(t1);
  OCL_MAP_BUFFER(0);
  T_GET(t2);
  t_map_wo_fin = T_LAPSE(t1, t2);

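  // If clFinish really waits for the kernel, the wait shows up in t_fin and
  // the map that follows is quick; without clFinish the map itself has to wait.
  OCL_ASSERT(t_fin > t_map_w_fin && t_map_wo_fin > t_map_w_fin);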
  OCL_UNMAP_BUFFER(0);
}

MAKE_UTEST_FROM_FUNCTION(compiler_cl_finish);
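
A side note on the timing: gettimeofday reads the wall clock, which can jump if the system time is adjusted while the test runs. Below is a minimal sketch of an alternative, assuming a C++11 compiler is available for the utests. The name elapsed_seconds is hypothetical and is not part of utest_helper.hpp.

```cpp
#include <chrono>

// Hypothetical helper (an assumption, not part of utest_helper.hpp):
// runs `work` once and returns the elapsed time in seconds, measured
// with a monotonic clock that is unaffected by wall-clock adjustments.
template <typename F>
double elapsed_seconds(F &&work)
{
  const auto start = std::chrono::steady_clock::now();
  work();
  const auto stop = std::chrono::steady_clock::now();
  return std::chrono::duration<double>(stop - start).count();
}
```

With such a helper, a measurement in the test could be written along the lines of t_fin = elapsed_seconds([] { OCL_FINISH(); }); in place of the T_GET / T_LAPSE pair.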

2) test_cl_finish.cl


__kernel void
test_cl_finish(__global int *src, __global int *dst, int n, int num_threads)
{
  int tid, pos;
  tid = get_global_id(0);
  for (pos = tid; pos < n; pos += num_threads) {
    dst[pos] = src[pos];
  }
}
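
For reference, the strided loop above means that each of the 32 * 64 = 2048 work-items copies every 2048th element, so with n = 16*1024*1024 each work-item handles 8192 elements. The access pattern can be checked on the host without a GPU. The sketch below is only an illustration: emulate_work_item and emulate_copy are hypothetical names, and the tid parameter stands in for get_global_id(0).

```cpp
#include <vector>

// Host-side stand-in for the kernel body (illustration only, not Beignet
// code): `tid` plays the role of get_global_id(0).
static void emulate_work_item(const std::vector<int> &src, std::vector<int> &dst,
                              int n, int num_threads, int tid)
{
  for (int pos = tid; pos < n; pos += num_threads)
    dst[pos] = src[pos];
}

// Runs every work-item in turn and reports whether dst ends up equal to src,
// i.e. whether the strided loops together cover all n elements.
static bool emulate_copy(int n, int num_threads)
{
  std::vector<int> src(n), dst(n, -1);  // -1 marks "not written yet"
  for (int i = 0; i < n; ++i)
    src[i] = i;                         // non-negative, so never equal to -1
  for (int tid = 0; tid < num_threads; ++tid)
    emulate_work_item(src, dst, n, num_threads, tid);
  return src == dst;
}
```

The check also holds when n is smaller than num_threads or not a multiple of it, since work-items whose tid is past the end simply skip the loop.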


On Wed, Jun 5, 2013 at 10:45 PM, Zou, Nanhai <nanhai.zou at intel.com> wrote:

> >> [Gong, Zhigang] Your finding is correct: the current clFinish does
> nothing, which does not comply with the spec. It is on our TODO list.
> Actually, we have more related TODO items. Currently,
> clEnqueueNDRangeKernel flushes the batch buffer every time, and thus
> clFlush is also an empty function. We also need to optimize it to track the
> states and avoid some unnecessary pipe controls for each kernel. But our
> team's current focus is to implement the missing OpenCL features and to try
> hard to pass the piglit tests. After that, we will turn to these items. And
> as usual, if anyone from the community wants to contribute to these items,
> we will be more than happy to review and accept the patches.
>
> Unlike the cl_event and asynchronous items in our plan, this clFinish one
> looks like a bug.
>
> The code is there; it just does not work as expected.
>
> Edward, would you mind providing a simple test case for this?
>
> Thanks
>
> Zou Nanhai