[Beignet] clEnqueueNDRangeKernel and kernel completion
Edward Ching
edward.k.ching at gmail.com
Tue Jun 11 22:00:32 PDT 2013
Thank you so much for your responses. Here's a simplified test case I used
to verify that clFinish blocks correctly until a previously submitted kernel
has finished executing. I'm not sure whether attachments can be sent to this
mailing list, so I'm pasting the code here (2 files:
~/beignet/utests/compiler_cl_finish.cpp and
~/beignet/kernels/test_cl_finish.cl).
/Ed
1) compiler_cl_finish.cpp
#include "utest_helper.hpp"
#include <sys/time.h>

// Wall-clock helpers: T_GET samples the time, T_LAPSE returns t2 - t1 in seconds.
#define T_GET(t) gettimeofday(&(t), NULL)
#define T_LAPSE(t1, t2) \
  (((t2).tv_sec + (t2).tv_usec * 0.000001) - ((t1).tv_sec + (t1).tv_usec * 0.000001))

static void compiler_cl_finish(void)
{
  const size_t n = 16 * 1024 * 1024;
  struct timeval t1, t2;
  float t_fin, t_map_w_fin, t_map_wo_fin;

  // Setup kernel and buffers
  OCL_CREATE_KERNEL("test_cl_finish");
  OCL_CREATE_BUFFER(buf[0], 0, n * sizeof(int), NULL);
  OCL_CREATE_BUFFER(buf[1], 0, n * sizeof(int), NULL);

  // Run the kernel
  locals[0] = 64;
  globals[0] = 32 * locals[0];
  // The kernel takes int arguments, so pass ints rather than the size_t values.
  const int n_arg = (int) n;
  const int num_threads = (int) globals[0];
  OCL_SET_ARG(0, sizeof(cl_mem), &buf[0]);
  OCL_SET_ARG(1, sizeof(cl_mem), &buf[1]);
  OCL_SET_ARG(2, sizeof(int), &n_arg);
  OCL_SET_ARG(3, sizeof(int), &num_threads);

  // 1st time map after clFinish
  OCL_NDRANGE(1);
  T_GET(t1);
  OCL_FINISH();
  T_GET(t2);
  t_fin = T_LAPSE(t1, t2);
  T_GET(t1);
  OCL_MAP_BUFFER(0);
  T_GET(t2);
  t_map_w_fin = T_LAPSE(t1, t2);

  // 2nd time map without clFinish
  OCL_NDRANGE(1);
  T_GET(t1);
  OCL_MAP_BUFFER(0);
  T_GET(t2);
  t_map_wo_fin = T_LAPSE(t1, t2);

  // If clFinish really waits for the kernel, the map that follows it
  // should be the cheapest of the three measurements.
  OCL_ASSERT(t_fin > t_map_w_fin && t_map_wo_fin > t_map_w_fin);
  OCL_UNMAP_BUFFER(0);
}

MAKE_UTEST_FROM_FUNCTION(compiler_cl_finish);
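
As a side note on the timing macros: T_LAPSE adds the large tv_sec value to the microsecond fraction before subtracting, which can cost a little precision at sub-microsecond scale. A minimal sketch of an alternative that subtracts the fields first is below. The function name elapsed_seconds is hypothetical and not part of the utest helpers, and the sketch assumes both timevals are normalized (0 <= tv_usec < 1000000).

```cpp
#include <cassert>
#include <sys/time.h>

// Hypothetical helper (not part of utest_helper.hpp): elapsed seconds
// from t1 to t2. The seconds and microseconds are subtracted separately,
// so the large tv_sec values cancel before the conversion to double.
double elapsed_seconds(const struct timeval &t1, const struct timeval &t2)
{
  // Assumption: both samples are normalized, as gettimeofday returns them.
  assert(t1.tv_usec >= 0 && t1.tv_usec < 1000000);
  assert(t2.tv_usec >= 0 && t2.tv_usec < 1000000);

  const long sec = static_cast<long>(t2.tv_sec - t1.tv_sec);
  const long usec = static_cast<long>(t2.tv_usec - t1.tv_usec);
  return static_cast<double>(sec) + static_cast<double>(usec) * 1e-6;
}
```

In the test above this would be called as elapsed_seconds(t1, t2) wherever T_LAPSE(t1, t2) is used; the comparison in OCL_ASSERT stays the same.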
2) test_cl_finish.cl
__kernel void
test_cl_finish(__global int *src, __global int *dst, int n, int num_threads)
{
  int tid, pos;
  tid = get_global_id(0);
  for (pos = tid; pos < n; pos += num_threads) {
    dst[pos] = src[pos];
  }
}
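
For anyone who wants to check the kernel's access pattern without a GPU, here is a rough host-side model of the loop above. It is only a sketch: simulate_strided_copy is a hypothetical name, the work items run one after another instead of in parallel, and it assumes num_threads equals the global work size, as in the test.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Host-side model of test_cl_finish: simulated work item `tid` copies
// elements tid, tid + num_threads, tid + 2 * num_threads, ...
// Returns how many times each dst element was written.
std::vector<int> simulate_strided_copy(const std::vector<int> &src,
                                       std::vector<int> &dst,
                                       int num_threads)
{
  assert(num_threads > 0);
  assert(dst.size() == src.size());

  const int n = static_cast<int>(src.size());
  std::vector<int> writes(src.size(), 0);

  // The outer loop stands in for get_global_id(0) over the global range.
  for (int tid = 0; tid < num_threads; ++tid) {
    for (int pos = tid; pos < n; pos += num_threads) {
      dst[pos] = src[pos];
      ++writes[pos];
    }
  }
  return writes;
}
```

With num_threads equal to the global size, every element should be written exactly once. In the test that means each of the 2048 work items copies 16*1024*1024 / 2048 = 8192 elements, which is what keeps the kernel busy long enough to time.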
On Wed, Jun 5, 2013 at 10:45 PM, Zou, Nanhai <nanhai.zou at intel.com> wrote:
> >> [Gong, Zhigang] Your finding is correct: the current clFinish does
> nothing, which does not comply with the spec. It is on our TODO list.
> Actually, we have more related TODO items. Currently,
> clEnqueueNDRangeKernel flushes the batch buffer every time, and thus
> clFlush is also an empty function. We also need to optimize it to track the
> states and avoid some unnecessary pipe controls for each kernel. But our
> team's current focus is to implement the missing OpenCL features and try
> hard to pass the piglit tests. After that, we will turn to these items. And
> as usual, if anyone from the community wants to contribute to these items,
> we will be more than happy to review and accept the patches.
>
> Unlike the cl_event and asynchronous things in our plan, this clFinish one
> looks like a bug.
>
> The code is there, it just does not work as expected.
>
> Edward, would you mind providing a simple test case for this?
>
> Thanks
>
> Zou Nanhai