[Beignet] [PATCH] Change global barrier test temporary to check local barrier whether need fence.

Zhigang Gong zhigang.gong at linux.intel.com
Mon Jun 17 23:20:41 PDT 2013


On Tue, Jun 18, 2013 at 11:24:12AM +0800, Yang Rong wrote:
> 
> Signed-off-by: Yang Rong <rong.r.yang at intel.com>
> ---
>  kernels/compiler_global_memory_barrier.cl |    4 ++--
>  utests/compiler_global_memory_barrier.cpp |    9 +++++----
>  2 files changed, 7 insertions(+), 6 deletions(-)
> 
> diff --git a/kernels/compiler_global_memory_barrier.cl b/kernels/compiler_global_memory_barrier.cl
> index 99bb940..705517d 100644
> --- a/kernels/compiler_global_memory_barrier.cl
> +++ b/kernels/compiler_global_memory_barrier.cl
> @@ -1,7 +1,7 @@
> -__kernel void compiler_global_memory_barrier(__global int *dst, __global int *src) {
> +__kernel void compiler_global_memory_barrier(__global int *dst, __local int *src) {
>    src[get_local_size(0) * (0) + get_local_id(0)] = get_local_id(0);
>    src[get_local_size(0) * (1) + get_local_id(0)] = get_local_id(0);
> -  barrier(CLK_GLOBAL_MEM_FENCE);
> +  barrier(CLK_LOCAL_MEM_FENCE);
>    dst[get_local_size(0) * (2 * get_group_id(0)) + get_local_id(0)] = src[get_local_size(0) * 2 * 0 + get_local_size(0) - (get_local_id(0) + 1)];
>    dst[get_local_size(0) * (2 * get_group_id(0) + 1) + get_local_id(0)] = src[get_local_size(0) * (2 * 0 + 1) + get_local_size(0) - (get_local_id(0) + 1)];
>  }
> diff --git a/utests/compiler_global_memory_barrier.cpp b/utests/compiler_global_memory_barrier.cpp
> index a6496a7..311d093 100644
> --- a/utests/compiler_global_memory_barrier.cpp
> +++ b/utests/compiler_global_memory_barrier.cpp
> @@ -4,16 +4,17 @@ static void compiler_global_memory_barrier(void)
>  {
>    const size_t n = 16*1024;
>  
> +  globals[0] = n/2;
> +  locals[0] = 32;
> +
>    // Setup kernel and buffers
>    OCL_CREATE_KERNEL("compiler_global_memory_barrier");
>    OCL_CREATE_BUFFER(buf[0], 0, n * sizeof(uint32_t), NULL);
> -  OCL_CREATE_BUFFER(buf[1], 0, n * sizeof(uint32_t), NULL);
> +  //OCL_CREATE_BUFFER(buf[1], 0, n * sizeof(uint32_t), NULL);
>    OCL_SET_ARG(0, sizeof(cl_mem), &buf[0]);
> -  OCL_SET_ARG(1, sizeof(cl_mem), &buf[1]);
> +  OCL_SET_ARG(1, locals[0] * 2 * sizeof(uint32_t), NULL);
It's OK to allocate SLM for only one work group, but you also need to modify
the kernel so that it never indexes past the end of src in SLM. So the test
should use the following kernel rather than the one in your first email, right?

    src[get_local_id(0)] = get_local_id(0);
    src[get_local_size(0) + get_local_id(0)] = get_local_id(0);
    barrier(CLK_LOCAL_MEM_FENCE);
    dst[get_local_size(0) * (2 * get_group_id(0)) + get_local_id(0)] = src[get_local_size(0) - (get_local_id(0) + 1)];
    dst[get_local_size(0) * (2 * get_group_id(0) + 1) + get_local_id(0)] = src[get_local_size(0) + get_local_size(0) - (get_local_id(0) + 1)];


But even with this fixed version, the test still fails randomly on my machine.
I really don't know the reason. Could you give it a try on your machine?
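
To rule out an indexing mistake, the fixed kernel above can be modeled on the
host. This is only a rough sketch, not part of the patch: simulate_group is a
hypothetical helper name, and the barrier is modeled simply by finishing all
writes before any read. It assumes src holds 2 * local_size elements, as
allocated by OCL_SET_ARG(1, locals[0] * 2 * sizeof(uint32_t), NULL).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical host-side model of one work group running the fixed kernel.
// Returns false if any src or dst index would be out of bounds.
bool simulate_group(std::size_t local_size, std::size_t group_id,
                    std::vector<std::uint32_t> &dst) {
  assert(local_size > 0);
  const std::size_t slm_elems = 2 * local_size;  // SLM size assumed by the host
  std::vector<std::uint32_t> src(slm_elems, 0);

  // Phase 1: every work item writes its two SLM slots.
  for (std::size_t lid = 0; lid < local_size; ++lid) {
    const std::size_t w0 = lid;
    const std::size_t w1 = local_size + lid;
    if (w0 >= slm_elems || w1 >= slm_elems) return false;
    src[w0] = static_cast<std::uint32_t>(lid);
    src[w1] = static_cast<std::uint32_t>(lid);
  }

  // barrier(CLK_LOCAL_MEM_FENCE) is modeled by the loop boundary:
  // all writes above complete before any read below.

  // Phase 2: every work item reads the mirrored slots and writes dst.
  for (std::size_t lid = 0; lid < local_size; ++lid) {
    const std::size_t r0 = local_size - (lid + 1);
    const std::size_t r1 = local_size + local_size - (lid + 1);
    const std::size_t d0 = local_size * (2 * group_id) + lid;
    const std::size_t d1 = local_size * (2 * group_id + 1) + lid;
    if (r0 >= slm_elems || r1 >= slm_elems || d1 >= dst.size()) return false;
    dst[d0] = src[r0];
    dst[d1] = src[r1];
  }
  return true;
}
```

With n = 16*1024, globals[0] = n/2 and locals[0] = 32 there are 256 work
groups, and in this model every index stays in range. Both dst rows of a group
should then hold local_size - 1 - get_local_id(0), so a random mismatch on real
hardware would point away from the indexing itself.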

>  
>    // Run the kernel
> -  globals[0] = n/2;
> -  locals[0] = 32;
>    OCL_NDRANGE(1);
>    OCL_MAP_BUFFER(0);
>  
> -- 
> 1.7.10.4
> 
> _______________________________________________
> Beignet mailing list
> Beignet at lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/beignet

