[Beignet] [BUG] piglit test case fail (get-global-size)

Zhigang Gong zhigang.gong at linux.intel.com
Mon Jun 24 23:34:11 PDT 2013


On Mon, Jun 24, 2013 at 01:57:13PM +0800, Zhigang Gong wrote:
Hi Yi,

I just wrote a patch to work around this bug and optimize those
builtin functions. Please help to test it. Thanks.

>From 3b5f96a9f085ce5e865008485dd9c5e6b5245bbd Mon Sep 17 00:00:00 2001
From: Zhigang Gong <zhigang.gong at linux.intel.com>
Date: Tue, 25 Jun 2013 14:15:09 +0800
Subject: [PATCH] Refine the get_local_id/... builtins.

As we could prepare correct value on runtime library side and give
a correct value in the payload for dim 0, 1 and 2. So for these 3
dim argument, we don't need to check it whether in the valid range,
we just read the payload's value.

This way, we can avoid any unecessary branching for normal usage of
these builtin functions. And could avoid a known bool related bug.

Signed-off-by: Zhigang Gong <zhigang.gong at linux.intel.com>
---
 backend/src/ocl_stdlib.h | 8 +++-----
 src/cl_api.c             | 2 +-
 2 files changed, 4 insertions(+), 6 deletions(-)

diff --git a/backend/src/ocl_stdlib.h b/backend/src/ocl_stdlib.h
index 81a0193..f43e898 100644
--- a/backend/src/ocl_stdlib.h
+++ b/backend/src/ocl_stdlib.h
@@ -4315,11 +4315,9 @@ DECL_INTERNAL_WORK_ITEM_FN(get_num_groups)
 #define DECL_PUBLIC_WORK_ITEM_FN(NAME, OTHER_RET)    \
 INLINE unsigned NAME(unsigned int dim) {             \
   if (dim == 0) return __gen_ocl_##NAME##0();        \
-  else if (dim > 0 && dim < get_work_dim()) {        \
-    if (dim == 1) return __gen_ocl_##NAME##1();      \
-    else if (dim == 2) return __gen_ocl_##NAME##2(); \
-  }                                                  \
-  return OTHER_RET;                                  \
+  else if (dim == 1) return __gen_ocl_##NAME##1();   \
+  else if (dim == 2) return __gen_ocl_##NAME##2();   \
+  else return OTHER_RET;                             \
 }
 
 DECL_PUBLIC_WORK_ITEM_FN(get_group_id, 0)
diff --git a/src/cl_api.c b/src/cl_api.c
index 3c78243..ebca294 100644
--- a/src/cl_api.c
+++ b/src/cl_api.c
@@ -1570,7 +1570,7 @@ clEnqueueNDRangeKernel(cl_command_queue  command_queue,
 {
   size_t fixed_global_off[] = {0,0,0};
   size_t fixed_global_sz[] = {1,1,1};
-  size_t fixed_local_sz[] = {16,1,1};
+  size_t fixed_local_sz[] = {1,1,1};
   cl_int err = CL_SUCCESS;
   cl_uint i;
 
-- 
1.7.11.7




> 
> 
> > -----Original Message-----
> > From: beignet-bounces+zhigang.gong=linux.intel.com at lists.freedesktop.org
> >
> [mailto:beignet-bounces+zhigang.gong=linux.intel.com at lists.freedesktop.org]
> > On Behalf Of Boqun Feng
> > Sent: Monday, June 24, 2013 1:30 PM
> > To: Zhigang Gong
> > Cc: 'Sun, Yi'; beignet at lists.freedesktop.org
> > Subject: Re: [Beignet] [BUG] piglit test case fail (get-global-size)
> > 
> > On Mon, Jun 24, 2013 at 12:58:59PM +0800, Zhigang Gong wrote:
> > > The root cause is not in this commit. It seems that we have a bug when
> > > we handle the branching introduced by
> > 
> > the root cause is we can not save a bool variable in a general purpose
> register,
> > so when emit a PHI instruction for i1, the compiler will say that "Boolean
> can
> > not escape from its basic block"
> > 
> [Gong, Zhigang] Right, we may want to fix the bool type related issue
> latter.
> > To avoid PHI for bool, we can use a int to hold the result like this:
> > 
> > 
> > #define DECL_PUBLIC_WORK_ITEM_FN(NAME, OTHER_RET) \
> > INLINE unsigned NAME(unsigned int dim) {          \
> >   unsigned result = OTHER_RET;                    \
> >   if (dim == 0) result = __gen_ocl_##NAME##0();   \
> >   if (dim == 1) result = __gen_ocl_##NAME##1();   \
> >   if (dim == 2) result = __gen_ocl_##NAME##2();   \
> >   if (dim >= get_work_dim()) result = OTHER_RET;  \
> [Gong, Zhigang] should be if (dim >= get_work_dim() || dim < 0)
> >   return result;                                  \
> > }
> [Gong, Zhigang] Please be hold for this specific issue. I will change the
> way to set correct value
> For 0,1,and 2 on run time side, and then we can change the above code to 
> if (dim == 0) ...
> else if (dim == 1) ...
> else if (dim == 2)...
> else result = OTHER_RET;
> 
> This way, for all normal usages, there will not be any real branching in the
> final generated binary code.
> It will only generate branching for those using out-of-renage dim code.
> 
> > 
> > this will emit PHI for i32, and this is fine with gen backend
> > 
> > > the commit as below:
> > >
> > > #define DECL_PUBLIC_WORK_ITEM_FN(NAME, OTHER_RET)    \
> > > INLINE unsigned NAME(unsigned int dim) {             \
> > >   if (dim == 0) return __gen_ocl_##NAME##0();        \
> > >   else if (dim > 0 && dim < get_work_dim()) {        \
> > >     if (dim == 1) return __gen_ocl_##NAME##1();      \
> > >     else if (dim == 2) return __gen_ocl_##NAME##2(); \
> > >   }                                                  \
> > >   return OTHER_RET;                                  \
> > > }
> > >
> > > I will refine the above code to avoid any branching for normal usage
> > > of those built-in functions.
> > >
> > > > -----Original Message-----
> > > > From:
> > > > beignet-bounces+zhigang.gong=linux.intel.com at lists.freedesktop.org
> > > >
> > > [mailto:beignet-bounces+zhigang.gong=linux.intel.com at lists.freedesktop
> > > .org]
> > > > On Behalf Of Boqun Feng
> > > > Sent: Monday, June 24, 2013 12:15 PM
> > > > To: Sun, Yi
> > > > Cc: beignet at lists.freedesktop.org
> > > > Subject: Re: [Beignet] [BUG] piglit test case fail (get-global-size)
> > > >
> > > > I found this bug is introduced by
> > > > commit 37687ac7c7b3b5bf5fb7ceeb9229c503afe74c87
> > > >
> > > > the llvm IR for function get_global_size as follow:
> > > >
> > > >
> > > > get_global_size.exit2:                            ; preds
> > > > = %7, %get_global_size.exit, %0
> > > >   %9 = phi i1 [ true, %7 ], [ false, %get_global_size.exit ], [
> > > > false, %0
> > > ]
> > > >
> > > >
> > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > > > ~~~~~~~~~
> > > >
> > > > this line cause the problem
> > > >
> > > >   %10 = phi i32 [ %5, %7 ], [ %5, %get_global_size.exit ], [ %1, %0 ]
> > > >   %.0.i1 = phi i32 [ %8, %7 ], [ 1, %get_global_size.exit ], [ 1, %0 ]
> > > >   %11 = mul i32 %.0.i1, %10
> > > >   %12 = tail call ptx_device i32 @__gen_ocl_get_local_id0() nounwind
> > > > readnone
> > > >   %13 = tail call ptx_device i32 @__gen_ocl_get_local_size0()
> > > > nounwind readnone
> > > >   %14 = tail call ptx_device i32 @__gen_ocl_get_group_id0() nounwind
> > > > readnone
> > > >   %15 = mul i32 %14, %13
> > > >   %16 = add i32 %15, %12
> > > >   br i1 %3, label %17, label %get_global_size.exit4
> > > >
> > > > ; <label>:17                                      ; preds
> > > > = %get_global_size.exit2
> > > >   %18 = tail call ptx_device i32 @__gen_ocl_get_global_size1()
> > > > nounwind readnone
> > > >   br label %get_global_size.exit4
> > > >
> > > > get_global_size.exit4:                            ; preds
> > > > = %17, %get_global_size.exit2
> > > >   %.0.i3 = phi i32 [ %18, %17 ], [ 1, %get_global_size.exit2 ]
> > > >   br i1 %9, label %19, label %get_global_size.exit6
> > > >
> > > > ; <label>:19                                      ; preds
> > > > = %get_global_size.exit4
> > > >   %20 = tail call ptx_device i32 @__gen_ocl_get_global_size2()
> > > > nounwind readnone
> > > >   br label %get_global_size.exit6
> > > >
> > > > get_global_size.exit6:                            ; preds
> > > > = %19, %get_global_size.exit4
> > > >   %.0.i5 = phi i32 [ %20, %19 ], [ 1, %get_global_size.exit4 ]
> > > >   %21 = mul i32 %.0.i3, %16
> > > >   %22 = mul i32 %21, %.0.i5
> > > >   br i1 %3, label %.thread17.i, label %get_global_id.exit
> > > >
> > > >
> > > >
> > > > On Mon, Jun 24, 2013 at 11:34:22AM +0800, Boqun Feng wrote:
> > > > > Yes, I find the same problem, and I am trying to toggle on
> > > > > OUTPUT_LLVM flag to see the compiler IR
> > > > >
> > > > > On Mon, Jun 24, 2013 at 11:27:31AM +0800, Sun, Yi wrote:
> > > > > > But it's a bit strange that I didn't found any bool variables in
> > > > > > the test case after a quick looking through. The kernel is
> attached.
> > > > > >
> > > > > > Thanks
> > > > > >   --Sun, Yi
> > > > > >
> > > > > > On Mon, 2013-06-24 at 11:20 +0800, Boqun Feng wrote:
> > > > > > > This one is caused by bool variable's limited support of our
> > > > > > > compiler
> > > > > > >
> > > > > > > On Mon, Jun 24, 2013 at 03:15:13AM +0000, Sun, Yi wrote:
> > > > > > > > I'm not sure it is caused by the built-in function
> > > > > > > > get-global-size specified.
> > > > > > > > Anyone has any idea?
> > > > > > > >
> > > > > > > > reproduce step:
> > > > > > > > [piglit]# bin/cl-program-tester
> > > > > > > > tests/cl/program/execute/get-global-size.cl
> > > > > > > >
> > > > > > > >
> > > > > > > > Output:
> > > > > > > > ## Test: get_global_id
> > > > > > > > (/GFX/Test/Piglit/piglit/tests/cl/program/program-tester.c)
> > > > > > > > ##
> > > > > > > >
> > > > > > > > # Platform supporting only version 1.0. Running test on that
> > > version.
> > > > > > > > # Running on:
> > > > > > > > #   Platform: Experiment Intel Gen OCL Driver
> > > > > > > > #   Device: Intel HD Graphics Family
> > > > > > > > #   OpenCL version: 1.1
> > > > > > > > #   OpenCL C version: 1.0
> > > > > > > > #   Build options:
> > > > > > > > ASSERTION FAILED: TODO Boolean values cannot escape their
> > > > > > > > definition basic block
> > > > > > > >   at
> > > > > > > > file
> > > > > > > > /home/works/opencl/beignet/backend/src/llvm/llvm_gen_backend
> > > > > > > > .cpp , function void
> > > > > > > > gbe::GenWriter::emitMovForPHI(llvm::BasicBlock*,
> > > > > > > > llvm::BasicBlock*), line 844 Stack dump:
> > > > > > > > 0.      Running pass 'Function Pass Manager' on module
> > > > > > > > '/tmp/file292g7Y.ll'.
> > > > > > > > 1.      Running pass 'Gen Back-End' on function '@fill3d'
> > > > > > > >
> > > > > > > >
> > > > > > > > Thanks
> > > > > > > >   --Sun, Yi
> > > > > > > > _______________________________________________
> > > > > > > > Beignet mailing list
> > > > > > > > Beignet at lists.freedesktop.org
> > > > > > > > http://lists.freedesktop.org/mailman/listinfo/beignet
> > > > > >
> > > > >
> > > > > > /*!
> > > > > > [config]
> > > > > > name: get_global_id
> > > > > > clc_version_min: 10
> > > > > >
> > > > > > [test]
> > > > > > name: 1D, global_size 4 0 0
> > > > > > kernel_name: fill1d
> > > > > > dimensions: 1
> > > > > > global_size: 4 0 0
> > > > > > local_size: 1 0 0
> > > > > > arg_out: 0 buffer int[4] 4 4 4 4
> > > > > >
> > > > > > [test]
> > > > > > name: 1D, global_size 4 0 0, local_size 2 0 0
> > > > > > kernel_name: fill1d
> > > > > > dimensions: 1
> > > > > > global_size: 4 0 0
> > > > > > local_size: 2 0 0
> > > > > > arg_out: 0 buffer int[4] 4 4 4 4
> > > > > >
> > > > > > [test]
> > > > > > name: 1D, global_size 4 0 0, local_size 4 0 0
> > > > > > kernel_name: fill1d
> > > > > > dimensions: 1
> > > > > > global_size: 4 0 0
> > > > > > local_size: 4 0 0
> > > > > > arg_out: 0 buffer int[4] 4 4 4 4
> > > > > >
> > > > > > [test]
> > > > > > name: 2D, global_size 4 4 0, local_size 1 1 0
> > > > > > kernel_name: fill2d
> > > > > > dimensions: 2
> > > > > > global_size: 4 4 0
> > > > > > local_size: 1 1 0
> > > > > > arg_out: 0 buffer int[16] 16 16 16 16 16 16 16 16 16 16 16 16 16
> > > > > > 16
> > > > > > 16 16
> > > > > >
> > > > > > [test]
> > > > > > name: 2D, global_size 4 4 0, local_size 2 2 0
> > > > > > kernel_name: fill2d
> > > > > > dimensions: 2
> > > > > > global_size: 4 4 0
> > > > > > local_size: 2 2 0
> > > > > > arg_out: 0 buffer int[16] 16 16 16 16 16 16 16 16 16 16 16 16 16
> > > > > > 16
> > > > > > 16 16
> > > > > >
> > > > > > [test]
> > > > > > name: 2D, global_size 4 4 0, local_size 4 4 0
> > > > > > kernel_name: fill2d
> > > > > > dimensions: 2
> > > > > > global_size: 4 4 0
> > > > > > local_size: 4 4 0
> > > > > > arg_out: 0 buffer int[16] 16 16 16 16 16 16 16 16 16 16 16 16 16
> > > > > > 16
> > > > > > 16 16
> > > > > >
> > > > > > [test]
> > > > > > name: 3D, global_size 4 4 4, local_size 1 1 1
> > > > > > kernel_name: fill3d
> > > > > > dimensions: 3
> > > > > > global_size: 4 4 4
> > > > > > local_size: 1 1 1
> > > > > > arg_out: 0 buffer int[64] 64 64 64 64 64 64 64 64 64 64 64 64 64
> > > > > > 64 64
> > > 64 \
> > > > > >                           64 64 64 64 64 64 64 64 64 64 64 64 64
> > > > > > 64
> > > > 64 64 \
> > > > > >                           64 64 64 64 64 64 64 64 64 64 64 64 64
> > > > > > 64
> > > > 64 64 \
> > > > > >                           64 64 64 64 64 64 64 64 64 64 64 64 64
> > > > > > 64
> > > > > > 64 64
> > > > > >
> > > > > > [test]
> > > > > > name: 3D, global_size 4 4 4, local_size 2 2 2
> > > > > > kernel_name: fill3d
> > > > > > dimensions: 3
> > > > > > global_size: 4 4 4
> > > > > > local_size: 2 2 2
> > > > > > arg_out: 0 buffer int[64] 64 64 64 64 64 64 64 64 64 64 64 64 64
> > > > > > 64 64
> > > 64 \
> > > > > >                           64 64 64 64 64 64 64 64 64 64 64 64 64
> > > > > > 64
> > > > 64 64 \
> > > > > >                           64 64 64 64 64 64 64 64 64 64 64 64 64
> > > > > > 64
> > > > 64 64 \
> > > > > >                           64 64 64 64 64 64 64 64 64 64 64 64 64
> > > > > > 64
> > > > > > 64 64
> > > > > >
> > > > > > [test]
> > > > > > name: 3D, global_size 4 4 4, local_size 4 4 4
> > > > > > kernel_name: fill3d
> > > > > > dimensions: 3
> > > > > > global_size: 4 4 4
> > > > > > local_size: 4 4 4
> > > > > > arg_out: 0 buffer int[64] 64 64 64 64 64 64 64 64 64 64 64 64 64
> > > > > > 64 64
> > > 64 \
> > > > > >                           64 64 64 64 64 64 64 64 64 64 64 64 64
> > > > > > 64
> > > > 64 64 \
> > > > > >                           64 64 64 64 64 64 64 64 64 64 64 64 64
> > > > > > 64
> > > > 64 64 \
> > > > > >                           64 64 64 64 64 64 64 64 64 64 64 64 64
> > > > > > 64
> > > > > > 64 64 !*/
> > > > > >
> > > > > > kernel void fill1d(global int* out) {
> > > > > >     unsigned int size = get_global_size(0);
> > > > > > 	unsigned int id = get_global_id(0);
> > > > > >     out[id] = size;
> > > > > > }
> > > > > >
> > > > > > kernel void fill2d(global int* out) {
> > > > > >     unsigned int size = get_global_size(0) * get_global_size(1);
> > > > > >     unsigned int id =
> > > > get_global_id(0)*get_global_size(1)+get_global_id(1);
> > > > > > 	out[id] = size;
> > > > > > }
> > > > > >
> > > > > > kernel void fill3d(global int* out) {
> > > > > >     unsigned int size = get_global_size(0) * get_global_size(1)
> > > > > > *
> > > > get_global_size(2);
> > > > > > 	unsigned int id = (get_global_id(0) * (get_global_size(1) *
> > > > get_global_size(2)))
> > > > > >         + (get_global_id(1) * get_global_size(2)) +
> get_global_id(2);
> > > > > >     out[id] = size;
> > > > > > }
> > > > >
> > > > _______________________________________________
> > > > Beignet mailing list
> > > > Beignet at lists.freedesktop.org
> > > > http://lists.freedesktop.org/mailman/listinfo/beignet
> > >
> > _______________________________________________
> > Beignet mailing list
> > Beignet at lists.freedesktop.org
> > http://lists.freedesktop.org/mailman/listinfo/beignet
> 
> _______________________________________________
> Beignet mailing list
> Beignet at lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/beignet


More information about the Beignet mailing list