[Beignet] [PATCH] Fix HSW thread_n <= 64 assert.

Yang Rong rong.r.yang at intel.com
Mon Oct 13 23:12:38 PDT 2014


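In cl_get_kernel_max_wg_sz, the thread count computed for HSW may be
larger than 64 when the kernel uses SLM, which trips the thread_n <= 64
assert. Clamp the thread count to 64 before computing the work group size.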

Signed-off-by: Yang Rong <rong.r.yang at intel.com>
---
 src/cl_device_id.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)
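
Note: the clamp on the SLM path can be sketched in isolation as below.
This is a minimal illustration, not the Beignet code itself: the struct
device_info, the function slm_max_wg_sz, and the macro MAX_THREADS_PER_WG
are hypothetical stand-ins for the cl_device_id fields and the literal 64
used in the patch.

```c
#include <stddef.h>

/* Hypothetical stand-in for the device fields the patch reads. */
struct device_info {
  size_t max_compute_unit;
  size_t max_thread_per_unit;
  size_t sub_slice_count;
};

/* Upper bound on threads per work group, taken from the patch. */
#define MAX_THREADS_PER_WG 64

/* Max work group size for a kernel that uses SLM: threads available in
 * one sub slice, clamped to MAX_THREADS_PER_WG, times the SIMD width. */
static size_t
slm_max_wg_sz(const struct device_info *dev, size_t simd_width)
{
  size_t thread_cnt = dev->max_compute_unit *
                      dev->max_thread_per_unit / dev->sub_slice_count;
  if (thread_cnt > MAX_THREADS_PER_WG)
    thread_cnt = MAX_THREADS_PER_WG;
  return thread_cnt * simd_width;
}
```

With illustrative values of 20 compute units, 7 threads per unit and 2 sub
slices, the unclamped count is 70, so the clamp applies and a SIMD16 kernel
gets 64 * 16 = 1024 work items. With 16 compute units the count is 56 and
the clamp has no effect.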

diff --git a/src/cl_device_id.c b/src/cl_device_id.c
index a0d0db6..7944ca4 100644
--- a/src/cl_device_id.c
+++ b/src/cl_device_id.c
@@ -633,7 +633,7 @@ cl_check_builtin_kernel_dimension(cl_kernel kernel, cl_device_id device)
 LOCAL size_t
 cl_get_kernel_max_wg_sz(cl_kernel kernel)
 {
-  size_t work_group_size;
+  size_t work_group_size, thread_cnt;
   int simd_width = interp_kernel_get_simd_width(kernel->opaque);
   int vendor_id = kernel->program->ctx->device->vendor_id;
   if (!interp_kernel_use_slm(kernel->opaque)) {
@@ -642,9 +642,13 @@ cl_get_kernel_max_wg_sz(cl_kernel kernel)
     else
       work_group_size = kernel->program->ctx->device->max_compute_unit *
                         kernel->program->ctx->device->max_thread_per_unit * simd_width;
-  } else
-    work_group_size = kernel->program->ctx->device->max_compute_unit * simd_width *
+  } else {
+    thread_cnt = kernel->program->ctx->device->max_compute_unit *
                  kernel->program->ctx->device->max_thread_per_unit / kernel->program->ctx->device->sub_slice_count;
+    if(thread_cnt > 64)
+      thread_cnt = 64;
+    work_group_size = thread_cnt * simd_width;
+  }
   return work_group_size;
 }
 
-- 
1.8.3.2


