<div dir="ltr"><div style><div>When compiling a simple single-function kernel (patch attached, and also pasted below), the compiler fails with an "Unsupported intrinsics" assertion:</div><div><div><br></div><div>ASSERTION FAILED: Unsupported intrinsics</div>
<div> at file /root/WORK/test_split/beignet/backend/src/llvm/llvm_gen_backend.cpp, function void gbe::GenWriter::regAllocateCallInst(llvm::CallInst&), line 1688</div><div>Stack dump:</div><div>0.<span class="" style="white-space:pre"> </span>Running pass 'Function Pass Manager' on module '/tmp/file0yciwa.ll'.</div>
<div>1.<span class="" style="white-space:pre"> </span>Running pass 'Gen Back-End' on function '@test_split'</div></div></div><div style><br></div><div style>The assertion appears to be triggered by the "shr_mem[pid+offset]=0;" line, where 0 is assigned. Is this a bug? As far as I can tell, the kernel code is not doing anything illegal. It compiles fine with the Intel OpenCL SDK 2013 (on Linux, for the CPU).<br>
</div><div style><br></div><div style>Thanks,</div><div style>/Ed</div><div style><br></div><div style>test_split.cl:</div><div style><br></div><div><div>__kernel void </div><div>test_split(</div>
<div><span class="" style="white-space:pre"> </span>__global int *in, </div><div><span class="" style="white-space:pre"> </span>int stride, </div><div><span class="" style="white-space:pre"> </span>int n_recs, </div><div>
<span class="" style="white-space:pre"> </span>int n_parts, </div><div><span class="" style="white-space:pre"> </span>__global int *out,</div><div><span class="" style="white-space:pre"> </span>__local int *shr_mem)</div>
<div>{</div><div><span class="" style="white-space:pre"> </span>int pid;</div><div><span class="" style="white-space:pre"> </span>int glb_tid = get_global_id(0);</div><div><span class="" style="white-space:pre"> </span>int loc_tid = get_local_id(0);</div>
<div><span class="" style="white-space:pre"> </span>int offset = loc_tid*n_parts;</div><div><br></div><div><span class="" style="white-space:pre"> </span>for(pid=0; pid&lt;n_parts; pid++)</div><div><span class="" style="white-space:pre"> </span>{</div>
<div><span class="" style="white-space:pre"> </span>shr_mem[pid+offset]=0;</div><div><span class="" style="white-space:pre"> </span>}</div><div><br></div><div><span class="" style="white-space:pre"> </span>for(int pos=glb_tid; pos&lt;n_recs; pos+=stride)</div>
<div><span class="" style="white-space:pre"> </span>{</div><div><span class="" style="white-space:pre"> </span>pid=in[pos];</div><div><span class="" style="white-space:pre"> </span>shr_mem[pid+offset]++;</div><div><span class="" style="white-space:pre"> </span>}</div>
<div><br></div><div><span class="" style="white-space:pre"> </span>for(pid=0; pid&lt;n_parts; pid++)</div><div><span class="" style="white-space:pre"> </span>{</div><div><span class="" style="white-space:pre"> </span>out[glb_tid + stride*pid] = shr_mem[pid+offset];</div>
<div><span class="" style="white-space:pre"> </span>}<span class="" style="white-space:pre"> </span></div><div>}</div></div><div><br></div><div style>compiler_test_split.cpp:</div><div style><br></div><div style><div>void compiler_test_split(void)</div>
<div>{</div><div> OCL_CREATE_KERNEL("test_split");<br></div><div>}</div></div><div><br></div><div style><br></div><div><br></div><div style><br></div></div>
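<div>For reference, here is a rough host-side sketch of what the kernel is meant to compute, in case it helps to extend the test beyond just building the kernel. The function name test_split_reference and the n_work_items parameter are hypothetical and not part of Beignet or its test utilities. It assumes a single work-group (so the global id equals the local id), stride &gt; 0, stride &gt;= n_work_items, and that every value in "in" lies in [0, n_parts).</div>

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CPU reference for the test_split kernel (not Beignet code).
// Assumptions: one work-group (global id == local id), stride > 0,
// stride >= n_work_items, and every in[pos] is in [0, n_parts).
std::vector<int> test_split_reference(const std::vector<int>& in,
                                      int stride, int n_parts,
                                      int n_work_items)
{
    const int n_recs = static_cast<int>(in.size());

    // Each work-item owns a private slice of n_parts counters,
    // mirroring shr_mem[pid + loc_tid * n_parts] in the kernel.
    // Value-initialising to 0 plays the role of the first kernel loop.
    std::vector<int> shr_mem(static_cast<std::size_t>(n_work_items) * n_parts, 0);
    std::vector<int> out(static_cast<std::size_t>(stride) * n_parts, 0);

    for (int tid = 0; tid < n_work_items; ++tid) {
        const int offset = tid * n_parts;

        // Second kernel loop: count the records this work-item visits.
        for (int pos = tid; pos < n_recs; pos += stride)
            ++shr_mem[in[pos] + offset];

        // Third kernel loop: write the per-work-item counts out.
        for (int pid = 0; pid < n_parts; ++pid)
            out[tid + stride * pid] = shr_mem[pid + offset];
    }
    return out;
}
```

<div>Since each work-item only touches its own slice of shr_mem, the sketch needs no equivalent of a barrier, which is also why the kernel itself looks legal to me.</div>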