[Beignet] Beignet not working on Dell Precision M3800
Zhigang Gong
zhigang.gong at gmail.com
Sun Jun 15 19:07:25 PDT 2014
On Sun, Jun 15, 2014 at 8:17 PM, Yichao Yu <yyc1992 at gmail.com> wrote:
> I have applied the kernel patch[1] as Zhigang Gong suggested off-list
> and it indeed fixes some failing tests (especially some
> out-of-resources errors). However, the test program I was using still
> hangs at exactly the same place (clWaitForEvents).
Did you use the latest git master of beignet? The out-of-resources
errors you mention make me doubt the version you were using, as that
looks like a known bug which we fixed several months ago. If you were
indeed using git master, this may be a new bug; please send some more
details here on how to reproduce it, if possible. We will be happy to
fix it.
Thanks.
>
> P.S. I have finally found a working OpenCL 1.2 implementation
> (intel-opencl-sdk for CPU) and it works fine with the test program so
> I guess it is actually a bug in beignet.
>
> [1] https://aur.archlinux.org/pkgbase/linux-beignet-fix/
>
>
> On Sun, Jun 15, 2014 at 6:44 AM, Yichao Yu <yyc1992 at gmail.com> wrote:
>> On Sun, Jun 15, 2014 at 5:42 AM, Zhigang Gong <zhigang.gong at gmail.com> wrote:
>>>
>>>
>>>> On Jun 15, 2014, at 6:02, Yichao Yu <yyc1992 at gmail.com> wrote:
>>>>
>>>>> On Sat, Jun 14, 2014 at 11:51 AM, Yichao Yu <yyc1992 at gmail.com> wrote:
>>>>> Sorry for the delay. I was busy graduating and didn't have much time
>>>>> in the past two weeks for testing.
>>>>>
>>>>> The gpu hang is still there and I haven't been able to make a c
>>>>
>>>> Sorry I was wrong, it seems that the hang only happens before I
>>>> upgrade beignet. There are still a lot of failing tests but the screen
>>>> does not freeze anymore.
>>> Which Linux kernel did you use in your previous test? Did you apply the kernel patch that Rong provided in his email? If you haven't applied the kernel patch, SLM and barriers will not work correctly, and all related tests are known to be broken.
>>
>> I C. I guess that is the reason then...
>>
>> THX
>>
>>>>
>>>>> version of the test program. However, I have found another problem
>>>>> with the newly merged opencl1.2 APIs when testing sth else.
>>>>>
>>>>> The c test program to trigger the issue is here[1]. When running on my
>>>>> Haswell CPU, beignet hangs in clWaitForEvents with the backtrace
>>>>>
>>>>> #0 0x00007ffff78cc9d0 in __nanosleep_nocancel () from /usr/lib/libc.so.6
>>>>> #1 0x00007ffff78f6c94 in usleep () from /usr/lib/libc.so.6
>>>>> #2 0x00007ffff73dfc8a in clWaitForEvents (num_events=1, event_list=0x7fffffffda58)
>>>>>    at /home/yuyichao/projects/mlinux/pkg/all/beignet-git/src/beignet/src/cl_api.c:1316
>>>>> #3 0x00007ffff7bc861e in clWaitForEvents (num_events=1, event_list=0x7fffffffda58)
>>>>>    at ocl_icd_loader.c:873
>>>>> #4 0x00000000004009aa in main () at beignet-bug2.c:34
>>>>>
>>>>> It seems that the problem only happens for the event returned by
>>>>> clEnqueueBarrierWithWaitList when the wait list is not empty. I hope I
>>>>> am not using the api in the wrong way but I don't have another working
>>>>> opencl 1.2 implementation (pocl crashes on clEnqueueBarrier*...) to
>>>>> test it.........
>>>>>
>>>>> [1] https://gist.github.com/yuyichao/8b661d51c81f1c85466e
>>>>>
>>>>>> On Wed, Jun 4, 2014 at 7:29 AM, Yichao Yu <yyc1992 at gmail.com> wrote:
>>>>>>> On Tue, Jun 3, 2014 at 11:15 PM, Yang, Rong R <rong.r.yang at intel.com> wrote:
>>>>>>> printf is not a built-in function in OpenCL 1.1, so beignet doesn't support it yet. However, support for it is being added to beignet, so you may be able to use it soon.
>>>>>>
>>>>>> However, even if the function is not defined, shouldn't the compiler
>>>>>> return a error (opencl error) rather than raising a exception and
>>>>>> abort?
>>>>>>
>>>>>>> Yes, the 3D pipe patch has not been pushed yet, but you can apply it manually and try it.
>>>>>>
>>>>>> I'm afraid I don't have time to test it soon...
>>>>>>
>>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: Yichao Yu [mailto:yyc1992 at gmail.com]
>>>>>>> Sent: Thursday, May 29, 2014 8:40 PM
>>>>>>> To: Yang, Rong R
>>>>>>> Cc: beignet at lists.freedesktop.org
>>>>>>> Subject: Re: [Beignet] Beignet not working on Dell Precision M3800
>>>>>>>
>>>>>>>> On Thu, May 29, 2014 at 4:46 AM, Yang, Rong R <rong.r.yang at intel.com> wrote:
>>>>>>>> I have checked this issue; it is a beignet compiler bug and should be fixed by the patch "GBE: Change 64bit integer storage in register".
>>>>>>>>
>>>>>>>> For the first problem, I have sent some patches; can you try them? The patch "HSW: Restore L3 control register to disable SLM mode." fixes a bug where Beignet affects the 3D pipe. It may be the same problem you met.
>>>>>>>
>>>>>>> I am testing using the current master
>>>>>>>
>>>>>>> c34eba71bd5a518906d6d5d3ba26e44327cab251
>>>>>>> GBE: fix one illegal instruction when replace a uniform dst.
>>>>>>>
>>>>>>> So the patch u mentioned for 3D pipe doesn't seem to be included yet.
>>>>>>>
>>>>>>> Here is what I saw:
>>>>>>> 1, `printf("%d\n", i);` works on pocl but still crashes the compiler on beignet with the same error.
>>>>>>> 2, the c example I gave works but the original python version does not... Will figure out the difference once I get more time.
>>>>>>> 3, the interference with opengl seems to be different. The same effect I mentioned last time shows up when sth is running on the GPU but recovers afterward. However, it now gives your email a funny texture by replacing some of the characters with another one...[1] (o in this case...) I also remember seeing this problem randomly sometime before but it was not as reproducible...
>>>>>>>
>>>>>>> I guess I will test again once those 3d pipe fixing patches are applied.
>>>>>>>
>>>>>>> [1] http://wstaw.org/m/2014/05/29/plasma-desktopzSP722.png
>>>>>>>
>>>>>>> Yichao Yu
>>>>>>>
>>>>>>>> -----Original Message-----
>>>>>>>> From: Yichao Yu [mailto:yyc1992 at gmail.com]
>>>>>>>> Sent: Wednesday, May 28, 2014 11:49 PM
>>>>>>>> To: Yang, Rong R
>>>>>>>> Cc: beignet at lists.freedesktop.org
>>>>>>>> Subject: Re: [Beignet] Beignet not working on Dell Precision M3800
>>>>>>>>
>>>>>>>>> On Wed, May 28, 2014 at 11:45 AM, Yichao Yu <yyc1992 at gmail.com> wrote:
>>>>>>>>> On Wed, May 28, 2014 at 10:39 AM, Yichao Yu <yyc1992 at gmail.com> wrote:
>>>>>>>>>>> The second problem is that there seems to be sth wrong if I run two
>>>>>>>>>>> tests in series. More specifically, `test_elwise_kernel`[3],
>>>>>>>>>>> `test_elwise_kernel_with_option`[4] and
>>>>>>>>>>> `test_ranged_elwise_kernel`[5] can all pass if I run them
>>>>>>>>>>> individually. However, if I run them together, only the first one
>>>>>>>>>>> can pass... I will try to reproduce this in C...
>>>>>>>>>>
>>>>>>>>>> Sorry this is NOT what happened... I was not using the right
>>>>>>>>>> parameter to select the tests and there isn't any (at least no
>>>>>>>>>> evidence for it) interference between kernels.
>>>>>>>>>> The problem is rather that test_elwise_kernel_with_option and
>>>>>>>>>> test_ranged_elwise_kernel are not working..
>>>>>>>>>> Also the failing one sometimes (~2 times in 8) hangs the wm for ~10s...
>>>>>>>>>> will try to make a c version....
>>>>>>>>>
>>>>>>>>> And it seems that none of them is actually working, just that when
>>>>>>>>> the difference is calculated using OpenCL, it always returns 0...
>>>>>>>>>
>>>>>>>>> so here[1] is the c version. The problem seems to be related to the
>>>>>>>>> use of get_local_size and/or get_group_id in the kernel. When I was
>>>>>>>>> using a simple kernel with `int i = get_global_id(0);`, everything
>>>>>>>>> works fine.
>>>>>>>>
>>>>>>>> I haven't applied the patch for using local memory in the kernel. Does that patch affect not only local memory but also local size somehow?
>>>>>>>>
>>>>>>>>>
>>>>>>>>> [1] https://gist.github.com/yuyichao/242fd2a812088930af91
>>>>>>>>>
>>>>>>>>> P.S. I was trying to use printf in the kernel and it seems to crash
>>>>>>>>> the compiler..... Not sure if I was using it correctly but I guess it
>>>>>>>>> shouldn't crash in any case...
>>>>>>>>>
>>>>>>>>> here is the error:
>>>>>>>>> ```
>>>>>>>>> ASSERTION FAILED: it != instrinsicMap.map.end() at file
>>>>>>>>> /home/yuyichao/projects/mlinux/pkg/all/beignet-git/src/beignet/backend/src/llvm/llvm_gen_backend.cpp,
>>>>>>>>> function void gbe::GenWriter::regAllocateCallInst(llvm::CallInst&), line 2115
>>>>>>>>> [1] 28951 trace trap (core dumped) ./beignet-bug
>>>>>>>>> ```
>>>>>>>>>
>>>>>>>>> with the following kernel (not sure if it is valid haven't use printf
>>>>>>>>> before....),
>>>>>>>>>
>>>>>>>>> ```
>>>>>>>>> __kernel void fill_one(__global float *out, long n) {
>>>>>>>>>     int i = get_global_id(0);
>>>>>>>>>     printf("%d\n", i);
>>>>>>>>>     if (i < n) {
>>>>>>>>>         out[i] = 1;
>>>>>>>>>     }
>>>>>>>>> }
>>>>>>>>> ```
>>>>>>>>> (this kernel (without printf) works btw....)
>>>>>>>>>
>>>>>>>>> Yichao Yu
>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> [1] http://wstaw.org/m/2014/05/28/plasma-desktopObn722.png
>>>>>>>>>>> [2] http://wstaw.org/m/2014/05/28/plasma-desktopWbB722.png
>>>>>>>>>>> [3] https://github.com/pyopencl/pyopencl/blob/master/test/test_algorithm.py#L45
>>>>>>>>>>> [4] https://github.com/pyopencl/pyopencl/blob/master/test/test_algorithm.py#L66
>>>>>>>>>>> [5] https://github.com/pyopencl/pyopencl/blob/master/test/test_algorithm.py#L97
>>>>>>>>>>>
>>>>>>>>>>> Yours,
>>>>>>>>>>> Yichao Yu
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thanks for point out it, I have sent a patch to correct it.
>>>>>>>>>>>
>>>>>>>>>>> Seems fixed. THX. =)
>>>> _______________________________________________
>>>> Beignet mailing list
>>>> Beignet at lists.freedesktop.org
>>>> http://lists.freedesktop.org/mailman/listinfo/beignet