[Beignet] Still cannot use beignet with real OpenCL applications (Haswell CPU)

Zhigang Gong zhigang.gong at gmail.com
Fri Nov 28 21:54:23 PST 2014


Forgot to cc the list.

On Sat, Nov 29, 2014 at 12:10 PM, Zhigang Gong <zhigang.gong at gmail.com> wrote:
> Thanks for reporting these issues. For APP1, I spent some time investigating and found that the
> root cause is a bug in the two newly added LLVM builtin intrinsics, the bswap and/or
> overflow functions. After disabling those two functions, it passes all 32 test cases.
> I will continue to investigate and fix them next week.
>
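For comparing against the GPU output, here is a minimal plain-C sketch of what the two kinds of builtins are expected to compute. The function names are hypothetical, and the choice of a 32-bit unsigned add is an assumption for illustration; LLVM's llvm.bswap.* and llvm.*.with.overflow.* families also cover other widths and operations.

```c
#include <assert.h>
#include <stdint.h>

/* Reference byte swap: reverses the four bytes of a 32-bit word. */
static uint32_t ref_bswap32(uint32_t x)
{
    return (x >> 24) | ((x >> 8) & 0x0000FF00u) |
           ((x << 8) & 0x00FF0000u) | (x << 24);
}

/* Reference unsigned add with overflow: stores the wrapped sum in *sum
   and returns 1 if a carry out of bit 31 occurred, 0 otherwise. */
static int ref_uadd_overflow(uint32_t a, uint32_t b, uint32_t *sum)
{
    *sum = a + b;      /* unsigned arithmetic wraps modulo 2^32 */
    return *sum < a;   /* the sum is smaller than an operand only on wrap */
}

/* Small host-side self-check of the two helpers above. */
static void ref_selfcheck(void)
{
    uint32_t s;
    assert(ref_bswap32(0x11223344u) == 0x44332211u);
    assert(ref_uadd_overflow(0xFFFFFFFFu, 1u, &s) == 1 && s == 0u);
    assert(ref_uadd_overflow(1u, 2u, &s) == 0 && s == 3u);
}
```

A kernel that writes its bswap and overflow results to a buffer can be checked on the host against these helpers for the same inputs.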
> As for the other two applications, could you share the binaries with us? They may share the same
> root cause, but we need to verify that.
>
> Thanks,
> Zhigang Gong.
>
> On Sat, Nov 29, 2014 at 2:07 AM, Roman Trunov <stream at proxyma.ru> wrote:
>> I've successfully compiled beignet 1.0 on Ubuntu 14.04.1, but I still cannot use it for the real science and math applications I'm interested in. I see some improvements compared to 0.9, but my applications still do not function correctly.
>>
>> LLVM: 3.5 (prebuilt package from llvm.org)
>> gcc version 4.8.2 (Ubuntu 4.8.2-19ubuntu1)
>> Kernel: 3.13.0-40-generic
>> CPU: Intel(R) Core(TM) i7-4770K CPU @ 3.50GHz
>>
>> from clinfo:
>>
>>   Platform ID:                                   0x7ff0461d1d20
>>   Name:                                          Intel(R) HD Graphics Haswell GT2 Desktop
>>   Vendor:                                        Intel
>>   Device OpenCL C version:                       OpenCL C 1.2 beignet 1.0 (git-9068a26)
>>   Driver version:                                1.0
>>   Profile:                                       FULL_PROFILE
>>   Version:                                       OpenCL 1.2 beignet 1.0 (git-9068a26)
>>
>> I didn't enable X and Mesa support during compilation, so I'm using the "drm.rnodes=1" kernel option.
>>
>> Since this is a Haswell CPU, I also rebuilt the i915 kernel module according to the instructions and patch on the site. After this, I got 100% success on the self-tests:
>>
>> summary:
>> ----------
>>   total: 700
>>   run: 700
>>   pass: 700
>>   fail: 0
>>   pass rate: 1.000000
>>
>> So I consider the build to be correct.
>>
>> I've tried three math/science applications and none of them works correctly. They run, but produce incorrect results.
>>
>> 1) APP1: distributed.net client for OpenCL
>>
>> This is the program I know best. There is an "official" prebuilt binary at http://www.distributed.net/Download_prerelease and the public part of the source code is available at https://github.com/dcti/dnetc-client-base . I could also recompile it from source myself if additional testing is required.
>>
>> This program has a few computation cores. The "reference" core number 0, which is just a few lines long (rc5-ref.cl), crashes:
>>
>> ./dnetc -test rc5-72 0
>>
>> [Nov 26 18:26:19 UTC] RC5-72: using core #0 (CL ANSI 1-pipe).
>> [Nov 26 18:26:20 UTC] Abnormal core termination! Device: 0
>>
>> The other, "optimized" cores run, but their computations are incorrect and many of the self-tests fail (the numbers are returned result vs. expected result):
>>
>> ./dnetc -test rc5-72 1
>>
>> [Nov 26 18:27:16 UTC] RC5-72: using core #1 (CL 1-pipe).
>> [Nov 26 18:27:16 UTC] RC5-72: Test 01 FAILED2: 00:00000000:00000000-C9:0C0353C0:D4E1FE85
>> [Nov 26 18:27:16 UTC] RC5-72: Test 03 FAILED2: 00:00000000:00000000-0F:556979E7:6C009260
>> [Nov 26 18:27:16 UTC] RC5-72: Test 04 FAILED2: 00:00000000:00000000-9E:D8B648C6:00003A3C
>> [Nov 26 18:27:16 UTC] RC5-72: Test 10 FAILED2: 00:00000000:00000000-2B:E01C5B9D:D65CCAD7
>> [Nov 26 18:27:16 UTC] RC5-72: Test 14 FAILED2: 00:00000000:00000000-C6:46E7E19D:9CD65C85
>> [Nov 26 18:27:16 UTC] RC5-72: Test 16 FAILED2: 00:00000000:00000000-85:EA3678CF:91DB0D2C
>> [Nov 26 18:27:16 UTC] RC5-72: Test 19 FAILED2: 00:00000000:00000000-11:4134BDB0:175A077F
>> [Nov 26 18:27:16 UTC] RC5-72: Test 20 FAILED2: 00:00000000:00000000-94:888FF8CB:282E6E5F
>> [Nov 26 18:27:16 UTC] RC5-72: Test 21 FAILED2: 00:00000000:00000000-D9:48A2E6E4:CD610000
>> [Nov 26 18:27:16 UTC] RC5-72: Test 22 FAILED2: 00:00000000:00000000-E5:71448E83:D0860001
>> [Nov 26 18:27:16 UTC] RC5-72: Test 23 FAILED2: 00:00000000:00000000-3E:ED6D9F85:A6D70002
>> [Nov 26 18:27:16 UTC] RC5-72: Test 26 FAILED1: 56:30E19DF4:8C460000-56:30E19DF4:8C460101
>> [Nov 26 18:27:16 UTC] RC5-72: Test 27 FAILED1: 85:3B37FFD3:9F140000-85:3B37FFD3:9F14B33B
>> [Nov 26 18:27:16 UTC] RC5-72: Test 28 FAILED1: 80:B75263C5:41660000-80:B75263C5:41668D03
>> [Nov 26 18:27:16 UTC] RC5-72: Test 30 FAILED1: 87:23A58F8F:D5940000-87:23A58F8F:D59495C1
>> [Nov 26 18:27:16 UTC] RC5-72: 17/32 Tests Passed (0.320960 seconds)
>> [Nov 26 18:27:16 UTC] RC5-72: WARNING WARNING WARNING: 15 Tests FAILED!!!
>>
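To see which part of a result is wrong, the two values on a FAILED line can be split into fields and XORed. Below is a minimal sketch in C, assuming the layout seen in the log above: an 8-bit "hi" field and 32-bit "mid" and "lo" fields, with the returned value before the dash and the expected value after it. The struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* A 72-bit value as printed in the log: hi (8 bits), mid and lo (32 bits each). */
struct key72 { uint32_t hi, mid, lo; };

/* Parse "HH:MMMMMMMM:LLLLLLLL-HH:MMMMMMMM:LLLLLLLL" into got and want.
   Returns 1 on success, 0 if the text does not match that layout. */
static int parse_pair(const char *s, struct key72 *got, struct key72 *want)
{
    unsigned g[3], w[3];
    if (sscanf(s, "%2x:%8x:%8x-%2x:%8x:%8x",
               &g[0], &g[1], &g[2], &w[0], &w[1], &w[2]) != 6)
        return 0;
    got->hi = g[0];  got->mid = g[1];  got->lo = g[2];
    want->hi = w[0]; want->mid = w[1]; want->lo = w[2];
    return 1;
}

/* XOR the fields; every set bit marks a position where the results disagree. */
static struct key72 diff_bits(struct key72 a, struct key72 b)
{
    struct key72 d = { a.hi ^ b.hi, a.mid ^ b.mid, a.lo ^ b.lo };
    return d;
}
```

For the Test 26 line, only the low 16 bits of "lo" differ (XOR 0x00000101), while for the FAILED2 lines the returned value is all zeros.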
>> ./dnetc -stress rc5-72 1
>>
>> [Nov 26 18:26:25 UTC] RC5-72: Stress-test 3: Pipe #1 missed a full match
>> [Nov 26 18:26:25 UTC] RC5-72: Stress-test 3: Pipe #1 fails to set 'check.count'
>> [Nov 26 18:26:25 UTC] Got 0x00000000, expected 0x00000001
>> [Nov 26 18:26:25 UTC] RC5-72: Stress-test 3: Pipe #1 fails to set 'check.hi/mid/lo'
>> [Nov 26 18:26:25 UTC] check:  00:00000000:00000000, expected CA:DB0EF3FF:FFFFFFC0
>> [Nov 26 18:26:25 UTC] RC5-72: Stress-test 3: Pipe #1 - Iterations count not updated
>> [Nov 26 18:26:25 UTC] Got 0x000000C0, expected 0x00000000
>> [Nov 26 18:26:25 UTC] RC5-72: Stress-test 3 FAILED
>>
>>
>>
>> The following APP2 and APP3 are from PrimeGrid (prime number search) projects. I don't know whether their source is available; at the least I could send you a binary, and the CL core source can easily be extracted from the executable with a hex editor.
>>
>> 2) APP2: tpsieve, a sieving program
>>
>> ./tpsieve-cl-boinc-x86_64-linux -p13120716e9 -P13120725e9 -k5 -K9999 -n6000000 -N9000000
>>
>> writes stderr.txt with the following:
>>
>> Sieve started: 13120716000000000 <= p < 13120725000000000
>> Thread 0 starting
>> Detected 320 multiprocessors (1600 SPUs) on device 0.
>> Device 0 is a 'Intel' 'Intel(R) HD Graphics Haswell GT2 Desktop'.
>> Computation Error: Checksum mismatch for p=13120716000000031 between 10781334 and 14482143 at n=9000063.
>> Computation Error: Checksum mismatch for p=13120716000000139 between 864764 and 6489009 at n=9000063.
>> (and so on, many more lines)
>> Aborting because over 1 in 8 p's had computation errors: 2560 of 2560.
>>
>> 3) APP3: wwwwcl v2.2.5, a GPU program to search for Wieferich and Wall-Sun-Sun primes
>>
>> It runs with beignet 1.0 without error messages, but it misses results (compared with the output of the CPU-only version over the same range). Also, at the end of the computation a "Checksum 0000000000000000" line is printed (this number is non-zero for the CPU version).
>>
>>
>> If you need additional testing or information, please let me know. (Remember that I can recompile the distributed.net client.)