[Beignet] Still cannot use beignet with real OpenCL applications (Haswell CPU)

Roman Trunov stream at proxyma.ru
Fri Nov 28 10:07:24 PST 2014


I've successfully compiled beignet 1.0 on Ubuntu 14.04.1 but still cannot use it for the real science and math applications I'm interested in. I see some improvements compared to 0.9, but my applications still do not function correctly.

LLVM: 3.5 (prebuilt package from llvm.org)
gcc version 4.8.2 (Ubuntu 4.8.2-19ubuntu1)
Kernel: 3.13.0-40-generic
CPU: Intel(R) Core(TM) i7-4770K CPU @ 3.50GHz

From clinfo:

  Platform ID:                                   0x7ff0461d1d20
  Name:                                          Intel(R) HD Graphics Haswell GT2 Desktop
  Vendor:                                        Intel
  Device OpenCL C version:                       OpenCL C 1.2 beignet 1.0 (git-9068a26)
  Driver version:                                1.0
  Profile:                                       FULL_PROFILE
  Version:                                       OpenCL 1.2 beignet 1.0 (git-9068a26)

I didn't enable X and Mesa support during compilation, so I'm using the "drm.rnodes=1" kernel option.

Since this is a Haswell CPU, I also rebuilt the i915 kernel module according to the instructions and patch on the site. After this, I got a 100% pass rate on the self-tests:

summary:
----------
  total: 700
  run: 700
  pass: 700
  fail: 0
  pass rate: 1.000000

So I consider the build correct.

I've tried three math/science applications and none of them works correctly. They run, but produce incorrect results.

1) APP1: distributed.net client for OpenCL

This is the program I know best. There is an "official" prebuilt binary at http://www.distributed.net/Download_prerelease and the public part of the source code is available at https://github.com/dcti/dnetc-client-base . I can also recompile it from source myself if additional testing is required.

This program has several computation cores. The "reference" core number 0, which is just a few lines long (rc5-ref.cl), terminates abnormally:

./dnetc -test rc5-72 0

[Nov 26 18:26:19 UTC] RC5-72: using core #0 (CL ANSI 1-pipe).
[Nov 26 18:26:20 UTC] Abnormal core termination! Device: 0

The other, "optimized" cores run, but their computations are incorrect: 15 of the 32 self-tests fail (the numbers shown are the returned result vs. the expected result):

./dnetc -test rc5-72 1

[Nov 26 18:27:16 UTC] RC5-72: using core #1 (CL 1-pipe).
[Nov 26 18:27:16 UTC] RC5-72: Test 01 FAILED2: 00:00000000:00000000-C9:0C0353C0:D4E1FE85
[Nov 26 18:27:16 UTC] RC5-72: Test 03 FAILED2: 00:00000000:00000000-0F:556979E7:6C009260
[Nov 26 18:27:16 UTC] RC5-72: Test 04 FAILED2: 00:00000000:00000000-9E:D8B648C6:00003A3C
[Nov 26 18:27:16 UTC] RC5-72: Test 10 FAILED2: 00:00000000:00000000-2B:E01C5B9D:D65CCAD7
[Nov 26 18:27:16 UTC] RC5-72: Test 14 FAILED2: 00:00000000:00000000-C6:46E7E19D:9CD65C85
[Nov 26 18:27:16 UTC] RC5-72: Test 16 FAILED2: 00:00000000:00000000-85:EA3678CF:91DB0D2C
[Nov 26 18:27:16 UTC] RC5-72: Test 19 FAILED2: 00:00000000:00000000-11:4134BDB0:175A077F
[Nov 26 18:27:16 UTC] RC5-72: Test 20 FAILED2: 00:00000000:00000000-94:888FF8CB:282E6E5F
[Nov 26 18:27:16 UTC] RC5-72: Test 21 FAILED2: 00:00000000:00000000-D9:48A2E6E4:CD610000
[Nov 26 18:27:16 UTC] RC5-72: Test 22 FAILED2: 00:00000000:00000000-E5:71448E83:D0860001
[Nov 26 18:27:16 UTC] RC5-72: Test 23 FAILED2: 00:00000000:00000000-3E:ED6D9F85:A6D70002
[Nov 26 18:27:16 UTC] RC5-72: Test 26 FAILED1: 56:30E19DF4:8C460000-56:30E19DF4:8C460101
[Nov 26 18:27:16 UTC] RC5-72: Test 27 FAILED1: 85:3B37FFD3:9F140000-85:3B37FFD3:9F14B33B
[Nov 26 18:27:16 UTC] RC5-72: Test 28 FAILED1: 80:B75263C5:41660000-80:B75263C5:41668D03
[Nov 26 18:27:16 UTC] RC5-72: Test 30 FAILED1: 87:23A58F8F:D5940000-87:23A58F8F:D59495C1
[Nov 26 18:27:16 UTC] RC5-72: 17/32 Tests Passed (0.320960 seconds)
[Nov 26 18:27:16 UTC] RC5-72: WARNING WARNING WARNING: 15 Tests FAILED!!!
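
To make the failure pattern in the log above easier to see, here is a minimal Python sketch that parses the failed-test lines and groups them by how the returned key differs from the expected one. It assumes the line format is exactly as quoted above (returned key, a dash, then expected key); the names `parse_failures`, `classify` and `summarize` are illustrative only and are not part of dnetc.

```python
import re
from collections import Counter

# One key as printed in the log: hi:mid:lo in upper-case hex.
KEY = r"[0-9A-F]{2}:[0-9A-F]{8}:[0-9A-F]{8}"
# Assumption: the first key is the returned result, the second the expected one.
LINE_RE = re.compile(r"Test (\d+) (FAILED\d+): (" + KEY + r")-(" + KEY + r")")

# Two lines copied from the log above, used as sample input.
SAMPLE = """\
[Nov 26 18:27:16 UTC] RC5-72: Test 01 FAILED2: 00:00000000:00000000-C9:0C0353C0:D4E1FE85
[Nov 26 18:27:16 UTC] RC5-72: Test 26 FAILED1: 56:30E19DF4:8C460000-56:30E19DF4:8C460101
"""


def parse_failures(log_text):
    """Return (test_number, kind, returned, expected) for each failed-test line."""
    failures = []
    for line in log_text.splitlines():
        match = LINE_RE.search(line)
        if match:
            number, kind, returned, expected = match.groups()
            failures.append((int(number), kind, returned, expected))
    return failures


def classify(returned, expected):
    """Describe how the returned key differs from the expected key."""
    if returned == "00:00000000:00000000":
        return "all-zero"
    # Compare everything except the last four hex digits (low 16 bits of 'lo').
    if returned[:-4] == expected[:-4]:
        return "low-16-bits-differ"
    return "other"


def summarize(log_text):
    """Count failures per difference class."""
    return Counter(classify(r, e) for _, _, r, e in parse_failures(log_text))


if __name__ == "__main__":
    for label, count in summarize(SAMPLE).items():
        print(label, count)
```

Applied to the full log above, this grouping should separate the FAILED2 lines, where an all-zero key was returned, from the FAILED1 lines, where the returned key matches the expected one except in its last four hex digits.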

./dnetc -stress rc5-72 1

[Nov 26 18:26:25 UTC] RC5-72: Stress-test 3: Pipe #1 missed a full match
[Nov 26 18:26:25 UTC] RC5-72: Stress-test 3: Pipe #1 fails to set 'check.count'
[Nov 26 18:26:25 UTC] Got 0x00000000, expected 0x00000001
[Nov 26 18:26:25 UTC] RC5-72: Stress-test 3: Pipe #1 fails to set 'check.hi/mid/lo'
[Nov 26 18:26:25 UTC] check:  00:00000000:00000000, expected CA:DB0EF3FF:FFFFFFC0
[Nov 26 18:26:25 UTC] RC5-72: Stress-test 3: Pipe #1 - Iterations count not updated
[Nov 26 18:26:25 UTC] Got 0x000000C0, expected 0x00000000
[Nov 26 18:26:25 UTC] RC5-72: Stress-test 3 FAILED



The following APP2 and APP3 are from PrimeGrid (prime number search) projects. I don't know whether their source is available; at the least I could send you a binary, and the CL core source can easily be extracted from the executable with a hex editor.

2) tpsieve - a sieving program

./tpsieve-cl-boinc-x86_64-linux -p13120716e9 -P13120725e9 -k5 -K9999 -n6000000 -N9000000

It writes the following to stderr.txt:

Sieve started: 13120716000000000 <= p < 13120725000000000
Thread 0 starting
Detected 320 multiprocessors (1600 SPUs) on device 0.
Device 0 is a 'Intel' 'Intel(R) HD Graphics Haswell GT2 Desktop'.
Computation Error: Checksum mismatch for p=13120716000000031 between 10781334 and 14482143 at n=9000063.
Computation Error: Checksum mismatch for p=13120716000000139 between 864764 and 6489009 at n=9000063.
(and so on, many more lines)
Aborting because over 1 in 8 p's had computation errors: 2560 of 2560.

3) wwwwcl v2.2.5, a GPU program to search for Wieferich and Wall-Sun-Sun primes

It runs with beignet 1.0 without error messages, but it misses results (compared with the output of the CPU-only version over the same range). Also, at the end of the computation the line "Checksum 0000000000000000" is printed (this number is non-zero for the CPU version).
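
The comparison described above can be sketched in Python as follows. This is only an assumption about how one might do it: it treats every non-empty output line as one result, and the file names on the command line are hypothetical; the actual wwwwcl output format is not described here.

```python
import sys


def load_results(lines):
    """Normalise an iterable of output lines into a set of non-empty, stripped lines."""
    return {line.strip() for line in lines if line.strip()}


def missing_from_gpu(cpu_lines, gpu_lines):
    """Return results found by the CPU-only run but absent from the GPU run, sorted."""
    return sorted(load_results(cpu_lines) - load_results(gpu_lines))


if __name__ == "__main__" and len(sys.argv) == 3:
    # Hypothetical usage: python compare.py cpu_results.txt gpu_results.txt
    with open(sys.argv[1]) as cpu_file, open(sys.argv[2]) as gpu_file:
        for result in missing_from_gpu(cpu_file, gpu_file):
            print("missing on GPU:", result)
```

A set difference is used because only the presence or absence of a result matters here, not the order in which the two programs report it.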


If you need additional testing or information, please let me know. (Remember that I can recompile the distributed.net client.)

