[Bug 96897] clpeak OpenCL benchmark hangs during compilation on Clover RadeonSI
bugzilla-daemon at freedesktop.org
Fri May 18 02:45:20 UTC 2018
https://bugs.freedesktop.org/show_bug.cgi?id=96897
--- Comment #14 from Dieter Nützel <Dieter at nuetzel-hh.de> ---
(In reply to Jan Vesely from comment #13)
> Initial support for cl_khr_fp16 builtins has been added to libclc in r332677.
> It should be enough to run clpeak.
> clpeak still takes few mins to compile the kernels (~7mins on my carrizo
> laptop)
GREAT work, Jan!
After 3 min and ~12 sec, the float tests started crunching on my X3470 Xeon
(only one core is used for the kernel compile => 3.6 GHz turbo mode).
My desktop was frozen during the float 'Global memory bandwidth (GBPS)' test
and partly frozen during 'Double-precision compute (GFLOPS)'.
The whole benchmark finished after 6 min and 17 sec.
/home/dieter> time clpeak
Platform: Clover
Device: Radeon RX 580 Series (POLARIS10, DRM 3.23.0, 4.16.9-1.g4f45b1e-default, LLVM 7.0.0)
Driver version : 18.2.0-devel (Linux x64)
Compute units : 36
Clock frequency : 1411 MHz
Global memory bandwidth (GBPS)
float : 2.64
float2 : 2.64
float4 : 2.64
float8 : 2.54
float16 : 1.45
Single-precision compute (GFLOPS)
float : 6341.87
float2 : 6131.34
float4 : 6105.61
float8 : 5933.91
float16 : 5939.44
half-precision compute (GFLOPS)
half : 6307.47
half2 : 6193.25
half4 : 6114.34
half8 : 5729.57
half16 : 6047.90
Double-precision compute (GFLOPS)
double : 404.52
double2 : 404.41
double4 : 404.06
double8 : 403.08
double16 : 401.53
Integer compute (GIOPS)
int : 1222.75
int2 : 1213.90
int4 : 1210.72
int8 : 1208.57
int16 : 1213.99
Transfer bandwidth (GBPS)
enqueueWriteBuffer : 8.78
enqueueReadBuffer : 4.86
enqueueMapBuffer(for read) : 4871.79
memcpy from mapped ptr : 4.94
enqueueUnmap(after write) : 3528.56
memcpy to mapped ptr : 4.94
Kernel launch latency : 293.57 us
206.285u 3.765s 6:17.14 55.6% 0+0k 0+0io 0pf+0w
For reference, AMD 17.40:
/home/dieter> time clpeak
Platform: AMD Accelerated Parallel Processing
Device: Ellesmere
Driver version : 2482.3 (Linux x64)
Compute units : 36
Clock frequency : 1411 MHz
Global memory bandwidth (GBPS)
float : 202.59
float2 : 209.30
float4 : 209.63
float8 : 162.15
float16 : 138.41
Single-precision compute (GFLOPS)
float : 6342.71
float2 : 6374.96
float4 : 6178.29
float8 : 5973.53
float16 : 6018.79
half-precision compute (GFLOPS)
half : 6306.97
half2 : 6366.06
half4 : 6350.41
half8 : 6154.31
half16 : 6280.47
Double-precision compute (GFLOPS)
double : 404.64
double2 : 404.38
double4 : 398.54
double8 : 403.25
double16 : 401.53
Integer compute (GIOPS)
int : 1206.77
int2 : 1221.26
int4 : 1225.83
int8 : 1225.88
int16 : 1227.35
Transfer bandwidth (GBPS)
enqueueWriteBuffer : 9.03
enqueueReadBuffer : 5.08
enqueueMapBuffer(for read) : 149130.81
memcpy from mapped ptr : 5.09
enqueueUnmap(after write) : 75882.81
memcpy to mapped ptr : 5.08
Kernel launch latency : 93.33 us
23.056u 1.592s 1:08.29 36.0% 0+0k 0+0io 0pf+0w
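To put the two runs side by side, here is a minimal sketch that computes AMD/Clover ratios for a few of the figures pasted above. The numbers are copied from the two clpeak outputs; the dictionary keys and the `ratios` helper are names made up for this illustration and are not part of clpeak.

```python
# Figures copied from the two clpeak runs above (Clover vs. AMD 17.40).
# Key names are hypothetical labels chosen for this sketch.
clover = {
    "global_bw_float_gbps": 2.64,
    "sp_float_gflops": 6341.87,
    "dp_double_gflops": 404.52,
    "launch_latency_us": 293.57,
    "wall_seconds": 6 * 60 + 17.14,  # 6:17.14 from `time`
}
amd = {
    "global_bw_float_gbps": 202.59,
    "sp_float_gflops": 6342.71,
    "dp_double_gflops": 404.64,
    "launch_latency_us": 93.33,
    "wall_seconds": 1 * 60 + 8.29,   # 1:08.29 from `time`
}


def ratios(a, b):
    """Return a[k] / b[k] for every metric present in both dicts."""
    return {k: a[k] / b[k] for k in a if k in b}


amd_over_clover = ratios(amd, clover)
for name, r in amd_over_clover.items():
    print(f"{name:24s} AMD/Clover = {r:8.2f}")
```

On these numbers the compute throughput is essentially identical on both stacks. The gaps are in global memory bandwidth (about 77 times lower on Clover), kernel launch latency (about 3.1 times higher on Clover), and total wall time (about 5.5 times longer on Clover, which includes the kernel compile).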