<html>
<head>
<base href="https://bugs.freedesktop.org/">
</head>
<body><table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Bug ID</th>
<td><a class="bz_bug_link
bz_status_NEW "
title="NEW - ViennaCL fails dense_blas-bench-opencl benchmark with doubles on AMD CYPRESS (DRM 2.43.0, LLVM 3.8.0)"
href="https://bugs.freedesktop.org/show_bug.cgi?id=96881">96881</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>ViennaCL fails dense_blas-bench-opencl benchmark with doubles on AMD CYPRESS (DRM 2.43.0, LLVM 3.8.0)
</td>
</tr>
<tr>
<th>Product</th>
<td>Mesa
</td>
</tr>
<tr>
<th>Version</th>
<td>11.2
</td>
</tr>
<tr>
<th>Hardware</th>
<td>x86-64 (AMD64)
</td>
</tr>
<tr>
<th>OS</th>
<td>Linux (All)
</td>
</tr>
<tr>
<th>Status</th>
<td>NEW
</td>
</tr>
<tr>
<th>Severity</th>
<td>normal
</td>
</tr>
<tr>
<th>Priority</th>
<td>medium
</td>
</tr>
<tr>
<th>Component</th>
<td>Drivers/Gallium/r600
</td>
</tr>
<tr>
<th>Assignee</th>
<td>dri-devel@lists.freedesktop.org
</td>
</tr>
<tr>
<th>Reporter</th>
<td>ubizjak@gmail.com
</td>
</tr>
<tr>
<th>QA Contact</th>
<td>dri-devel@lists.freedesktop.org
</td>
</tr></table>
<p>
<div>
<pre>The dense_blas-bench-opencl benchmark from the ViennaCL suite fails with doubles on
AMD CYPRESS (DRM 2.43.0, LLVM 3.8.0):
$ ./dense_blas-bench-opencl
----------------------------------------------
Device Info
----------------------------------------------
Name: AMD CYPRESS (DRM 2.43.0, LLVM 3.8.0)
Vendor: AMD
Type: GPU
Available: 1
Max Compute Units: 10
Max Work Group Size: 256
Global Mem Size: 1073741824
Local Mem Size: 32768
Local Mem Type: 1
Host Unified Memory: 1
Benchmark : BLAS
----------------
sCOPY : 64.3 GB/s
sAXPY : 95.4 GB/s
sDOT : 85.3 GB/s
sGEMV-N : 20.8 GB/s
sGEMV-T : 44.3 GB/s
sGEMM-NN : 126 GFLOPs/s
sGEMM-NT : 87.6 GFLOPs/s
sGEMM-TN : 90.5 GFLOPs/s
sGEMM-TT : 72.3 GFLOPs/s
----
Build Status = -2 ( Err = -11 )
Log: unsupported call to function __subdf3 in av_cpu
Sources: #pragma OPENCL EXTENSION cl_khr_fp64 : enable
__kernel void av_cpu(
__global double * vec1,
uint4 size1,
...
The build log names __subdf3, the soft-float library routine for double
subtraction, so it looks like DFmode (double) instructions are not enabled
correctly in LLVM for targets that report the cl_khr_fp64 extension.
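
For reference, the two numbers in the "Build Status = -2 ( Err = -11 )" line
can be decoded against the OpenCL headers. A minimal sketch in C, assuming the
benchmark prints the cl_build_status value first and the clBuildProgram return
code second; the helper names are hypothetical and only a few codes are listed:

```c
/* Hypothetical helpers, not part of ViennaCL or Mesa. The numeric values
   are the ones defined in the Khronos CL/cl.h header. */

/* Name of a cl_build_status value (CL_PROGRAM_BUILD_STATUS query). */
static const char *build_status_name(int status)
{
    switch (status) {
    case 0:  return "CL_BUILD_SUCCESS";
    case -1: return "CL_BUILD_NONE";
    case -2: return "CL_BUILD_ERROR";
    case -3: return "CL_BUILD_IN_PROGRESS";
    default: return "unknown";
    }
}

/* Name of a clBuildProgram return code; only a few codes are covered. */
static const char *build_error_name(int err)
{
    switch (err) {
    case 0:   return "CL_SUCCESS";
    case -3:  return "CL_COMPILER_NOT_AVAILABLE";
    case -11: return "CL_BUILD_PROGRAM_FAILURE";
    default:  return "unknown";
    }
}
```

Under that assumption the failure reads as CL_BUILD_ERROR with
CL_BUILD_PROGRAM_FAILURE, which would mean the compiler was available but
rejected the kernel source.
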
clinfo reports:
Number of platforms 1
Platform Name Clover
Platform Vendor Mesa
Platform Version OpenCL 1.1 MESA 11.2.2
Platform Profile FULL_PROFILE
Platform Extensions cl_khr_icd
Platform Extensions function suffix MESA
Platform Name Clover
Number of devices 1
Device Name AMD CYPRESS (DRM 2.43.0, LLVM 3.8.0)
Device Vendor AMD
Device Vendor ID 0x1002
Device Version OpenCL 1.1 MESA 11.2.2
Driver Version 11.2.2
Device OpenCL C Version OpenCL C 1.1
Device Type GPU
Device Profile FULL_PROFILE
Max compute units 10
Max clock frequency 850MHz
Max work item dimensions 3
Max work item sizes 256x256x256
Max work group size 256
Preferred work group size multiple 64
Preferred / native vector sizes
char 16 / 16
short 8 / 8
int 4 / 4
long 2 / 2
half 0 / 0 (n/a)
float 4 / 4
double 2 / 2 (cl_khr_fp64)
Half-precision Floating-point support (n/a)
Single-precision Floating-point support (core)
Denormals No
Infinity and NANs Yes
Round to nearest Yes
Round to zero No
Round to infinity No
IEEE754-2008 fused multiply-add No
Support is emulated in software No
Correctly-rounded divide and sqrt operations No
Double-precision Floating-point support (cl_khr_fp64)
Denormals Yes
Infinity and NANs Yes
Round to nearest Yes
Round to zero Yes
Round to infinity Yes
IEEE754-2008 fused multiply-add Yes
Support is emulated in software No
Correctly-rounded divide and sqrt operations No
...</pre>
</div>
</p>
</body>
</html>