[Mesa-dev] OpenCL for radeon Hawaii?

Thu Aug 4 22:00:13 UTC 2016

On Thu, Aug 4, 2016 at 2:54 PM, Sam Halliday <sam.halliday at gmail.com> wrote:

> Thanks Jan,
>
> This is what it comes up with
> https://gist.github.com/fommil/c97d4c8fb2790e28ecaf8d334ebf1746
>
> Are there any demo apps I could expect to run with this? What is
> involved in writing missing functionality?
>

I'll mostly leave the demo apps question to others. I know tstellard has
worked on OpenCV compatibility a bit. I have previously tried GEGL and
attempted luxmark/juliaGPU, and some of the FinanceBench tests that
Phoronix uses are runnable. I also ran BFGMiner for bitcoin mining at one
point.

Things you can try to do to improve the current situation:

1) If an application fails to compile CL kernels, it'd be good to get build
logs and kernels so that the cause can be investigated. Often the cause is
that clover/libclc lack an implementation of that functionality.

You can dump the list of kernels and the LLVM IR when a program runs by
doing the following:
CLOVER_DEBUG_FILE=clover_dump CLOVER_DEBUG=clc,llvm,asm PATH_TO_YOUR_TEST_PROGRAM

That'll generate a set of files called clover_dump.cl, clover_dump.ll,
clover_dump.asm with:
a) The CL source that the program tried to compile
b) The LLVM IR for the CL source.
c) The generated machine code for the LLVM IR on your card.
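
The dump step above can be wrapped in a small shell function. This is only a sketch: CLOVER_DEBUG_FILE and CLOVER_DEBUG are the variables from the command above, while the function name run_with_clover_dump and the "wrote"/"missing" report lines are made up for illustration.

```shell
# Run a program with Clover's debug dumps enabled, then report which of
# the three dump files (.cl, .ll, .asm) were actually written.
# Usage: run_with_clover_dump PREFIX PROGRAM [ARGS...]
# (run_with_clover_dump is a hypothetical helper, not part of Mesa.)
run_with_clover_dump() {
    if [ "$#" -lt 2 ]; then
        echo "usage: run_with_clover_dump PREFIX PROGRAM [ARGS...]" >&2
        return 2
    fi
    prefix=$1
    shift

    # The two environment variables are set for this one command only.
    status=0
    CLOVER_DEBUG_FILE="$prefix" CLOVER_DEBUG=clc,llvm,asm "$@" || status=$?

    # A non-empty file counts as written.
    for ext in cl ll asm; do
        if [ -s "$prefix.$ext" ]; then
            echo "wrote $prefix.$ext"
        else
            echo "missing $prefix.$ext"
        fi
    done
    return "$status"
}
```

Calling it as `run_with_clover_dump clover_dump PATH_TO_YOUR_TEST_PROGRAM` should behave like the one-liner above, with a short report afterwards on which files are worth attaching to a bug report.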

If the CL source uses built-in functions that libclc (libclc.llvm.org) does
not implement yet, the project will gladly accept patches that add them.

If you just want to do a test-compile of the CL source, I use the following
command (with libclc/llvm/mesa all in /usr/local/), where $1 is the CL
source file:
clang -S -emit-llvm -o $1.ll -include /usr/local/include/clc/clc.h \
  -I/usr/local/include/ -Dcl_clang_storage_class_specifiers \
  -target amdgcn-- -mcpu=pitcairn -c $1
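
For repeated use, the test-compile command above could be wrapped in a function along these lines. It is a sketch under assumptions: the flags are exactly the ones from the command above, but the function name cl_test_compile and the CL_PREFIX, CL_MCPU and CLANG variables are hypothetical knobs added here. The default of pitcairn is simply the value from the mail; other cards would need their own LLVM processor name.

```shell
# Test-compile one OpenCL C source file to LLVM IR (written to FILE.ll).
# Usage: cl_test_compile FILE.cl
# Hypothetical overrides (not part of clang, libclc or Mesa):
#   CL_PREFIX  install prefix of libclc/llvm/mesa (default /usr/local)
#   CL_MCPU    value passed to -mcpu               (default pitcairn)
#   CLANG      compiler to invoke                  (default clang)
cl_test_compile() {
    if [ "$#" -ne 1 ]; then
        echo "usage: cl_test_compile FILE.cl" >&2
        return 2
    fi
    src=$1
    prefix=${CL_PREFIX:-/usr/local}

    "${CLANG:-clang}" -S -emit-llvm -o "$src.ll" \
        -include "$prefix/include/clc/clc.h" \
        -I"$prefix/include/" \
        -Dcl_clang_storage_class_specifiers \
        -target amdgcn-- -mcpu="${CL_MCPU:-pitcairn}" \
        -c "$src"
}
```

Running `cl_test_compile clover_dump.cl` on a dumped source file reproduces the compile outside the application, which makes the compiler errors easier to read and to paste into a report.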

2) If the application crashes outright (such as CLBlast), it could be
because we are missing CL API functions that should be implemented, or
because the program is abusing the spec.

My general thought is that it's most rewarding to pick an application that
you want to see working, and then keep fixing things until it runs.

For me, that has mostly meant implementing libclc built-ins.  I've been
working on getting MultiCoreWare's cppamp driver working on Mesa/Clover
(https://bitbucket.org/multicoreware/hcc/wiki/Home).  We're only 4-ish
built-ins (and a static keyword from CL 1.2's storage class specifiers)
away from that working, which would be nice.

Beyond cppamp, I'd be interested in someone tracking down more precision
tests for the floating point and geometric functions that we haven't
implemented yet and turning those into piglit tests (or at least
identifying projects that have those tests and how to build/run them).

--Aaron

>
> Number of platforms 1
> Platform Name Clover
> Platform Vendor Mesa
> Platform Version OpenCL 1.1 Mesa 12.0.1
> Platform Profile FULL_PROFILE
> Platform Extensions cl_khr_icd
> Platform Extensions function suffix MESA
> Platform Name Clover
> Number of devices 1
> Device Name AMD HAWAII (DRM 2.43.0 / 4.6.4-1-ARCH, LLVM 3.8.0)
> Device Vendor AMD
> Device Vendor ID 0x1002
> Device Version OpenCL 1.1 Mesa 12.0.1
> Driver Version 12.0.1
> Device OpenCL C Version OpenCL C 1.1
> Device Type GPU
> Device Profile FULL_PROFILE
> Max compute units 44
> Max clock frequency 1090MHz
> Max work item dimensions 3
> Max work item sizes 256x256x256
> Max work group size 256
> Preferred work group size multiple 64
> Preferred / native vector sizes
> char 16 / 16
> short 8 / 8
> int 4 / 4
> long 2 / 2
> half 0 / 0 (n/a)
> float 4 / 4
> double 2 / 2 (cl_khr_fp64)
> Half-precision Floating-point support (n/a)
> Single-precision Floating-point support (core)
> Denormals No
> Infinity and NANs Yes
> Round to nearest Yes
> Round to zero No
> Round to infinity No
> IEEE754-2008 fused multiply-add No
> Support is emulated in software No
> Correctly-rounded divide and sqrt operations No
> Double-precision Floating-point support (cl_khr_fp64)
> Denormals Yes
> Infinity and NANs Yes
> Round to nearest Yes
> Round to zero Yes
> Round to infinity Yes
> IEEE754-2008 fused multiply-add Yes
> Support is emulated in software No
> Correctly-rounded divide and sqrt operations No
> Address bits 32, Little-Endian
> Global memory size 1073741824 (1024MiB)
> Error Correction support No
> Max memory allocation 268435456 (256MiB)
> Unified memory for Host and Device Yes
> Minimum alignment for any data type 128 bytes
> Alignment of base address 1024 bits (128 bytes)
> Global Memory cache type None
> Image support No
> Local memory type Local
> Local memory size 32768 (32KiB)
> Max constant buffer size 268435456 (256MiB)
> Max number of constant args 16
> Max size of kernel argument 1024
> Queue properties
> Out-of-order execution No
> Profiling Yes
> Profiling timer resolution 0ns
> Execution capabilities
> Run OpenCL kernels Yes
> Run native kernels No
> Device Available Yes
> Compiler Available Yes
> Device Extensions cl_khr_global_int32_base_atomics
> cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics
> cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store
> cl_khr_fp64
> NULL platform behavior
> clGetPlatformInfo(NULL, CL_PLATFORM_NAME, ...) Clover
> clGetDeviceIDs(NULL, CL_DEVICE_TYPE_ALL, ...) Success [MESA]
> clCreateContext(NULL, ...) [default] Success [MESA]
> clCreateContextFromType(NULL, CL_DEVICE_TYPE_CPU) No devices found in platform
> clCreateContextFromType(NULL, CL_DEVICE_TYPE_GPU) Success (1)
> Platform Name Clover
> Device Name AMD HAWAII (DRM 2.43.0 / 4.6.4-1-ARCH, LLVM 3.8.0)
> clCreateContextFromType(NULL, CL_DEVICE_TYPE_ACCELERATOR) No devices found in platform
> clCreateContextFromType(NULL, CL_DEVICE_TYPE_CUSTOM) No devices found in platform
> clCreateContextFromType(NULL, CL_DEVICE_TYPE_ALL) Success (1)
> Platform Name Clover
> Device Name AMD HAWAII (DRM 2.43.0 / 4.6.4-1-ARCH, LLVM 3.8.0)
> ICD loader properties
> ICD loader Name OpenCL ICD Loader
> ICD loader Vendor OCL Icd free software
> ICD loader Version 2.2.9
> ICD loader Profile OpenCL 2.1
>
> On 3 August 2016 at 21:33, Jan Vesely <jan.vesely at rutgers.edu> wrote:
> > Hi,
> >
> > you can use the clinfo utility to check whether OpenCL has been set up
> > correctly. It won't be able to run luxmark, but simpler stuff should
> > work.
> >
> > Jan
> >
> > On Wed, 2016-08-03 at 20:45 +0100, Sam Halliday wrote:
> >> Hello everybody,
> >>
> >> I recently got a Radeon R9 290x (aka Hawaii). I run ArchLinux, which has
> >> mesa 12.0.1.
> >>
> >> I am keen to try out mesa's partial OpenCL implementation. Ideally to
> >> use a BLAS implementation such as https://github.com/CNugteren/CLBlast
> >> but I'd also be happy to just write something basic like dense
> >> matrix/matrix multiplication in OpenCL.
> >>
> >>
> >> However, when I tried to run the "luxmark" OpenCL benchmark (the only
> >> way I could think to test if my card is supported), I got this error
> >>
> >> [PathOCLBaseRenderThread::0] Compiling kernels
> >> [PathOCLBaseRenderThread::0] PathOCL kernel compilation error ERROR
> >> clBuildProgram[CL_INVALID_BUILD_OPTIONS]:
> >> RUNTIME ERROR: PathOCLBase kernel compilation error
> >>
> >> which, I'm guessing, means that my GPU isn't supported for OpenCL by
> >> mesa yet. Is that correct? (I don't know how to get any more output or
> >> logs than this).
> >>
> >>
> >> Could somebody please help by letting me know if there is a ticket I
> >> could subscribe to track progress of support for my card (a simple
> >> search of the bug database didn't bring up anything obvious). This
> >> message is to the -dev list, so I suppose I am saying that I am
> >> prepared to get my hands dirty... but I am primarily a Scala developer
> >> and haven't done any C in years so the extent of my help is limited.
> >>
> >> If somebody who knows what they are doing would be willing to implement
> >> some of the functionality needed, I'd be prepared to buy this GPU for
> >> them to use for their hacking - it's the least I could do (but it is an
> >> absolute monster, I didn't even know GPUs could be this big! I needed to
> >> get a bigger case for it).
> >>
> >>
> >> Somewhat tangentially, if OpenCL support is really not a possibility
> >> anytime soon, could somebody please point me in the direction of a way
> >> to use this card programmatically for something like matrix/matrix
> >> multiplication? (I'm prepared to go really low level if there is
> >> sufficient documentation).
> >>
> >>
> >> I'm not at all interested in using proprietary drivers for OpenCL.
> >>
> >>
> > --
> > Jan Vesely <jan.vesely at rutgers.edu>
> _______________________________________________
> mesa-dev mailing list
> mesa-dev at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>