[Beignet] [ANNOUNCE] Beignet 0.9.0 (2014-6-26)
zhigang.gong at linux.intel.com
Wed Jun 25 23:07:44 PDT 2014
Beignet 0.9 (2014-06-26)
Beignet version 0.9 has been released. This is another major release of
Beignet. It implements the OpenCL 1.2 interfaces and supports more
platforms. It also brings a performance improvement of about 10x to 20x
over version 0.8 on some workloads.
The highlights of this release are:
* Added support for 4th Generation Intel Core Processors.
* Added support for Intel "Bay Trail" platforms with Intel HD Graphics.
* Significant performance improvement compared to 0.8. For the Luxmark
benchmark and some OpenCV performance test cases, we measured a 10x-20x
performance improvement.
* Compilation is about 30% faster than in 0.8.
* Support OpenCL spec 1.2. Support printf on the GPU kernel side, which is
very helpful for kernel debugging. Support both clCompileProgram and
clLinkProgram, which allow an application to compile and link OpenCL
binaries at runtime; this is faster than rebuilding everything.
* Support separating the runtime library from the compiler backend. For
mobile systems that don't need to compile kernels dynamically, Beignet can
be stripped down to less than 2MB, which is very suitable for small-footprint
systems.
* Updated the documents, including how to optimize kernels and how to
cross-compile.
* 3rd Generation Intel Core Processors
* Intel “Bay Trail” platforms with Intel HD Graphics
* 4th Generation Intel Core Processors
Please note:
Support for 4th Generation Intel Core Processors requires some Linux kernel
modifications. You need to apply the patch at:
* Implemented all mandatory APIs required by OpenCL spec 1.2,
including clEnqueueFillBuffer, clEnqueueMarkerWithWaitList,
clCreateProgramWithBuiltInKernels, 1D image, 1D image array, and
2D image array support, clCreateSubDevices, clCompileProgram,
clLinkProgram, clGetKernelArgInfo, clEnqueueMigrateMemObjects,
clUnloadPlatformCompiler, clEnqueueFillImage, etc.
* Implement strict and non-strict conformance modes, and support dynamic
switching between the two modes.
* Use IF/ENDIF to encode each basic block, so that more structured
instructions can be introduced later for further optimization.
* Optimize the processing of the bool type. This avoids many corner cases in
bool value handling and saves many instructions when encoding CMP/SEL
instructions after the IF/ENDIF change.
* Added two extension instructions, __simd_any() and __simd_all().
* Support detaching the runtime library from the compiler backend. This can
reduce the whole library to less than 2MB if the backend library is not
required, which is very suitable for embedded/mobile systems.
* Implement compact instructions.
* Use dword loads as much as possible, using logical shifts to support shorter data loads.
* Use vector load/store as much as possible by gathering contiguous loads/stores at the LLVM IR layer.
* Optimize the processing of the long type by changing the register layout.
* Fixed the L3 cache configuration bug.
* Optimize PHI MOV to eliminate unnecessary phi copies as much as possible.
* Use the sample LD message to work around the int/uint type surface sampling restriction.
* Support the printf builtin functions, which are very helpful for kernel debugging.
* Implement uniform value analysis.
* Implement post register allocation scheduling.
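
To illustrate the dword-load item in the list above, here is a minimal
sketch in plain C. It is not Beignet's actual code generation; it only shows
the general idea that a byte or short read can be served by one aligned
32-bit load followed by a logical right shift and a mask. The helper names
load_u8_via_dword and load_u16_via_dword are hypothetical, and the sketch
assumes byte 0 of each dword sits in its low 8 bits (little-endian layout).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not a Beignet function): read the byte at byte
 * offset `off` using one aligned 32-bit load, a logical right shift and
 * a mask. Assumes byte 0 of each dword lives in its low 8 bits. */
static uint8_t load_u8_via_dword(const uint32_t *base, size_t off)
{
    uint32_t dword = base[off / 4];            /* one dword load */
    unsigned shift = (unsigned)(off % 4) * 8u; /* bit offset of the byte */
    return (uint8_t)((dword >> shift) & 0xFFu);
}

/* Same idea for a 16-bit value. `off` must be 2-byte aligned so the
 * value never straddles two dwords. */
static uint16_t load_u16_via_dword(const uint32_t *base, size_t off)
{
    assert(off % 2 == 0);
    uint32_t dword = base[off / 4];            /* one dword load */
    unsigned shift = (unsigned)(off % 4) * 8u; /* 0 or 16 */
    return (uint16_t)((dword >> shift) & 0xFFFFu);
}
```

For example, with buf[0] = 0x43322110, load_u8_via_dword(buf, 1) yields 0x21
and load_u16_via_dword(buf, 2) yields 0x4332. Neighbouring short reads that
fall in the same dword can then share a single memory access.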
Git tag: Release_v0.9
Gitweb url: http://cgit.freedesktop.org/beignet