[Beignet] [ANNOUNCE] Beignet 1.0.1 (2015-01-19)
Zhigang Gong
zhigang.gong at linux.intel.com
Mon Jan 19 00:40:48 PST 2015
Beignet 1.0.1 (2015-01-19)
==========================
Beignet 1.0.1 has been released. It is a bug-fix release that also includes some minor new
features and performance improvements. The highlights are as follows:
1. Enable userptr support by default, making it possible to achieve zero-copy
when creating a buffer object.
2. Switch to TILING_Y mode on the BDW platform, which leads to a major performance
improvement in some cases.
3. Fix a liveness bug, the last known bug of this type in the Gen backend.
4. Fix accuracy bugs in some builtin math functions.
5. Improve the strict mode sin/cos implementation, reducing the instruction count
from 1700 to 400.
6. Use Clang native sampler and image types; image and sampler support now fully complies
with the OpenCL spec.
7. Fix bugs triggered by some popular applications such as darktable. darktable now works
fine on all supported platforms.
8. Add support for older systems that lack C++11 features.
Git tag: Release_v1.0.1
Gitweb URL: http://cgit.freedesktop.org/beignet
Download: https://01.org/sites/default/files/beignet-1.0.1-src.tar.gz
Official release notes: https://01.org/beignet/downloads/beignet-1.0.1-2015-01-19
md5sum: cbb27ed5f436c2bfa87d869857829181 Beignet-1.0.1-Src.tar.gz
sha1sum: b9c9da9cee164a1c0dcd64e07495986752d134ae Beignet-1.0.1-Src.tar.gz
sha256sum: 4ca5e093d2fd3f0c2615929b293b0f65e237085d74486c0b85e4d1ecaf79793b Beignet-1.0.1-Src.tar.gz
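The checksums above can be used to verify the downloaded tarball before building. The following is a minimal sketch in Python, not part of the release itself: the helper names digest_of_file and verify are hypothetical, and the only value taken from this announcement is the published SHA-256 digest. Pass the path of the file you actually downloaded; note that the file name in the download URL and the one in the checksum lines differ in capitalization.

```python
import hashlib

# SHA-256 digest published in this announcement for the 1.0.1 source tarball.
EXPECTED_SHA256 = "4ca5e093d2fd3f0c2615929b293b0f65e237085d74486c0b85e4d1ecaf79793b"


def digest_of_file(path, algorithm="sha256", chunk_size=1 << 20):
    """Return the hex digest of the file at `path` (hypothetical helper).

    The file is read in chunks so that large tarballs do not have to be
    loaded into memory at once.
    """
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def verify(path, expected=EXPECTED_SHA256, algorithm="sha256"):
    """Return True if the file's digest matches `expected` (case-insensitive)."""
    return digest_of_file(path, algorithm) == expected.strip().lower()
```

For example, verify("beignet-1.0.1-src.tar.gz") should return True for an intact download. Passing algorithm="md5" or algorithm="sha1" together with the corresponding published value checks the other two digests in the same way.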
-----------------------------------------------------------------
Changes since 1.0.0:
Chuanbo Weng (1):
Change CL_DEVICE_PREFERRED_VECTOR_WIDTH_CHAR from 8 to 16.
Guo Yejun (20):
re-enable userptr with fix: CPU access after GPU finishes the rendering
fix issue to create cl image from libva with non-zero offset
add test for clCreateImageFromLibvaIntel
fix issue to pass utest of runtime_climage_from_boname for BDW
clean code, the logic is already at the beginning of function
add test of cl_mem_use_host_ptr into benchmark
refine utest of cl_mem_use_host_ptr
enable CL_MEM_ALLOC_HOST_PTR with user_ptr to avoid copy between GPU/CPU
replace hash_map with map
do not include llvm/clang headers for libgbeinterp
change Immediate::operator= from private to public
do not use C++11 features inside libgbeinterp
fix utest build for some old gcc version
refine gbe_bin_generater usage to add -t option
remove useless dependency libocl
add option BUILD_STANDALONE_GBE_COMPILER to build static compiler
add CMake option USE_STANDALONE_GBE_COMPILER and STANDALONE_GBE_COMPILER_DIR
add utest of CL_MEM_ALLOC_HOST_PTR
only build tests that do not need compiler when standalone compiler is provided
add howto for old gcc version
Junyan He (1):
Fix the printf buffer size bug.
Luo Xionghu (12):
fix dnetc overflow issue.
fix bswap implementation issue.
refine bswap utest to cover nsetc fail cases.
refine overflow utest to cover nsetc fail cases.
disable overflow utest test before llvm-3.5
add half math function support.
fix max_parameter_size not correct on x86 platforms.
fix min_max_read_image_args and min_max_parameter_size issue.
add collectImageArgs to handle image count limitations.
reuse the loop info from llvm.
add the reduced self loop node detection.
fix the wrong implementation of popcount.
Lv Meng (1):
Fix a makefile bug for gcc is not the default compiler
Meng Mengmeng (2):
utests: make utests maths ULP values consistent with specification
add edge case detection for powr in utests
Ruiling Song (17):
GBE: Place loop exits after loop blocks when sorting basic blocks.
GBE: Re-implement BTI logic in backend
GBE: Fix the printf issue caused by new bti implementation
libocl: Fix precision of builtin tanpi.
libocl: Move spec required macro to header file.
libocl: Improve precision of pow/powr.
libocl: Improve precision of exp()
libocl: Flush denorm input into zero in rootn()
libocl: flush denorm into zero in ldexp()
libocl: Correctly handle -inf in exp10.
libocl: flush denorm to zero in remquo()
GBE: support const private array initialization.
libocl: implement high precision pown()
libocl: remove useless code.
libocl: Reimplement trigonometric functions.
utests: Add const private array initialization test.
GBE: Fix a disassembly bug.
Yan Wang (4):
Fix based on piglit OpenCL failed case (cl-api-compile-program).
Fix delete operator using.
Fix PrintfState copying.
Fix loop condition of PrintfSet constructor.
Yang Rong (6):
Fix NO_TILING alignment bug.
BDW: Change the default tiling mode to TILING_Y on BDW.
Fix the opencv_test_core/OCL_Arithm random segment fault.
Change the IVB/HSW's max_work_group_size to 512, and BYT to 256.
Change the IVB/HSW L3 SQC credit setting.
Add read buffer/image benchmark.
Yang, Rong (1):
Separate flush and invalidate in function intel_gpgpu_pipe_control.
Zhenyu Wang (4):
Remove deprecated fulsim code
Add aub dump support
Use libdrm interface to get device id
Remove obsolete MI_FLUSH
Zhigang Gong (34):
utests: fix work group size issue in compiler_fill_image_2d_array.
utests: fix a typo in test cases.
utests: reduce work group size to 256 to satisfy BYT platform.
utests: fix indent in cmakelists.txt
GBE: Fix bug with negative constant GEP index.
utests: Add one case to test negative index array access.
GBE: fix a regression caused by the negative index handling patch.
GBE: optimize GEP constant offset calculation.
GBE: remove useless code.
GBE: eliminate duplicate GEP handling logic.
GBE: Add constant pointer in the memcpy intrinsic.
CL: Don't find mesa source code.
GBE: Add some missing constant expression cases.
Update optimization tips.
GBE: don't always treat a multiple destination instruction as root.
Refactor all image builtin functions.
GBE: switch to use CLANG native image types.
GBE: switch to CLANG native sampler_t.
GBE: remove some image1d_buffer related builtin functions.
GBE/CL: use 2D image to implement large image1D_buffer.
GBE: code cleanup.
GBE: fix an image regression.
GBE: use sr0.1's SLM Offset to eliminate the software SLM offset for HSW.
GBE: remove software maintained SLM offset related code.
utests: reduce test count.
runtime: tweak max memory allocation size.
runtime: fix max work group size for IVBGT1.
Don't check some edge condition in non-strict mode.
CL/Driver: enable atomics in L3 for HSW.
CL/Driver: quick fix regression caused by remove MI_FLUSH.
utests: skip one test when it fail to open XDisplay.
CL/Driver/HSW: Convert L3 cycle for texture to uncachable.
GBE: disable spill register under simd16 mode.
Bump version to 1.0.1.
Zhu Bingbing (1):
change the utest summary code