[Beignet] [ANNOUNCE] Beignet 1.0.0 (2014-11-14)
Igor Gnatenko
i.gnatenko.brain at gmail.com
Mon Nov 17 12:11:54 PST 2014
Hi,
cool release! I've updated beignet to 1.0.0 for Fedora 20, 21 and
Rawhide (22) today. Let's use it!
https://bugzilla.redhat.com/show_bug.cgi?id=1142892
On Fri, Nov 14, 2014 at 9:09 AM, Zhigang Gong
<zhigang.gong at linux.intel.com> wrote:
> Beignet 1.0.0 (2014-11-14)
> =========================
>
> The Beignet development team is proud to announce that Beignet 1.0.0
> has been released. This is an important milestone after about two
> years of development. Thanks to everyone who helped us bring it to
> a relatively mature state.
>
> Beignet now supports 3rd through 5th Generation Intel Core Processors.
> Besides Broadwell support, this release also brings major performance
> improvements for many workloads and fixes some bugs. We observed gains of
> 10% to more than 4x on some OpenCV 3.0 benchmarks.
>
> The highlights are as follows:
>
> 1. Added 5th generation Intel Core Processors (BDW) support.
> 2. Optimized constant buffer load.
> 3. Implemented a basic transformation from unstructured control flow to
> structured control flow to improve performance.
> 4. Fixed some memory leak bugs.
> 5. Implemented missing constant expression handling.
> 6. Added Clang/ICC compiler support for Beignet build.
> 7. Optimized unaligned char/short vector load.
> 8. Sped up kernel compile time by moving built-in function support
> from the header file into a linked library.
> 9. Implemented some missing llvm intrinsics.
> 10. Optimized the loop unrolling pass, which boosted some OpenCV benchmarks.
> 11. Several other bug fixes since the last release. For the OpenCV 3.0,
> OpenCV 2.4 and piglit test suites, Beignet's pass rates are all
> above 99%.
>
> Git tag: Release_v1.0.0
> Gitweb URL: http://cgit.freedesktop.org/beignet
> https://01.org/sites/default/files/beignet-1.0.0-source.tar.gz
>
> md5sum: bfd755904c332cdd285d6058f5f3de8c Beignet-1.0.0-Source.tar.gz
> sha1sum: a2b0eb53e5f9a6055cd656531532a4c6ae03fbb0 Beignet-1.0.0-Source.tar.gz
> sha256sum: e30c4d0f4c8917fa0df2467b2d70a4ee524f28d54c42c582262d5f08928ea543 Beignet-1.0.0-Source.tar.gz
>
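For anyone packaging the release, the three digests listed above can be checked before building. What follows is only a minimal sketch, not part of the announcement: the helper names `file_digest` and `verify_tarball` are made up for illustration, and the path passed in is wherever the tarball was saved locally (the download URL and the checksum lines spell the file name with different capitalization).

```python
import hashlib

# Digests copied from the announcement above.
EXPECTED = {
    "md5": "bfd755904c332cdd285d6058f5f3de8c",
    "sha1": "a2b0eb53e5f9a6055cd656531532a4c6ae03fbb0",
    "sha256": "e30c4d0f4c8917fa0df2467b2d70a4ee524f28d54c42c582262d5f08928ea543",
}


def file_digest(path, algorithm, chunk_size=1 << 20):
    """Return the hex digest of the file at `path` (hypothetical helper).

    The file is read in chunks so a large tarball is never held in memory
    all at once.
    """
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def verify_tarball(path, expected=EXPECTED):
    """Map each algorithm name to True if the file matches its digest."""
    return {
        algo: file_digest(path, algo) == digest.lower()
        for algo, digest in expected.items()
    }
```

Calling `verify_tarball("beignet-1.0.0-source.tar.gz")` on the downloaded file should return True for all three algorithms; any False entry means the download does not match what was announced.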
> -----------------------------------------------------------------
>
> Changes since 0.9.3:
>
> Andreas Beckmann (2):
> fix some typos
> use env to set environment variables for GBE_BIN_GENERATER
>
> Chuanbo Weng (1):
> utest: add new test that trigger an assignment operation bug in if.
>
> Guo Yejun (18):
> remove requirment as drm master in non-x environment
> remove requirment as drm master in non-x environment
> free build_log when the cl program is released
> free build_log when the cl program is released
> fix three memory leaks
> clean llvm resource in compiler (libgbe.so)
> fix three memory leaks
> clean llvm resource in compiler (libgbe.so)
> delete GEPInst when it is no longer used
> delete GEPInst when it is no longer used
> remove dependency for non-X runtime environment
> remove dependency for non-X runtime environment
> support CL_MEM_USE_HOST_PTR with userptr for cl buffer
> enable CL_DEVICE_HOST_UNIFIED_MEMORY when userptr is supported
> add test for cl buffer created with CL_MEM_USE_HOST_PTR
> fix issue to create cl image from libva with non-zero offset
> add test for clCreateImageFromLibvaIntel
> use posix_memalign instead of aligned_alloc to be more compatible
>
> Junyan He (54):
> Fix the global string bug for printf.
> Fix a bug for runtime_barrier_list.cpp, event array out of bound
> Fix a bug for runtime_barrier_list.cpp, event array out of bound
> Fix the global string bug for printf.
> Add common define header files to initialize the libocl
> Add the async module into the libocl
> Add the atomic module into the libocl
> Add the geometric module into the libocl
> Add the image module into the libocl
> Add the misc module into the libocl
> Add the sync module into the libocl
> Add printf module into libocl
> Add vload module into the libocl
> Add thw workitem module into the libocl
> Add the convert and as modules into the libocl
> Add the gen_vector script into the libocl
> Add the common module into the libocl as template
> Add the integer module into libocl as template
> Add the math function into libocl as template
> Add the relational module into libocl as template
> Add the ocl_defines header file into libocl
> Add memcpy, memset and barrier bitcode files into libocl
> Add the bit code linker into the module pass.
> Enable libocl and disable the usage of the old huge header.
> Use the PCH to accelerate the parsing speed of the ocl.h
> Delete all the unused files of old huge header.
> Add the missing function prototypes of any() and atom_add()
> Add uncompatible PCH Options to avoid compiling failure.
> Fix the global string bug for printf.
> Add copyright header for all libocl files.
> Fix the issue of -cl-std=CLX.X option.
> Fix the issue of -cl-std=CLX.X option.
> Add the switch logic for math conformance fast path
> Modify the CMakeList to use the internal PCH first.
> Fix the bug of LLVM_LFLAGS fail to set
> Add long support for printf
> BDW: Add gen8 surface state struct.
> BDW: refine the gen8_surface_state_t.
> BDW: Add function intel_gpgpu_setup_bti for gen8.
> BDW: Correct surface base address set in setup bti.
> BDW: Add function intel_gpgpu_bind_buf for gen8.
> Add sampler state and tile define for gen8.
> Modify the bind sampler logic for gen8
> BDW: Add gen8 into intel_driver_init
> Refine the shared function ID define.
> Add the libdrm version check.
> Let the failure of intel_drm lib's check as a FATAL_ERROR
> Fit the printf bug in loop
> Fix the bug of 1D array slice pitch
> Add the test case for image 1d array fill
> Add the test case for image 2d array fill
> Add the disasm support for Gen8
> Fix the compare_image_2d_and_1d_array test case bug
> Fix the bug of multi-thread crash
>
> Luo (5):
> remove lspci, gbe_bin_genenrater would generator llvm binary by default.
> remove lspci, gbe_bin_genenrater would generator llvm binary by default.
> fix piglit get kernel info FUNCTION ATTRIBUTE fail.
> fix piglit get kernel info FUNCTION ATTRIBUTE fail.
> add opencl-1.2 builtin function popcount.
>
> Luo Xionghu (28):
> fix the relational built-in vector function regression.
> fix opencv_test_imgproc subcase OCL_ImgProc/Accumulate.Mask regression.
> fix piglit cl-api-get-program-info fail.
> fix piglit cl-api-get-program-info fail.
> fix clGetKernelWorkGroupInfo built-in kernel fail.
> fix piglit cl-api-set-kernel-arg fail.
> fix clGetKernelWorkGroupInfo built-in kernel fail.
> fix piglit cl-api-set-kernel-arg fail.
> fix bin/cl-program-tester tests/cl/program/execute/attributes.cl regression.
> fix bin/cl-program-tester tests/cl/program/execute/attributes.cl regression.
> remove the LinkOnceAnyLinkage since the libocl is introduced.
> improve the build performance of vector type built-in function.
> fix one bug at cl_get_kernel_workgroup_info.
> fix utest memory leak.
> Add Gen IR WHILE.
> add handleSelfLoopNode to insert while instruction on Gen IR level.
> Use instruction WHILE to manipulate structure.
> add utest popcount for all types.
> use global flag 0.0 to control unstructured simple block.
> add llvm Intrinsic call support.
> add utest compiler_overflow for llvm intrinsic function.
> enable llvm intrinsic call usub_with_overflow funtion.
> add utest for llvm intrinsic call usub_with_overflow funtion.
> enable llvm intrinsic call bswap function.
> add utest function bswap.
> fix bswap kernel function type issue.
> fix piglit clCreateProgramWithBinary fail.
> fix a bug in clCompileProgram().
>
> LuoXionghu (5):
> add platform info in the gen binary code.
> add utest load_program_from_gen_bin.
> add platform info in the gen binary code.
> add utest load_program_from_gen_bin.
> improve the build performance of vector type built-in function.
>
> Lv Meng (6):
> improve the clEnqueueCopyBufferRect performance in some cases
> Fix compile error for ICC compiler
> Fix compile errors for CLANG compiler
> Fix compile warnings for ICC compiler
> Fix compile warnings for CLANG compiler
> Enable ICC and CLANG compiler for beignet
>
> Meng Mengmeng (3):
> add beignet GIT_HAL1 if there is .git directory
> create GIT_SHA1 without any dependency
> add building dependency GIT_SHA1
>
> Rebecca Palmer (7):
> Fail gracefully on unsupported hardware
> Fail gracefully on unsupported hardware
> GBE: fix bug in pow()/pown().
> GBE: fix bug in erf()/erfc().
> GBE: fix bug in tgamma().
> utests: fix bugs in builtin_pow().
> utests: fix bugs in builtin_tgamma().
>
> Ruiling Song (43):
> GBE: Fix builtin tanpi.
> GBE: Fix builtin tanpi.
> GBE: Use varying register to save one instruction
> GBE: Optimize constant load with sampler.
> GBE: align the fields in union ImageInfoKey.
> utests: Fix a bug in image_1D_buffer.
> GBE: align the fields in union ImageInfoKey.
> utests: Fix a bug in image_1D_buffer.
> runtime: set correct state for constant buffer on hsw.
> runtime: set correct state for constant buffer on hsw.
> GBE: Refine bti usage in backend & runtime.
> GBE: Handle bti allocation for internal buffer used by printf.
> GBE: remove some useless code for getting printf buffer address.
> GBE: Fix a warning in getConstantPointerRegister.
> GBE: Fix type size for vector3
> GBE: initialize BTI structure to zero.
> GBE: Fix a bug in gatherBTI.
> cmake: Fix a license issue.
> GBE: clear deadprintfs when current function is done.
> GBE: refine the llvm multi-thread related code.
> GBE: Fix type size for vector3
> cmake: Fix a license issue.
> GBE: clear deadprintfs when current function is done.
> GBE: refine the llvm multi-thread related code.
> GBE: Optimize constant load with sampler.
> GBE: Refine bti usage in backend & runtime.
> GBE: Handle bti allocation for internal buffer used by printf.
> GBE: initialize BTI structure to zero.
> GBE: Fix a bug in gatherBTI.
> GBE/libocl: Fix sub_sat corner case.
> GBE: Fix sub_sat corner case.
> GBE: Output linkModules's error message.
> GBE/libocl: Add __gen_ocl_get_timestamp() to get timestamp.
> GBE: Fix a bug when setting flag register
> GBE: add legalize pass to handle wide integers
> Re-apply "improve the build performance of vector type built-in function."
> GBE: workaround register allocation fail caused by custom loop unroll.
> GBE: Fix live range for temporary register in replaceReg
> GBE: Fix kernel argument size for vector3
> utests: add a test to trigger cl_float3 bug in clSetKernelArg.
> GBE: Fix a bitcast from float vector to wide interger issue in legalize pass.
> GBE: Do topological sorting of basicblocks.
> docs: update mixed_buffer_pointer document.
>
> Yang Rong (54):
> Add some hsw missed pci ids (reserved PCI IDs).
> Add some hsw missed pci ids (reserved PCI IDs).
> Fix a utest compiler_async_stride_copy typo.
> Fix a utest compiler_async_stride_copy typo.
> Only compiler X11 files and do X11 operations when found X11.
> Only compiler X11 files and do X11 operations when found X11.
> Update Beignet.mdwn X11 dependency.
> Two minor fix.
> Fix two bugs.
> Update Beignet.mdwn X11 dependency.
> Two minor fix.
> Fix two bugs.
> Update README for the command parser in drm kernel.
> Update README for the command parser in drm kernel.
> Update license disclaimer.
> Update license disclaimer.
> Avoid use GenNativeInstruction directly out of GenEncode and gen_insn_compact.
> BDW: Add BDW pci ids and BDW device struct.
> BDW: Add BDW instruction define.
> BDW: Add Gen8Encoder and Gen7Encoder.
> BDW: Add class Gen8Context.
> BDW: Pass Jip and Uip when patchJMPI.
> BDW: Refine intel_gpgpu_setup_bti and add intel_gpgpu_set_base_address for BDW.
> BDW: add some BDW function.
> BDW: Fix Pointer argument curbe alloce size.
> BDW: enable SLM in BDW.
> BDW: Fix unsample bug.
> BDW: Refine BDW's int 32*32 multiply.
> BDW: BDW don't need add slm offset, remove it.
> BDW: Add BDW Device id to gen binary generater and binary serialize in backend.
> BDW: Add device's sub slice field, for cl_get_kernel_max_wg_sz.
> BDW: Correct scratch buffer of BDW.
> BDW: Forgot to set UIP of else in BDW.
> BDW: Correct BDW device name.
> BDW: Fix a scaler int 32*32 bug.
> BDW: Need not restore SLM setting in BDW.
> BDW: Correct stack setting in BDW.
> Fix a segment fault.
> Fix a HSW regression.
> Fix memcpy and memset bug.
> Fix HSW thread_n <= 64 assert.
> Fix a HSW constant buffer regression.
> BDW: Change BDW's max work group size to 512.
> BDW: Fix load/store half error.
> BDW: Also need set Shader Channel Select for constant buffer in BDW.
> Fix a upsample regression.
> Fix a HSW regression.
> Refine the the error handling in function cl_command_queue_ND_range_gen7.
> Refine the intel gpgpu delete.
> Fix a size assert when setup bti.
> BDW: Fix bwd 32*32 scalar multiplication bug.
> IVB/HSW/BYT: Revert the Dynamic state Base Addr and relative buffers address setting.
> BDW: Set the URB/REST size to 384K/384K when SLM disable.
> BDW: Change the default tiling mode to TILING_Y on BDW.
>
> Yichao Yu (1):
> Use ${PYTHON_EXECUTABLE} to run python scripts.
>
> Yongjia Zhang (6):
> Add Gen IR IF, ELSE and ENDIF
> Add Gen instruction 'else'
> Add structure identification on ir level
> Use instruction if else and endif manipulate structures
> Enable structural analysis
> GBE: fix empty block disassemble bug.
>
> Zhenyu Wang (5):
> Make use of write enable flag for mem bo map
> Clear batch buffer pointer after unmap
> Use pread/pwrite for buffer enqueue read/write
> Fix AUX buffer for page alignment
> Remove intel_gpgpu_check_binded_buf_address()
>
> Zhigang Gong (111):
> Build: Change versioning policy.
> runtime/driver: refine error handlings.
> runtime: fix some subtle event bugs.
> runtime/driver: refine error handlings.
> runtime: fix some subtle event bugs.
> gbe: add the new else instruction to the assert checking.
> docs: add a NEWS document to point to the release notes pages.
> docs: add a NEWS document to point to the release notes pages.
> Bump to 0.9.2.
> NEWS: update for 0.9.2.
> GBE: cleanup image base index related code.
> GBE: refine post register allocation scheduling for global buffers.
> GBE: refactor the immediate class to support vector data type.
> GBE: simplify processConstant.
> GBE: complete constant expression processing.
> GBE: enable constant expression processing.
> utest: add new test for constant expression processing.
> GBE: Reduce random behaviour of the code generation
> GBE: adjust preferred vector length.
> GBE: refactor the immediate class to support vector data type.
> GBE: simplify processConstant.
> GBE: complete constant expression processing.
> GBE: enable constant expression processing.
> utest: add new test for constant expression processing.
> Revert "GBE: refine post register allocation scheduling for global buffers."
> utests: fix two utest bugs.
> GBE: fix error in the rootn fastpath function for some special input.
> utests: fix two utest bugs.
> GBE: fix error in the rootn fastpath function for some special input.
> Add new vload benchmark/test case.
> GBE: optimize unaligned char and short data vector's load.
> GBE: relax the batch byte/short load vector size restrication.
> GBE: refine the unaligned data gathering.
> GBE: adjust preferred vector length.
> GBE: fixup/refine a bug for image1D array's extra binding index handling.
> GBE: remove the user defined macro cl_khr_fp64.
> GBE: avoid one optimization pass to generate wide integer.
> GBE: avoid one optimization pass to generate wide integer.
> GBE: fix a bug with LLVM 3.3.
> GBE: fallback if we get a wider than i64 constant.
> GBE: fix a bug with LLVM 3.3.
> GBE: fallback if we get a wider than i64 constant.
> GBE: cleanup image base index related code.
> GBE: fixup/refine a bug for image1D array's extra binding index handling.
> build: fix a CXXFLAGS override bug in backend directory.
> GBE: fix some predfeined OCL macros.
> Runtime: Implement clGetExtensionFunctionAddressForPlatform.
> Runtime: Implement clGetExtensionFunctionAddressForPlatform.
> GBE/libocl: fix the wrong prototype of scalar native_powr.
> GBE: fix bugs when handling -cl-std option.
> GBE: fix bugs when handling -cl-std option.
> GBE/libocl: Added one missing prototype fma().
> GBE: don't return error if we get an empty module.
> GBE: Fix a potential segfault.
> GBE: Fix a potential segfault.
> GBE: fix a potential memory leak bug.
> GBE: fix a potential memory leak bug.
> GBE: don't enable double by default.
> GBE: don't enable double by default.
> GBE: fix multiple files compilation bugs.
> runtime: fix program binary type bug.
> runtime: fix build status handling.
> runtime: fix program binary type bug.
> runtime: fix build status handling.
> GBE: fix multiple files compilation bugs.
> Update readme.
> Update readme.
> Document fixup.
> Remove out-of-date document.
> Bump to 0.9.3.
> Remove out-of-date document.
> Update NEWS.
> GBE/libocl: add missing vector builtin definition for fma.
> GBE/libocl: fix a regression after libocl change.
> Revert "improve the build performance of vector type built-in function."
> GBE/libocl: fix build dependency issue.
> GBE: fix a loop header file including bug.
> GBE: structurized loop exit need an extra branching instruction when do reordering.
> GBE: fix a bug in legalize pass.
> GBE: do intrinsics lowering pass earlier.
> GBE: fix a legalize pass bug when bitcast wide integer to incompaitble vector.
> GBE: Add a customized loop unrolling handling mechanism.
> GBE: disable custom loop unroll for LLVM 3.3/3.4.
> GBE: add Selection instruction handler at legalize pass.
> GBE: increase maximum src/dst operands to 32.
> GBE: add basic PHINode support in legalize pass.
> GBE: fix regression caused by simple block optimization.
> GBE: handle dead loop BBs in liveness analysis.
> GBE: set default address space to -1 to avoid incorrect unroll hint.
> GBE: fix a wrong type of cl_device_info.
> utest: change the box_blur_image to be identical to box_blur.
> utests: replace the nodistriutable picture.
> GBE: fix disassembly bug.
> GBE: fix a bool handling bug when SEL on a uniform bool variable.
> GBE: Support more instructions for constant expression handling.
> GBE: remove useless debug info.
> Revert "add test for clCreateImageFromLibvaIntel"
> Revert "fix issue to create cl image from libva with non-zero offset"
> utests: remove all shader toy test cases.
> License: adjust all license version to LGPL v2.1+.
> GBE: fix relocatable issue for pch file.
> Revert "BDW: Change the default tiling mode to TILING_Y on BDW."
> GBE: fix one double related bugs for post register scheduling.
> update some documents.
> runtime: fix one bug in BDW image.
> Update documents.
> runtime: refine version handling.
> runtime: fix bug in cl_enqueue_read_buffer.
> runtime: disable userptr due to random fail.
> GBE: work around error reporting for unresolved symbols
> Bump to 1.0.0.
>
>
> --
> Zhigang Gong,
> Thanks.
> _______________________________________________
> Beignet mailing list
> Beignet at lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/beignet
--
-Igor Gnatenko