[Beignet] [ANNOUNCE] Beignet 1.0.0 (2014-11-14)

Igor Gnatenko i.gnatenko.brain at gmail.com
Mon Nov 17 12:11:54 PST 2014


Hi,

Cool release! I've updated beignet to 1.0.0 for Fedora 20, 21 and
Rawhide (22) today. Let's use it!

https://bugzilla.redhat.com/show_bug.cgi?id=1142892

On Fri, Nov 14, 2014 at 9:09 AM, Zhigang Gong
<zhigang.gong at linux.intel.com> wrote:
> Beignet 1.0.0 (2014-11-14)
> =========================
>
> The Beignet development team is proud to announce that Beignet 1.0.0
> has been released. This is an important milestone after about two
> years of development. Thanks to everyone who helped us bring it to a
> relatively mature state.
>
> Beignet now supports 3rd through 5th Generation Intel Core Processors.
> Besides Broadwell support, this release also brings major performance
> improvements for many workloads and fixes some bugs. We observed gains
> ranging from 10% to more than 4x on some OpenCV 3.0 benchmarks.
>
> The highlights are as follows:
>
> 1. Added 5th generation Intel Core Processors (BDW) support.
> 2. Optimized constant buffer load.
> 3. Implemented a basic transformation from unstructured control flow
>    to structured control flow to improve performance.
> 4. Fixed some memory leak bugs.
> 5. Implemented missing constant expression handling.
> 6. Added Clang/ICC compiler support for Beignet build.
> 7. Optimized unaligned char/short vector load.
> 8. Sped up kernel compilation by moving built-in function support
>    from the header file into a linked library.
> 9. Implemented some missing LLVM intrinsics.
> 10. Optimized the loop unrolling pass, which boosted some OpenCV
>     benchmarks.
> 11. Several other bug fixes since the last release. For the OpenCV 3.0,
>     OpenCV 2.4 and piglit test suites, Beignet's pass rates are all
>     above 99%.
>
> Git tag: Release_v1.0.0
> Gitweb URL: http://cgit.freedesktop.org/beignet
> https://01.org/sites/default/files/beignet-1.0.0-source.tar.gz
>
> md5sum: bfd755904c332cdd285d6058f5f3de8c  Beignet-1.0.0-Source.tar.gz
> sha1sum: a2b0eb53e5f9a6055cd656531532a4c6ae03fbb0  Beignet-1.0.0-Source.tar.gz
> sha256sum: e30c4d0f4c8917fa0df2467b2d70a4ee524f28d54c42c582262d5f08928ea543  Beignet-1.0.0-Source.tar.gz
>
> -----------------------------------------------------------------
>
> Changes since 0.9.3:
>
> Andreas Beckmann (2):
>       fix some typos
>       use env to set environment variables for GBE_BIN_GENERATER
>
> Chuanbo Weng (1):
>       utest: add new test that trigger an assignment operation bug in if.
>
> Guo Yejun (18):
>       remove requirment as drm master in non-x environment
>       remove requirment as drm master in non-x environment
>       free build_log when the cl program is released
>       free build_log when the cl program is released
>       fix three memory leaks
>       clean llvm resource in compiler (libgbe.so)
>       fix three memory leaks
>       clean llvm resource in compiler (libgbe.so)
>       delete GEPInst when it is no longer used
>       delete GEPInst when it is no longer used
>       remove dependency for non-X runtime environment
>       remove dependency for non-X runtime environment
>       support CL_MEM_USE_HOST_PTR with userptr for cl buffer
>       enable CL_DEVICE_HOST_UNIFIED_MEMORY when userptr is supported
>       add test for cl buffer created with CL_MEM_USE_HOST_PTR
>       fix issue to create cl image from libva with non-zero offset
>       add test for clCreateImageFromLibvaIntel
>       use posix_memalign instead of aligned_alloc to be more compatible
>
> Junyan He (54):
>       Fix the global string bug for printf.
>       Fix a bug for runtime_barrier_list.cpp, event array out of bound
>       Fix a bug for runtime_barrier_list.cpp, event array out of bound
>       Fix the global string bug for printf.
>       Add common define header files to initialize the libocl
>       Add the async module into the libocl
>       Add the atomic module into the libocl
>       Add the geometric module into the libocl
>       Add the image module into the libocl
>       Add the misc module into the libocl
>       Add the sync module into the libocl
>       Add printf module into libocl
>       Add vload module into the libocl
>       Add thw workitem module into the libocl
>       Add the convert and as modules into the libocl
>       Add the gen_vector script into the libocl
>       Add the common module into the libocl as template
>       Add the integer module into libocl as template
>       Add the math function into libocl as template
>       Add the relational module into libocl as template
>       Add the ocl_defines header file into libocl
>       Add memcpy, memset and barrier bitcode files into libocl
>       Add the bit code linker into the module pass.
>       Enable libocl and disable the usage of the old huge header.
>       Use the PCH to accelerate the parsing speed of the ocl.h
>       Delete all the unused files of old huge header.
>       Add the missing function prototypes of any() and atom_add()
>       Add uncompatible PCH Options to avoid compiling failure.
>       Fix the global string bug for printf.
>       Add copyright header for all libocl files.
>       Fix the issue of -cl-std=CLX.X option.
>       Fix the issue of -cl-std=CLX.X option.
>       Add the switch logic for math conformance fast path
>       Modify the CMakeList to use the internal PCH first.
>       Fix the bug of LLVM_LFLAGS fail to set
>       Add long support for printf
>       BDW: Add gen8 surface state struct.
>       BDW: refine the gen8_surface_state_t.
>       BDW: Add function intel_gpgpu_setup_bti for gen8.
>       BDW: Correct surface base address set in setup bti.
>       BDW: Add function intel_gpgpu_bind_buf for gen8.
>       Add sampler state and tile define for gen8.
>       Modify the bind sampler logic for gen8
>       BDW: Add gen8 into intel_driver_init
>       Refine the shared function ID define.
>       Add the libdrm version check.
>       Let the failure of intel_drm lib's check as a FATAL_ERROR
>       Fit the printf bug in loop
>       Fix the bug of 1D array slice pitch
>       Add the test case for image 1d array fill
>       Add the test case for image 2d array fill
>       Add the disasm support for Gen8
>       Fix the compare_image_2d_and_1d_array test case bug
>       Fix the bug of multi-thread crash
>
> Luo (5):
>       remove lspci, gbe_bin_genenrater would generator llvm binary by default.
>       remove lspci, gbe_bin_genenrater would generator llvm binary by default.
>       fix piglit get kernel info FUNCTION ATTRIBUTE fail.
>       fix piglit get kernel info FUNCTION ATTRIBUTE fail.
>       add opencl-1.2 builtin function popcount.
>
> Luo Xionghu (28):
>       fix the relational built-in vector function regression.
>       fix opencv_test_imgproc subcase OCL_ImgProc/Accumulate.Mask regression.
>       fix piglit cl-api-get-program-info fail.
>       fix piglit cl-api-get-program-info fail.
>       fix clGetKernelWorkGroupInfo built-in kernel fail.
>       fix piglit cl-api-set-kernel-arg fail.
>       fix clGetKernelWorkGroupInfo built-in kernel fail.
>       fix piglit cl-api-set-kernel-arg fail.
>       fix bin/cl-program-tester tests/cl/program/execute/attributes.cl regression.
>       fix bin/cl-program-tester tests/cl/program/execute/attributes.cl regression.
>       remove the LinkOnceAnyLinkage since the libocl is introduced.
>       improve the build performance of vector type built-in function.
>       fix one bug at cl_get_kernel_workgroup_info.
>       fix utest memory leak.
>       Add Gen IR WHILE.
>       add handleSelfLoopNode to insert while instruction on Gen IR level.
>       Use instruction WHILE to manipulate structure.
>       add utest popcount for all types.
>       use global flag 0.0 to control unstructured simple block.
>       add llvm Intrinsic call support.
>       add utest compiler_overflow for llvm intrinsic function.
>       enable llvm intrinsic call usub_with_overflow funtion.
>       add utest for llvm intrinsic call usub_with_overflow funtion.
>       enable llvm intrinsic call bswap function.
>       add utest function bswap.
>       fix bswap kernel function type issue.
>       fix piglit clCreateProgramWithBinary fail.
>       fix a bug in clCompileProgram().
>
> LuoXionghu (5):
>       add platform info in the gen binary code.
>       add utest load_program_from_gen_bin.
>       add platform info in the gen binary code.
>       add utest load_program_from_gen_bin.
>       improve the build performance of vector type built-in function.
>
> Lv Meng (6):
>       improve the clEnqueueCopyBufferRect performance in some cases
>       Fix compile error for ICC compiler
>       Fix compile errors for CLANG compiler
>       Fix compile warnings for ICC compiler
>       Fix compile warnings for CLANG compiler
>       Enable ICC and CLANG compiler for beignet
>
> Meng Mengmeng (3):
>       add beignet GIT_HAL1 if there is .git directory
>       create GIT_SHA1 without any dependency
>       add building dependency GIT_SHA1
>
> Rebecca Palmer (7):
>       Fail gracefully on unsupported hardware
>       Fail gracefully on unsupported hardware
>       GBE: fix bug in pow()/pown().
>       GBE: fix bug in erf()/erfc().
>       GBE: fix bug in tgamma().
>       utests: fix bugs in builtin_pow().
>       utests: fix bugs in builtin_tgamma().
>
> Ruiling Song (43):
>       GBE: Fix builtin tanpi.
>       GBE: Fix builtin tanpi.
>       GBE: Use varying register to save one instruction
>       GBE: Optimize constant load with sampler.
>       GBE: align the fields in union ImageInfoKey.
>       utests: Fix a bug in image_1D_buffer.
>       GBE: align the fields in union ImageInfoKey.
>       utests: Fix a bug in image_1D_buffer.
>       runtime: set correct state for constant buffer on hsw.
>       runtime: set correct state for constant buffer on hsw.
>       GBE: Refine bti usage in backend & runtime.
>       GBE: Handle bti allocation for internal buffer used by printf.
>       GBE: remove some useless code for getting printf buffer address.
>       GBE: Fix a warning in getConstantPointerRegister.
>       GBE: Fix type size for vector3
>       GBE: initialize BTI structure to zero.
>       GBE: Fix a bug in gatherBTI.
>       cmake: Fix a license issue.
>       GBE: clear deadprintfs when current function is done.
>       GBE: refine the llvm multi-thread related code.
>       GBE: Fix type size for vector3
>       cmake: Fix a license issue.
>       GBE: clear deadprintfs when current function is done.
>       GBE: refine the llvm multi-thread related code.
>       GBE: Optimize constant load with sampler.
>       GBE: Refine bti usage in backend & runtime.
>       GBE: Handle bti allocation for internal buffer used by printf.
>       GBE: initialize BTI structure to zero.
>       GBE: Fix a bug in gatherBTI.
>       GBE/libocl: Fix sub_sat corner case.
>       GBE: Fix sub_sat corner case.
>       GBE: Output linkModules's error message.
>       GBE/libocl: Add __gen_ocl_get_timestamp() to get timestamp.
>       GBE: Fix a bug when setting flag register
>       GBE: add legalize pass to handle wide integers
>       Re-apply "improve the build performance of vector type built-in function."
>       GBE: workaround register allocation fail caused by custom loop unroll.
>       GBE: Fix live range for temporary register in replaceReg
>       GBE: Fix kernel argument size for vector3
>       utests: add a test to trigger cl_float3 bug in clSetKernelArg.
>       GBE: Fix a bitcast from float vector to wide interger issue in legalize pass.
>       GBE: Do topological sorting of basicblocks.
>       docs: update mixed_buffer_pointer document.
>
> Yang Rong (54):
>       Add some hsw missed pci ids (reserved PCI IDs).
>       Add some hsw missed pci ids (reserved PCI IDs).
>       Fix a utest compiler_async_stride_copy typo.
>       Fix a utest compiler_async_stride_copy typo.
>       Only compiler X11 files and do X11 operations when found X11.
>       Only compiler X11 files and do X11 operations when found X11.
>       Update Beignet.mdwn X11 dependency.
>       Two minor fix.
>       Fix two bugs.
>       Update Beignet.mdwn X11 dependency.
>       Two minor fix.
>       Fix two bugs.
>       Update README for the command parser in drm kernel.
>       Update README for the command parser in drm kernel.
>       Update license disclaimer.
>       Update license disclaimer.
>       Avoid use GenNativeInstruction directly out of GenEncode and gen_insn_compact.
>       BDW: Add BDW pci ids and BDW device struct.
>       BDW: Add BDW instruction define.
>       BDW: Add Gen8Encoder and Gen7Encoder.
>       BDW: Add class Gen8Context.
>       BDW: Pass Jip and Uip when patchJMPI.
>       BDW: Refine intel_gpgpu_setup_bti and add intel_gpgpu_set_base_address for BDW.
>       BDW: add some BDW function.
>       BDW: Fix Pointer argument curbe alloce size.
>       BDW: enable SLM in BDW.
>       BDW: Fix unsample bug.
>       BDW: Refine BDW's int 32*32 multiply.
>       BDW: BDW don't need add slm offset, remove it.
>       BDW: Add BDW Device id to gen binary generater and binary serialize in backend.
>       BDW: Add device's sub slice field, for cl_get_kernel_max_wg_sz.
>       BDW: Correct scratch buffer of BDW.
>       BDW: Forgot to set UIP of else in BDW.
>       BDW: Correct BDW device name.
>       BDW: Fix a scaler int 32*32 bug.
>       BDW: Need not restore SLM setting in BDW.
>       BDW: Correct stack setting in BDW.
>       Fix a segment fault.
>       Fix a HSW regression.
>       Fix memcpy and memset bug.
>       Fix HSW thread_n <= 64 assert.
>       Fix a HSW constant buffer regression.
>       BDW: Change BDW's max work group size to 512.
>       BDW: Fix load/store half error.
>       BDW: Also need set Shader Channel Select for constant buffer in BDW.
>       Fix a upsample regression.
>       Fix a HSW regression.
>       Refine the the error handling in function cl_command_queue_ND_range_gen7.
>       Refine the intel gpgpu delete.
>       Fix a size assert when setup bti.
>       BDW: Fix bwd 32*32 scalar multiplication bug.
>       IVB/HSW/BYT: Revert the Dynamic state Base Addr and relative buffers address setting.
>       BDW: Set the URB/REST size to 384K/384K when SLM disable.
>       BDW: Change the default tiling mode to TILING_Y on BDW.
>
> Yichao Yu (1):
>       Use ${PYTHON_EXECUTABLE} to run python scripts.
>
> Yongjia Zhang (6):
>       Add Gen IR IF, ELSE and ENDIF
>       Add Gen instruction 'else'
>       Add structure identification on ir level
>       Use instruction if else and endif manipulate structures
>       Enable structural analysis
>       GBE: fix empty block disassemble bug.
>
> Zhenyu Wang (5):
>       Make use of write enable flag for mem bo map
>       Clear batch buffer pointer after unmap
>       Use pread/pwrite for buffer enqueue read/write
>       Fix AUX buffer for page alignment
>       Remove intel_gpgpu_check_binded_buf_address()
>
> Zhigang Gong (111):
>       Build: Change versioning policy.
>       runtime/driver: refine error handlings.
>       runtime: fix some subtle event bugs.
>       runtime/driver: refine error handlings.
>       runtime: fix some subtle event bugs.
>       gbe: add the new else instruction to the assert checking.
>       docs: add a NEWS document to point to the release notes pages.
>       docs: add a NEWS document to point to the release notes pages.
>       Bump to 0.9.2.
>       NEWS: update for 0.9.2.
>       GBE: cleanup image base index related code.
>       GBE: refine post register allocation scheduling for global buffers.
>       GBE: refactor the immediate class to support vector data type.
>       GBE: simplify processConstant.
>       GBE: complete constant expression processing.
>       GBE: enable constant expression processing.
>       utest: add new test for constant expression processing.
>       GBE: Reduce random behaviour of the code generation
>       GBE: adjust preferred vector length.
>       GBE: refactor the immediate class to support vector data type.
>       GBE: simplify processConstant.
>       GBE: complete constant expression processing.
>       GBE: enable constant expression processing.
>       utest: add new test for constant expression processing.
>       Revert "GBE: refine post register allocation scheduling for global buffers."
>       utests: fix two utest bugs.
>       GBE: fix error in the rootn fastpath function for some special input.
>       utests: fix two utest bugs.
>       GBE: fix error in the rootn fastpath function for some special input.
>       Add new vload benchmark/test case.
>       GBE: optimize unaligned char and short data vector's load.
>       GBE: relax the batch byte/short load vector size restrication.
>       GBE: refine the unaligned data gathering.
>       GBE: adjust preferred vector length.
>       GBE: fixup/refine a bug for image1D array's extra binding index handling.
>       GBE: remove the user defined macro cl_khr_fp64.
>       GBE: avoid one optimization pass to generate wide integer.
>       GBE: avoid one optimization pass to generate wide integer.
>       GBE: fix a bug with LLVM 3.3.
>       GBE: fallback if we get a wider than i64 constant.
>       GBE: fix a bug with LLVM 3.3.
>       GBE: fallback if we get a wider than i64 constant.
>       GBE: cleanup image base index related code.
>       GBE: fixup/refine a bug for image1D array's extra binding index handling.
>       build: fix a CXXFLAGS override bug in backend directory.
>       GBE: fix some predfeined OCL macros.
>       Runtime: Implement clGetExtensionFunctionAddressForPlatform.
>       Runtime: Implement clGetExtensionFunctionAddressForPlatform.
>       GBE/libocl: fix the wrong prototype of scalar native_powr.
>       GBE: fix bugs when handling -cl-std option.
>       GBE: fix bugs when handling -cl-std option.
>       GBE/libocl: Added one missing prototype fma().
>       GBE: don't return error if we get an empty module.
>       GBE: Fix a potential segfault.
>       GBE: Fix a potential segfault.
>       GBE: fix a potential memory leak bug.
>       GBE: fix a potential memory leak bug.
>       GBE: don't enable double by default.
>       GBE: don't enable double by default.
>       GBE: fix multiple files compilation bugs.
>       runtime: fix program binary type bug.
>       runtime: fix build status handling.
>       runtime: fix program binary type bug.
>       runtime: fix build status handling.
>       GBE: fix multiple files compilation bugs.
>       Update readme.
>       Update readme.
>       Document fixup.
>       Remove out-of-date document.
>       Bump to 0.9.3.
>       Remove out-of-date document.
>       Update NEWS.
>       GBE/libocl: add missing vector builtin definition for fma.
>       GBE/libocl: fix a regression after libocl change.
>       Revert "improve the build performance of vector type built-in function."
>       GBE/libocl: fix build dependency issue.
>       GBE: fix a loop header file including bug.
>       GBE: structurized loop exit need an extra branching instruction when do reordering.
>       GBE: fix a bug in legalize pass.
>       GBE: do intrinsics lowering pass earlier.
>       GBE: fix a legalize pass bug when bitcast wide integer to incompaitble vector.
>       GBE: Add a customized loop unrolling handling mechanism.
>       GBE: disable custom loop unroll for LLVM 3.3/3.4.
>       GBE: add Selection instruction handler at legalize pass.
>       GBE: increase maximum src/dst operands to 32.
>       GBE: add basic PHINode support in legalize pass.
>       GBE: fix regression caused by simple block optimization.
>       GBE: handle dead loop BBs in liveness analysis.
>       GBE: set default address space to -1 to avoid incorrect unroll hint.
>       GBE: fix a wrong type of cl_device_info.
>       utest: change the box_blur_image to be identical to box_blur.
>       utests: replace the nodistriutable picture.
>       GBE: fix disassembly bug.
>       GBE: fix a bool handling bug when SEL on a uniform bool variable.
>       GBE: Support more instructions for constant expression handling.
>       GBE: remove useless debug info.
>       Revert "add test for clCreateImageFromLibvaIntel"
>       Revert "fix issue to create cl image from libva with non-zero offset"
>       utests: remove all shader toy test cases.
>       License: adjust all license version to LGPL v2.1+.
>       GBE: fix relocatable issue for pch file.
>       Revert "BDW: Change the default tiling mode to TILING_Y on BDW."
>       GBE: fix one double related bugs for post register scheduling.
>       update some documents.
>       runtime: fix one bug in BDW image.
>       Update documents.
>       runtime: refine version handling.
>       runtime: fix bug in cl_enqueue_read_buffer.
>       runtime: disable userptr due to random fail.
>       GBE: work around error reporting for unresolved symbols
>       Bump to 1.0.0.
>
>
> --
> Zhigang Gong,
> Thanks.
> _______________________________________________
> Beignet mailing list
> Beignet at lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/beignet



-- 
-Igor Gnatenko

