[Beignet] [ANNOUNCE] Beignet 1.0.0 (2014-11-14)

Zhigang Gong zhigang.gong at linux.intel.com
Mon Nov 17 18:13:14 PST 2014


Hi Igor,

Thanks for packaging beignet for Fedora so promptly. It helps bring the latest beignet to regular users.

Thanks,
Zhigang Gong.

> -----Original Message-----
> From: Beignet [mailto:beignet-bounces at lists.freedesktop.org] On Behalf Of
> Igor Gnatenko
> Sent: Tuesday, November 18, 2014 4:12 AM
> To: Zhigang Gong
> Cc: michael.fu; Zou Nanhai; An open source open CL implemenation for Intel
> platform
> Subject: Re: [Beignet] [ANNOUNCE] Beignet 1.0.0 (2014-11-14)
> 
> Hi,
> 
> cool release! I've updated beignet to 1.0.0 for Fedora 20, 21 and Rawhide (22)
> today. Let's use it!
> 
> https://bugzilla.redhat.com/show_bug.cgi?id=1142892
> 
> On Fri, Nov 14, 2014 at 9:09 AM, Zhigang Gong <zhigang.gong at linux.intel.com>
> wrote:
> > Beignet 1.0.0 (2014-11-14)
> > =========================
> >
> > The Beignet development team is proud to announce that Beignet 1.0.0 has
> > been released. This is an important milestone after about two years of
> > development. Thanks to everyone who helped us bring it to a
> > relatively mature state.
> >
> > Beignet now supports 3rd to 5th Generation Intel Core Processors.
> > Besides Broadwell support, this release also brings major
> > performance improvements for many workloads and fixes some bugs. We
> > observed performance gains from 10% to more than 4x on some OpenCV 3.0
> > benchmarks.
> >
> > The highlights are as follows:
> >
> > 1. Added 5th generation Intel Core Processors (BDW) support.
> > 2. Optimized constant buffer load.
> > 3. Implemented a basic transformation from unstructured control flow to
> >    structured control flow to improve performance.
> > 4. Fixed some memory leak bugs.
> > 5. Implemented missing constant expression handling.
> > 6. Added Clang/ICC compiler support for Beignet build.
> > 7. Optimized unaligned char/short vector load.
> > 8. Sped up kernel compilation by moving built-in function support
> >    from the header file into a linked library.
> > 9. Implemented some missing LLVM intrinsics.
> > 10. Optimized loop unrolling pass, boosted some OpenCV benchmarks.
> > 11. Several other bug fixes since the last release. For the OpenCV 3.0,
> >     OpenCV 2.4 and piglit test suites, Beignet's pass rates are all
> >     above 99%.
> >
> > Git tag: Release_v1.0.0
> > Gitweb URL: http://cgit.freedesktop.org/beignet
> > https://01.org/sites/default/files/beignet-1.0.0-source.tar.gz
> >
> > md5sum: bfd755904c332cdd285d6058f5f3de8c  Beignet-1.0.0-Source.tar.gz
> > sha1sum: a2b0eb53e5f9a6055cd656531532a4c6ae03fbb0  Beignet-1.0.0-Source.tar.gz
> > sha256sum: e30c4d0f4c8917fa0df2467b2d70a4ee524f28d54c42c582262d5f08928ea543  Beignet-1.0.0-Source.tar.gz
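
For anyone who wants to check a downloaded tarball against the sha256sum quoted above, here is a minimal sketch in Python. The function names (`sha256_of_file`, `matches_expected`) and the chunk size are illustrative assumptions, not part of Beignet or its release tooling. Only the digest value is taken from the announcement.

```python
import hashlib

# SHA-256 digest copied from the announcement above.
EXPECTED_SHA256 = "e30c4d0f4c8917fa0df2467b2d70a4ee524f28d54c42c582262d5f08928ea543"


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large tarballs are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_hex=EXPECTED_SHA256):
    """True if the file's digest equals `expected_hex` (compared case-insensitively)."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

Assuming the tarball was saved locally under the name used in the checksum lines, `matches_expected("Beignet-1.0.0-Source.tar.gz")` should return `True` for an intact download. The same check can be done on the command line with the `sha256sum` utility.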
> >
> > -----------------------------------------------------------------
> >
> > Changes since 0.9.3:
> >
> > Andreas Beckmann (2):
> >       fix some typos
> >       use env to set environment variables for GBE_BIN_GENERATER
> >
> > Chuanbo Weng (1):
> >       utest: add new test that trigger an assignment operation bug in if.
> >
> > Guo Yejun (18):
> >       remove requirment as drm master in non-x environment
> >       remove requirment as drm master in non-x environment
> >       free build_log when the cl program is released
> >       free build_log when the cl program is released
> >       fix three memory leaks
> >       clean llvm resource in compiler (libgbe.so)
> >       fix three memory leaks
> >       clean llvm resource in compiler (libgbe.so)
> >       delete GEPInst when it is no longer used
> >       delete GEPInst when it is no longer used
> >       remove dependency for non-X runtime environment
> >       remove dependency for non-X runtime environment
> >       support CL_MEM_USE_HOST_PTR with userptr for cl buffer
> >       enable CL_DEVICE_HOST_UNIFIED_MEMORY when userptr is supported
> >       add test for cl buffer created with CL_MEM_USE_HOST_PTR
> >       fix issue to create cl image from libva with non-zero offset
> >       add test for clCreateImageFromLibvaIntel
> >       use posix_memalign instead of aligned_alloc to be more compatible
> >
> > Junyan He (54):
> >       Fix the global string bug for printf.
> >       Fix a bug for runtime_barrier_list.cpp, event array out of bound
> >       Fix a bug for runtime_barrier_list.cpp, event array out of bound
> >       Fix the global string bug for printf.
> >       Add common define header files to initialize the libocl
> >       Add the async module into the libocl
> >       Add the atomic module into the libocl
> >       Add the geometric module into the libocl
> >       Add the image module into the libocl
> >       Add the misc module into the libocl
> >       Add the sync module into the libocl
> >       Add printf module into libocl
> >       Add vload module into the libocl
> >       Add thw workitem module into the libocl
> >       Add the convert and as modules into the libocl
> >       Add the gen_vector script into the libocl
> >       Add the common module into the libocl as template
> >       Add the integer module into libocl as template
> >       Add the math function into libocl as template
> >       Add the relational module into libocl as template
> >       Add the ocl_defines header file into libocl
> >       Add memcpy, memset and barrier bitcode files into libocl
> >       Add the bit code linker into the module pass.
> >       Enable libocl and disable the usage of the old huge header.
> >       Use the PCH to accelerate the parsing speed of the ocl.h
> >       Delete all the unused files of old huge header.
> >       Add the missing function prototypes of any() and atom_add()
> >       Add uncompatible PCH Options to avoid compiling failure.
> >       Fix the global string bug for printf.
> >       Add copyright header for all libocl files.
> >       Fix the issue of -cl-std=CLX.X option.
> >       Fix the issue of -cl-std=CLX.X option.
> >       Add the switch logic for math conformance fast path
> >       Modify the CMakeList to use the internal PCH first.
> >       Fix the bug of LLVM_LFLAGS fail to set
> >       Add long support for printf
> >       BDW: Add gen8 surface state struct.
> >       BDW: refine the gen8_surface_state_t.
> >       BDW: Add function intel_gpgpu_setup_bti for gen8.
> >       BDW: Correct surface base address set in setup bti.
> >       BDW: Add function intel_gpgpu_bind_buf for gen8.
> >       Add sampler state and tile define for gen8.
> >       Modify the bind sampler logic for gen8
> >       BDW: Add gen8 into intel_driver_init
> >       Refine the shared function ID define.
> >       Add the libdrm version check.
> >       Let the failure of intel_drm lib's check as a FATAL_ERROR
> >       Fit the printf bug in loop
> >       Fix the bug of 1D array slice pitch
> >       Add the test case for image 1d array fill
> >       Add the test case for image 2d array fill
> >       Add the disasm support for Gen8
> >       Fix the compare_image_2d_and_1d_array test case bug
> >       Fix the bug of multi-thread crash
> >
> > Luo (5):
> >       remove lspci, gbe_bin_genenrater would generator llvm binary by default.
> >       remove lspci, gbe_bin_genenrater would generator llvm binary by default.
> >       fix piglit get kernel info FUNCTION ATTRIBUTE fail.
> >       fix piglit get kernel info FUNCTION ATTRIBUTE fail.
> >       add opencl-1.2 builtin function popcount.
> >
> > Luo Xionghu (28):
> >       fix the relational built-in vector function regression.
> >       fix opencv_test_imgproc subcase OCL_ImgProc/Accumulate.Mask regression.
> >       fix piglit cl-api-get-program-info fail.
> >       fix piglit cl-api-get-program-info fail.
> >       fix clGetKernelWorkGroupInfo built-in kernel fail.
> >       fix piglit cl-api-set-kernel-arg fail.
> >       fix clGetKernelWorkGroupInfo built-in kernel fail.
> >       fix piglit cl-api-set-kernel-arg fail.
> >       fix bin/cl-program-tester tests/cl/program/execute/attributes.cl regression.
> >       fix bin/cl-program-tester tests/cl/program/execute/attributes.cl regression.
> >       remove the LinkOnceAnyLinkage since the libocl is introduced.
> >       improve the build performance of vector type built-in function.
> >       fix one bug at cl_get_kernel_workgroup_info.
> >       fix utest memory leak.
> >       Add Gen IR WHILE.
> >       add handleSelfLoopNode to insert while instruction on Gen IR level.
> >       Use instruction WHILE to manipulate structure.
> >       add utest popcount for all types.
> >       use global flag 0.0 to control unstructured simple block.
> >       add llvm Intrinsic call support.
> >       add utest compiler_overflow for llvm intrinsic function.
> >       enable llvm intrinsic call usub_with_overflow funtion.
> >       add utest for llvm intrinsic call usub_with_overflow funtion.
> >       enable llvm intrinsic call bswap function.
> >       add utest function bswap.
> >       fix bswap kernel function type issue.
> >       fix piglit clCreateProgramWithBinary fail.
> >       fix a bug in clCompileProgram().
> >
> > LuoXionghu (5):
> >       add platform info in the gen binary code.
> >       add utest load_program_from_gen_bin.
> >       add platform info in the gen binary code.
> >       add utest load_program_from_gen_bin.
> >       improve the build performance of vector type built-in function.
> >
> > Lv Meng (6):
> >       improve the clEnqueueCopyBufferRect performance in some cases
> >       Fix compile error for ICC compiler
> >       Fix compile errors for CLANG compiler
> >       Fix compile warnings for ICC compiler
> >       Fix compile warnings for CLANG compiler
> >       Enable ICC and CLANG compiler for beignet
> >
> > Meng Mengmeng (3):
> >       add beignet GIT_HAL1 if there is .git directory
> >       create GIT_SHA1 without any dependency
> >       add building dependency GIT_SHA1
> >
> > Rebecca Palmer (7):
> >       Fail gracefully on unsupported hardware
> >       Fail gracefully on unsupported hardware
> >       GBE: fix bug in pow()/pown().
> >       GBE: fix bug in erf()/erfc().
> >       GBE: fix bug in tgamma().
> >       utests: fix bugs in builtin_pow().
> >       utests: fix bugs in builtin_tgamma().
> >
> > Ruiling Song (43):
> >       GBE: Fix builtin tanpi.
> >       GBE: Fix builtin tanpi.
> >       GBE: Use varying register to save one instruction
> >       GBE: Optimize constant load with sampler.
> >       GBE: align the fields in union ImageInfoKey.
> >       utests: Fix a bug in image_1D_buffer.
> >       GBE: align the fields in union ImageInfoKey.
> >       utests: Fix a bug in image_1D_buffer.
> >       runtime: set correct state for constant buffer on hsw.
> >       runtime: set correct state for constant buffer on hsw.
> >       GBE: Refine bti usage in backend & runtime.
> >       GBE: Handle bti allocation for internal buffer used by printf.
> >       GBE: remove some useless code for getting printf buffer address.
> >       GBE: Fix a warning in getConstantPointerRegister.
> >       GBE: Fix type size for vector3
> >       GBE: initialize BTI structure to zero.
> >       GBE: Fix a bug in gatherBTI.
> >       cmake: Fix a license issue.
> >       GBE: clear deadprintfs when current function is done.
> >       GBE: refine the llvm multi-thread related code.
> >       GBE: Fix type size for vector3
> >       cmake: Fix a license issue.
> >       GBE: clear deadprintfs when current function is done.
> >       GBE: refine the llvm multi-thread related code.
> >       GBE: Optimize constant load with sampler.
> >       GBE: Refine bti usage in backend & runtime.
> >       GBE: Handle bti allocation for internal buffer used by printf.
> >       GBE: initialize BTI structure to zero.
> >       GBE: Fix a bug in gatherBTI.
> >       GBE/libocl: Fix sub_sat corner case.
> >       GBE: Fix sub_sat corner case.
> >       GBE: Output linkModules's error message.
> >       GBE/libocl: Add __gen_ocl_get_timestamp() to get timestamp.
> >       GBE: Fix a bug when setting flag register
> >       GBE: add legalize pass to handle wide integers
> >       Re-apply "improve the build performance of vector type built-in function."
> >       GBE: workaround register allocation fail caused by custom loop unroll.
> >       GBE: Fix live range for temporary register in replaceReg
> >       GBE: Fix kernel argument size for vector3
> >       utests: add a test to trigger cl_float3 bug in clSetKernelArg.
> >       GBE: Fix a bitcast from float vector to wide interger issue in legalize pass.
> >       GBE: Do topological sorting of basicblocks.
> >       docs: update mixed_buffer_pointer document.
> >
> > Yang Rong (54):
> >       Add some hsw missed pci ids (reserved PCI IDs).
> >       Add some hsw missed pci ids (reserved PCI IDs).
> >       Fix a utest compiler_async_stride_copy typo.
> >       Fix a utest compiler_async_stride_copy typo.
> >       Only compiler X11 files and do X11 operations when found X11.
> >       Only compiler X11 files and do X11 operations when found X11.
> >       Update Beignet.mdwn X11 dependency.
> >       Two minor fix.
> >       Fix two bugs.
> >       Update Beignet.mdwn X11 dependency.
> >       Two minor fix.
> >       Fix two bugs.
> >       Update README for the command parser in drm kernel.
> >       Update README for the command parser in drm kernel.
> >       Update license disclaimer.
> >       Update license disclaimer.
> >       Avoid use GenNativeInstruction directly out of GenEncode and gen_insn_compact.
> >       BDW: Add BDW pci ids and BDW device struct.
> >       BDW: Add BDW instruction define.
> >       BDW: Add Gen8Encoder and Gen7Encoder.
> >       BDW: Add class Gen8Context.
> >       BDW: Pass Jip and Uip when patchJMPI.
> >       BDW: Refine intel_gpgpu_setup_bti and add intel_gpgpu_set_base_address for BDW.
> >       BDW: add some BDW function.
> >       BDW: Fix Pointer argument curbe alloce size.
> >       BDW: enable SLM in BDW.
> >       BDW: Fix unsample bug.
> >       BDW: Refine BDW's int 32*32 multiply.
> >       BDW: BDW don't need add slm offset, remove it.
> >       BDW: Add BDW Device id to gen binary generater and binary serialize in backend.
> >       BDW: Add device's sub slice field, for cl_get_kernel_max_wg_sz.
> >       BDW: Correct scratch buffer of BDW.
> >       BDW: Forgot to set UIP of else in BDW.
> >       BDW: Correct BDW device name.
> >       BDW: Fix a scaler int 32*32 bug.
> >       BDW: Need not restore SLM setting in BDW.
> >       BDW: Correct stack setting in BDW.
> >       Fix a segment fault.
> >       Fix a HSW regression.
> >       Fix memcpy and memset bug.
> >       Fix HSW thread_n <= 64 assert.
> >       Fix a HSW constant buffer regression.
> >       BDW: Change BDW's max work group size to 512.
> >       BDW: Fix load/store half error.
> >       BDW: Also need set Shader Channel Select for constant buffer in BDW.
> >       Fix a upsample regression.
> >       Fix a HSW regression.
> >       Refine the the error handling in function cl_command_queue_ND_range_gen7.
> >       Refine the intel gpgpu delete.
> >       Fix a size assert when setup bti.
> >       BDW: Fix bwd 32*32 scalar multiplication bug.
> >       IVB/HSW/BYT: Revert the Dynamic state Base Addr and relative buffers address setting.
> >       BDW: Set the URB/REST size to 384K/384K when SLM disable.
> >       BDW: Change the default tiling mode to TILING_Y on BDW.
> >
> > Yichao Yu (1):
> >       Use ${PYTHON_EXECUTABLE} to run python scripts.
> >
> > Yongjia Zhang (6):
> >       Add Gen IR IF, ELSE and ENDIF
> >       Add Gen instruction 'else'
> >       Add structure identification on ir level
> >       Use instruction if else and endif manipulate structures
> >       Enable structural analysis
> >       GBE: fix empty block disassemble bug.
> >
> > Zhenyu Wang (5):
> >       Make use of write enable flag for mem bo map
> >       Clear batch buffer pointer after unmap
> >       Use pread/pwrite for buffer enqueue read/write
> >       Fix AUX buffer for page alignment
> >       Remove intel_gpgpu_check_binded_buf_address()
> >
> > Zhigang Gong (111):
> >       Build: Change versioning policy.
> >       runtime/driver: refine error handlings.
> >       runtime: fix some subtle event bugs.
> >       runtime/driver: refine error handlings.
> >       runtime: fix some subtle event bugs.
> >       gbe: add the new else instruction to the assert checking.
> >       docs: add a NEWS document to point to the release notes pages.
> >       docs: add a NEWS document to point to the release notes pages.
> >       Bump to 0.9.2.
> >       NEWS: update for 0.9.2.
> >       GBE: cleanup image base index related code.
> >       GBE: refine post register allocation scheduling for global buffers.
> >       GBE: refactor the immediate class to support vector data type.
> >       GBE: simplify processConstant.
> >       GBE: complete constant expression processing.
> >       GBE: enable constant expression processing.
> >       utest: add new test for constant expression processing.
> >       GBE: Reduce random behaviour of the code generation
> >       GBE: adjust preferred vector length.
> >       GBE: refactor the immediate class to support vector data type.
> >       GBE: simplify processConstant.
> >       GBE: complete constant expression processing.
> >       GBE: enable constant expression processing.
> >       utest: add new test for constant expression processing.
> >       Revert "GBE: refine post register allocation scheduling for global buffers."
> >       utests: fix two utest bugs.
> >       GBE: fix error in the rootn fastpath function for some special input.
> >       utests: fix two utest bugs.
> >       GBE: fix error in the rootn fastpath function for some special input.
> >       Add new vload benchmark/test case.
> >       GBE: optimize unaligned char and short data vector's load.
> >       GBE: relax the batch byte/short load vector size restrication.
> >       GBE: refine the unaligned data gathering.
> >       GBE: adjust preferred vector length.
> >       GBE: fixup/refine a bug for image1D array's extra binding index handling.
> >       GBE: remove the user defined macro cl_khr_fp64.
> >       GBE: avoid one optimization pass to generate wide integer.
> >       GBE: avoid one optimization pass to generate wide integer.
> >       GBE: fix a bug with LLVM 3.3.
> >       GBE: fallback if we get a wider than i64 constant.
> >       GBE: fix a bug with LLVM 3.3.
> >       GBE: fallback if we get a wider than i64 constant.
> >       GBE: cleanup image base index related code.
> >       GBE: fixup/refine a bug for image1D array's extra binding index handling.
> >       build: fix a CXXFLAGS override bug in backend directory.
> >       GBE: fix some predfeined OCL macros.
> >       Runtime: Implement clGetExtensionFunctionAddressForPlatform.
> >       Runtime: Implement clGetExtensionFunctionAddressForPlatform.
> >       GBE/libocl: fix the wrong prototype of scalar native_powr.
> >       GBE: fix bugs when handling -cl-std option.
> >       GBE: fix bugs when handling -cl-std option.
> >       GBE/libocl: Added one missing prototype fma().
> >       GBE: don't return error if we get an empty module.
> >       GBE: Fix a potential segfault.
> >       GBE: Fix a potential segfault.
> >       GBE: fix a potential memory leak bug.
> >       GBE: fix a potential memory leak bug.
> >       GBE: don't enable double by default.
> >       GBE: don't enable double by default.
> >       GBE: fix multiple files compilation bugs.
> >       runtime: fix program binary type bug.
> >       runtime: fix build status handling.
> >       runtime: fix program binary type bug.
> >       runtime: fix build status handling.
> >       GBE: fix multiple files compilation bugs.
> >       Update readme.
> >       Update readme.
> >       Document fixup.
> >       Remove out-of-date document.
> >       Bump to 0.9.3.
> >       Remove out-of-date document.
> >       Update NEWS.
> >       GBE/libocl: add missing vector builtin definition for fma.
> >       GBE/libocl: fix a regression after libocl change.
> >       Revert "improve the build performance of vector type built-in function."
> >       GBE/libocl: fix build dependency issue.
> >       GBE: fix a loop header file including bug.
> >       GBE: structurized loop exit need an extra branching instruction when do reordering.
> >       GBE: fix a bug in legalize pass.
> >       GBE: do intrinsics lowering pass earlier.
> >       GBE: fix a legalize pass bug when bitcast wide integer to incompaitble vector.
> >       GBE: Add a customized loop unrolling handling mechanism.
> >       GBE: disable custom loop unroll for LLVM 3.3/3.4.
> >       GBE: add Selection instruction handler at legalize pass.
> >       GBE: increase maximum src/dst operands to 32.
> >       GBE: add basic PHINode support in legalize pass.
> >       GBE: fix regression caused by simple block optimization.
> >       GBE: handle dead loop BBs in liveness analysis.
> >       GBE: set default address space to -1 to avoid incorrect unroll hint.
> >       GBE: fix a wrong type of cl_device_info.
> >       utest: change the box_blur_image to be identical to box_blur.
> >       utests: replace the nodistriutable picture.
> >       GBE: fix disassembly bug.
> >       GBE: fix a bool handling bug when SEL on a uniform bool variable.
> >       GBE: Support more instructions for constant expression handling.
> >       GBE: remove useless debug info.
> >       Revert "add test for clCreateImageFromLibvaIntel"
> >       Revert "fix issue to create cl image from libva with non-zero offset"
> >       utests: remove all shader toy test cases.
> >       License: adjust all license version to LGPL v2.1+.
> >       GBE: fix relocatable issue for pch file.
> >       Revert "BDW: Change the default tiling mode to TILING_Y on BDW."
> >       GBE: fix one double related bugs for post register scheduling.
> >       update some documents.
> >       runtime: fix one bug in BDW image.
> >       Update documents.
> >       runtime: refine version handling.
> >       runtime: fix bug in cl_enqueue_read_buffer.
> >       runtime: disable userptr due to random fail.
> >       GBE: work around error reporting for unresolved symbols
> >       Bump to 1.0.0.
> >
> >
> > --
> > Zhigang Gong,
> > Thanks.
> > _______________________________________________
> > Beignet mailing list
> > Beignet at lists.freedesktop.org
> > http://lists.freedesktop.org/mailman/listinfo/beignet
> 
> 
> 
> --
> -Igor Gnatenko


