[Beignet] [ANNOUNCE] Beignet 1.0.0 (2014-11-14)
Zhigang Gong
zhigang.gong at linux.intel.com
Mon Nov 17 18:13:14 PST 2014
Hi Igor,
Thanks for packaging Beignet for Fedora so promptly. It helps bring the latest Beignet to regular users.
Thanks,
Zhigang Gong.
> -----Original Message-----
> From: Beignet [mailto:beignet-bounces at lists.freedesktop.org] On Behalf Of
> Igor Gnatenko
> Sent: Tuesday, November 18, 2014 4:12 AM
> To: Zhigang Gong
> Cc: michael.fu; Zou Nanhai; An open source open CL implemenation for Intel
> platform
> Subject: Re: [Beignet] [ANNOUNCE] Beignet 1.0.0 (2014-11-14)
>
> Hi,
>
> cool release! I've updated beignet to 1.0.0 for Fedora 20, 21 and Rawhide (22)
> today. Let's use it!
>
> https://bugzilla.redhat.com/show_bug.cgi?id=1142892
>
> On Fri, Nov 14, 2014 at 9:09 AM, Zhigang Gong <zhigang.gong at linux.intel.com>
> wrote:
> > Beignet 1.0.0 (2014-11-14)
> > =========================
> >
> > The Beignet development team is proud to announce that Beignet 1.0.0 has
> > been released. This is an important milestone after about two years of
> > development. Thanks to everyone who helped us bring it to a
> > relatively mature state.
> >
> > Beignet now supports 3rd through 5th Generation Intel Core Processors.
> > Besides Broadwell support, this release also brings major
> > performance improvements for many workloads and fixes some bugs. We
> > observed gains ranging from 10% to more than 4x on some OpenCV 3.0
> > benchmarks.
> >
> > The highlights are:
> >
> > 1. Added 5th generation Intel Core Processors (BDW) support.
> > 2. Optimized constant buffer load.
> > 3. Implemented a basic transformation from unstructured control flow to
> > structured control flow to improve performance.
> > 4. Fixed some memory leak bugs.
> > 5. Implemented missing constant expression handling.
> > 6. Added Clang/ICC compiler support for the Beignet build.
> > 7. Optimized unaligned char/short vector load.
> > 8. Sped up kernel compilation by moving built-in function support
> > from the header file into a linked library.
> > 9. Implemented some missing LLVM intrinsics.
> > 10. Optimized the loop unrolling pass, which boosted some OpenCV benchmarks.
> > 11. Several other bug fixes since the last release. Beignet's pass rates
> > on the OpenCV 3.0, OpenCV 2.4 and piglit test suites are all
> > above 99%.
> >
> > Git tag: Release_v1.0.0
> > Gitweb URL: http://cgit.freedesktop.org/beignet
> > https://01.org/sites/default/files/beignet-1.0.0-source.tar.gz
> >
> > md5sum: bfd755904c332cdd285d6058f5f3de8c Beignet-1.0.0-Source.tar.gz
> > sha1sum: a2b0eb53e5f9a6055cd656531532a4c6ae03fbb0 Beignet-1.0.0-Source.tar.gz
> > sha256sum: e30c4d0f4c8917fa0df2467b2d70a4ee524f28d54c42c582262d5f08928ea543 Beignet-1.0.0-Source.tar.gz
> >
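A minimal sketch of checking a downloaded tarball against the sha256sum quoted above, assuming the file has already been saved locally. The function names are illustrative only and are not part of Beignet or its build scripts.

```python
import hashlib

# SHA-256 digest as quoted in the announcement above.
EXPECTED_SHA256 = "e30c4d0f4c8917fa0df2467b2d70a4ee524f28d54c42c582262d5f08928ea543"


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read fixed-size chunks so a large tarball is never held in memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_announced_digest(path, expected=EXPECTED_SHA256):
    """Return True if the file's digest equals `expected` (case-insensitive)."""
    return sha256_of_file(path) == expected.strip().lower()
```

Calling `matches_announced_digest("Beignet-1.0.0-Source.tar.gz")` on the downloaded file should return True only if the file is byte-for-byte what was announced. The path is an assumption about where the tarball was saved.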
> > -----------------------------------------------------------------
> >
> > Changes since 0.9.3:
> >
> > Andreas Beckmann (2):
> > fix some typos
> > use env to set environment variables for GBE_BIN_GENERATER
> >
> > Chuanbo Weng (1):
> > utest: add new test that trigger an assignment operation bug in if.
> >
> > Guo Yejun (18):
> > remove requirment as drm master in non-x environment
> > remove requirment as drm master in non-x environment
> > free build_log when the cl program is released
> > free build_log when the cl program is released
> > fix three memory leaks
> > clean llvm resource in compiler (libgbe.so)
> > fix three memory leaks
> > clean llvm resource in compiler (libgbe.so)
> > delete GEPInst when it is no longer used
> > delete GEPInst when it is no longer used
> > remove dependency for non-X runtime environment
> > remove dependency for non-X runtime environment
> > support CL_MEM_USE_HOST_PTR with userptr for cl buffer
> > enable CL_DEVICE_HOST_UNIFIED_MEMORY when userptr is supported
> > add test for cl buffer created with CL_MEM_USE_HOST_PTR
> > fix issue to create cl image from libva with non-zero offset
> > add test for clCreateImageFromLibvaIntel
> > use posix_memalign instead of aligned_alloc to be more
> > compatible
> >
> > Junyan He (54):
> > Fix the global string bug for printf.
> > Fix a bug for runtime_barrier_list.cpp, event array out of bound
> > Fix a bug for runtime_barrier_list.cpp, event array out of bound
> > Fix the global string bug for printf.
> > Add common define header files to initialize the libocl
> > Add the async module into the libocl
> > Add the atomic module into the libocl
> > Add the geometric module into the libocl
> > Add the image module into the libocl
> > Add the misc module into the libocl
> > Add the sync module into the libocl
> > Add printf module into libocl
> > Add vload module into the libocl
> > Add thw workitem module into the libocl
> > Add the convert and as modules into the libocl
> > Add the gen_vector script into the libocl
> > Add the common module into the libocl as template
> > Add the integer module into libocl as template
> > Add the math function into libocl as template
> > Add the relational module into libocl as template
> > Add the ocl_defines header file into libocl
> > Add memcpy, memset and barrier bitcode files into libocl
> > Add the bit code linker into the module pass.
> > Enable libocl and disable the usage of the old huge header.
> > Use the PCH to accelerate the parsing speed of the ocl.h
> > Delete all the unused files of old huge header.
> > Add the missing function prototypes of any() and atom_add()
> > Add uncompatible PCH Options to avoid compiling failure.
> > Fix the global string bug for printf.
> > Add copyright header for all libocl files.
> > Fix the issue of -cl-std=CLX.X option.
> > Fix the issue of -cl-std=CLX.X option.
> > Add the switch logic for math conformance fast path
> > Modify the CMakeList to use the internal PCH first.
> > Fix the bug of LLVM_LFLAGS fail to set
> > Add long support for printf
> > BDW: Add gen8 surface state struct.
> > BDW: refine the gen8_surface_state_t.
> > BDW: Add function intel_gpgpu_setup_bti for gen8.
> > BDW: Correct surface base address set in setup bti.
> > BDW: Add function intel_gpgpu_bind_buf for gen8.
> > Add sampler state and tile define for gen8.
> > Modify the bind sampler logic for gen8
> > BDW: Add gen8 into intel_driver_init
> > Refine the shared function ID define.
> > Add the libdrm version check.
> > Let the failure of intel_drm lib's check as a FATAL_ERROR
> > Fit the printf bug in loop
> > Fix the bug of 1D array slice pitch
> > Add the test case for image 1d array fill
> > Add the test case for image 2d array fill
> > Add the disasm support for Gen8
> > Fix the compare_image_2d_and_1d_array test case bug
> > Fix the bug of multi-thread crash
> >
> > Luo (5):
> > remove lspci, gbe_bin_genenrater would generator llvm binary by default.
> > remove lspci, gbe_bin_genenrater would generator llvm binary by default.
> > fix piglit get kernel info FUNCTION ATTRIBUTE fail.
> > fix piglit get kernel info FUNCTION ATTRIBUTE fail.
> > add opencl-1.2 builtin function popcount.
> >
> > Luo Xionghu (28):
> > fix the relational built-in vector function regression.
> > fix opencv_test_imgproc subcase OCL_ImgProc/Accumulate.Mask regression.
> > fix piglit cl-api-get-program-info fail.
> > fix piglit cl-api-get-program-info fail.
> > fix clGetKernelWorkGroupInfo built-in kernel fail.
> > fix piglit cl-api-set-kernel-arg fail.
> > fix clGetKernelWorkGroupInfo built-in kernel fail.
> > fix piglit cl-api-set-kernel-arg fail.
> > fix bin/cl-program-tester tests/cl/program/execute/attributes.cl regression.
> > fix bin/cl-program-tester tests/cl/program/execute/attributes.cl regression.
> > remove the LinkOnceAnyLinkage since the libocl is introduced.
> > improve the build performance of vector type built-in function.
> > fix one bug at cl_get_kernel_workgroup_info.
> > fix utest memory leak.
> > Add Gen IR WHILE.
> > add handleSelfLoopNode to insert while instruction on Gen IR level.
> > Use instruction WHILE to manipulate structure.
> > add utest popcount for all types.
> > use global flag 0.0 to control unstructured simple block.
> > add llvm Intrinsic call support.
> > add utest compiler_overflow for llvm intrinsic function.
> > enable llvm intrinsic call usub_with_overflow funtion.
> > add utest for llvm intrinsic call usub_with_overflow funtion.
> > enable llvm intrinsic call bswap function.
> > add utest function bswap.
> > fix bswap kernel function type issue.
> > fix piglit clCreateProgramWithBinary fail.
> > fix a bug in clCompileProgram().
> >
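For readers unfamiliar with the built-ins named in the entries above (popcount, bswap, usub_with_overflow), here is a small reference model of what each computes on 32-bit values. It is a sketch of the general semantics only, not Beignet's implementation, and the Python function names are chosen purely for illustration.

```python
def popcount(x, bits=32):
    """Number of set bits in x, viewed as an unsigned `bits`-wide integer."""
    return bin(x & ((1 << bits) - 1)).count("1")


def bswap32(x):
    """Reverse the byte order of a 32-bit value."""
    return int.from_bytes((x & 0xFFFFFFFF).to_bytes(4, "little"), "big")


def usub_with_overflow(a, b, bits=32):
    """Unsigned a - b: return (result wrapped to `bits` bits, overflow flag)."""
    mask = (1 << bits) - 1
    a &= mask
    b &= mask
    # The subtraction overflows (borrows) exactly when b is larger than a.
    return (a - b) & mask, a < b
```

For example, `bswap32(0x11223344)` gives `0x44332211`, and `usub_with_overflow(3, 5)` gives `(0xFFFFFFFE, True)`.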
> > LuoXionghu (5):
> > add platform info in the gen binary code.
> > add utest load_program_from_gen_bin.
> > add platform info in the gen binary code.
> > add utest load_program_from_gen_bin.
> > improve the build performance of vector type built-in function.
> >
> > Lv Meng (6):
> > improve the clEnqueueCopyBufferRect performance in some cases
> > Fix compile error for ICC compiler
> > Fix compile errors for CLANG compiler
> > Fix compile warnings for ICC compiler
> > Fix compile warnings for CLANG compiler
> > Enable ICC and CLANG compiler for beignet
> >
> > Meng Mengmeng (3):
> > add beignet GIT_HAL1 if there is .git directory
> > create GIT_SHA1 without any dependency
> > add building dependency GIT_SHA1
> >
> > Rebecca Palmer (7):
> > Fail gracefully on unsupported hardware
> > Fail gracefully on unsupported hardware
> > GBE: fix bug in pow()/pown().
> > GBE: fix bug in erf()/erfc().
> > GBE: fix bug in tgamma().
> > utests: fix bugs in builtin_pow().
> > utests: fix bugs in builtin_tgamma().
> >
> > Ruiling Song (43):
> > GBE: Fix builtin tanpi.
> > GBE: Fix builtin tanpi.
> > GBE: Use varying register to save one instruction
> > GBE: Optimize constant load with sampler.
> > GBE: align the fields in union ImageInfoKey.
> > utests: Fix a bug in image_1D_buffer.
> > GBE: align the fields in union ImageInfoKey.
> > utests: Fix a bug in image_1D_buffer.
> > runtime: set correct state for constant buffer on hsw.
> > runtime: set correct state for constant buffer on hsw.
> > GBE: Refine bti usage in backend & runtime.
> > GBE: Handle bti allocation for internal buffer used by printf.
> > GBE: remove some useless code for getting printf buffer address.
> > GBE: Fix a warning in getConstantPointerRegister.
> > GBE: Fix type size for vector3
> > GBE: initialize BTI structure to zero.
> > GBE: Fix a bug in gatherBTI.
> > cmake: Fix a license issue.
> > GBE: clear deadprintfs when current function is done.
> > GBE: refine the llvm multi-thread related code.
> > GBE: Fix type size for vector3
> > cmake: Fix a license issue.
> > GBE: clear deadprintfs when current function is done.
> > GBE: refine the llvm multi-thread related code.
> > GBE: Optimize constant load with sampler.
> > GBE: Refine bti usage in backend & runtime.
> > GBE: Handle bti allocation for internal buffer used by printf.
> > GBE: initialize BTI structure to zero.
> > GBE: Fix a bug in gatherBTI.
> > GBE/libocl: Fix sub_sat corner case.
> > GBE: Fix sub_sat corner case.
> > GBE: Output linkModules's error message.
> > GBE/libocl: Add __gen_ocl_get_timestamp() to get timestamp.
> > GBE: Fix a bug when setting flag register
> > GBE: add legalize pass to handle wide integers
> > Re-apply "improve the build performance of vector type built-in function."
> > GBE: workaround register allocation fail caused by custom loop unroll.
> > GBE: Fix live range for temporary register in replaceReg
> > GBE: Fix kernel argument size for vector3
> > utests: add a test to trigger cl_float3 bug in clSetKernelArg.
> > GBE: Fix a bitcast from float vector to wide interger issue in legalize pass.
> > GBE: Do topological sorting of basicblocks.
> > docs: update mixed_buffer_pointer document.
> >
> > Yang Rong (54):
> > Add some hsw missed pci ids (reserved PCI IDs).
> > Add some hsw missed pci ids (reserved PCI IDs).
> > Fix a utest compiler_async_stride_copy typo.
> > Fix a utest compiler_async_stride_copy typo.
> > Only compiler X11 files and do X11 operations when found X11.
> > Only compiler X11 files and do X11 operations when found X11.
> > Update Beignet.mdwn X11 dependency.
> > Two minor fix.
> > Fix two bugs.
> > Update Beignet.mdwn X11 dependency.
> > Two minor fix.
> > Fix two bugs.
> > Update README for the command parser in drm kernel.
> > Update README for the command parser in drm kernel.
> > Update license disclaimer.
> > Update license disclaimer.
> > Avoid use GenNativeInstruction directly out of GenEncode and gen_insn_compact.
> > BDW: Add BDW pci ids and BDW device struct.
> > BDW: Add BDW instruction define.
> > BDW: Add Gen8Encoder and Gen7Encoder.
> > BDW: Add class Gen8Context.
> > BDW: Pass Jip and Uip when patchJMPI.
> > BDW: Refine intel_gpgpu_setup_bti and add intel_gpgpu_set_base_address for BDW.
> > BDW: add some BDW function.
> > BDW: Fix Pointer argument curbe alloce size.
> > BDW: enable SLM in BDW.
> > BDW: Fix unsample bug.
> > BDW: Refine BDW's int 32*32 multiply.
> > BDW: BDW don't need add slm offset, remove it.
> > BDW: Add BDW Device id to gen binary generater and binary serialize in backend.
> > BDW: Add device's sub slice field, for cl_get_kernel_max_wg_sz.
> > BDW: Correct scratch buffer of BDW.
> > BDW: Forgot to set UIP of else in BDW.
> > BDW: Correct BDW device name.
> > BDW: Fix a scaler int 32*32 bug.
> > BDW: Need not restore SLM setting in BDW.
> > BDW: Correct stack setting in BDW.
> > Fix a segment fault.
> > Fix a HSW regression.
> > Fix memcpy and memset bug.
> > Fix HSW thread_n <= 64 assert.
> > Fix a HSW constant buffer regression.
> > BDW: Change BDW's max work group size to 512.
> > BDW: Fix load/store half error.
> > BDW: Also need set Shader Channel Select for constant buffer in BDW.
> > Fix a upsample regression.
> > Fix a HSW regression.
> > Refine the the error handling in function cl_command_queue_ND_range_gen7.
> > Refine the intel gpgpu delete.
> > Fix a size assert when setup bti.
> > BDW: Fix bwd 32*32 scalar multiplication bug.
> > IVB/HSW/BYT: Revert the Dynamic state Base Addr and relative buffers address setting.
> > BDW: Set the URB/REST size to 384K/384K when SLM disable.
> > BDW: Change the default tiling mode to TILING_Y on BDW.
> >
> > Yichao Yu (1):
> > Use ${PYTHON_EXECUTABLE} to run python scripts.
> >
> > Yongjia Zhang (6):
> > Add Gen IR IF, ELSE and ENDIF
> > Add Gen instruction 'else'
> > Add structure identification on ir level
> > Use instruction if else and endif manipulate structures
> > Enable structural analysis
> > GBE: fix empty block disassemble bug.
> >
> > Zhenyu Wang (5):
> > Make use of write enable flag for mem bo map
> > Clear batch buffer pointer after unmap
> > Use pread/pwrite for buffer enqueue read/write
> > Fix AUX buffer for page alignment
> > Remove intel_gpgpu_check_binded_buf_address()
> >
> > Zhigang Gong (111):
> > Build: Change versioning policy.
> > runtime/driver: refine error handlings.
> > runtime: fix some subtle event bugs.
> > runtime/driver: refine error handlings.
> > runtime: fix some subtle event bugs.
> > gbe: add the new else instruction to the assert checking.
> > docs: add a NEWS document to point to the release notes pages.
> > docs: add a NEWS document to point to the release notes pages.
> > Bump to 0.9.2.
> > NEWS: update for 0.9.2.
> > GBE: cleanup image base index related code.
> > GBE: refine post register allocation scheduling for global buffers.
> > GBE: refactor the immediate class to support vector data type.
> > GBE: simplify processConstant.
> > GBE: complete constant expression processing.
> > GBE: enable constant expression processing.
> > utest: add new test for constant expression processing.
> > GBE: Reduce random behaviour of the code generation
> > GBE: adjust preferred vector length.
> > GBE: refactor the immediate class to support vector data type.
> > GBE: simplify processConstant.
> > GBE: complete constant expression processing.
> > GBE: enable constant expression processing.
> > utest: add new test for constant expression processing.
> > Revert "GBE: refine post register allocation scheduling for global buffers."
> > utests: fix two utest bugs.
> > GBE: fix error in the rootn fastpath function for some special input.
> > utests: fix two utest bugs.
> > GBE: fix error in the rootn fastpath function for some special input.
> > Add new vload benchmark/test case.
> > GBE: optimize unaligned char and short data vector's load.
> > GBE: relax the batch byte/short load vector size restrication.
> > GBE: refine the unaligned data gathering.
> > GBE: adjust preferred vector length.
> > GBE: fixup/refine a bug for image1D array's extra binding index handling.
> > GBE: remove the user defined macro cl_khr_fp64.
> > GBE: avoid one optimization pass to generate wide integer.
> > GBE: avoid one optimization pass to generate wide integer.
> > GBE: fix a bug with LLVM 3.3.
> > GBE: fallback if we get a wider than i64 constant.
> > GBE: fix a bug with LLVM 3.3.
> > GBE: fallback if we get a wider than i64 constant.
> > GBE: cleanup image base index related code.
> > GBE: fixup/refine a bug for image1D array's extra binding index handling.
> > build: fix a CXXFLAGS override bug in backend directory.
> > GBE: fix some predfeined OCL macros.
> > Runtime: Implement clGetExtensionFunctionAddressForPlatform.
> > Runtime: Implement clGetExtensionFunctionAddressForPlatform.
> > GBE/libocl: fix the wrong prototype of scalar native_powr.
> > GBE: fix bugs when handling -cl-std option.
> > GBE: fix bugs when handling -cl-std option.
> > GBE/libocl: Added one missing prototype fma().
> > GBE: don't return error if we get an empty module.
> > GBE: Fix a potential segfault.
> > GBE: Fix a potential segfault.
> > GBE: fix a potential memory leak bug.
> > GBE: fix a potential memory leak bug.
> > GBE: don't enable double by default.
> > GBE: don't enable double by default.
> > GBE: fix multiple files compilation bugs.
> > runtime: fix program binary type bug.
> > runtime: fix build status handling.
> > runtime: fix program binary type bug.
> > runtime: fix build status handling.
> > GBE: fix multiple files compilation bugs.
> > Update readme.
> > Update readme.
> > Document fixup.
> > Remove out-of-date document.
> > Bump to 0.9.3.
> > Remove out-of-date document.
> > Update NEWS.
> > GBE/libocl: add missing vector builtin definition for fma.
> > GBE/libocl: fix a regression after libocl change.
> > Revert "improve the build performance of vector type built-in function."
> > GBE/libocl: fix build dependency issue.
> > GBE: fix a loop header file including bug.
> > GBE: structurized loop exit need an extra branching instruction when do reordering.
> > GBE: fix a bug in legalize pass.
> > GBE: do intrinsics lowering pass earlier.
> > GBE: fix a legalize pass bug when bitcast wide integer to incompaitble vector.
> > GBE: Add a customized loop unrolling handling mechanism.
> > GBE: disable custom loop unroll for LLVM 3.3/3.4.
> > GBE: add Selection instruction handler at legalize pass.
> > GBE: increase maximum src/dst operands to 32.
> > GBE: add basic PHINode support in legalize pass.
> > GBE: fix regression caused by simple block optimization.
> > GBE: handle dead loop BBs in liveness analysis.
> > GBE: set default address space to -1 to avoid incorrect unroll hint.
> > GBE: fix a wrong type of cl_device_info.
> > utest: change the box_blur_image to be identical to box_blur.
> > utests: replace the nodistriutable picture.
> > GBE: fix disassembly bug.
> > GBE: fix a bool handling bug when SEL on a uniform bool variable.
> > GBE: Support more instructions for constant expression handling.
> > GBE: remove useless debug info.
> > Revert "add test for clCreateImageFromLibvaIntel"
> > Revert "fix issue to create cl image from libva with non-zero offset"
> > utests: remove all shader toy test cases.
> > License: adjust all license version to LGPL v2.1+.
> > GBE: fix relocatable issue for pch file.
> > Revert "BDW: Change the default tiling mode to TILING_Y on BDW."
> > GBE: fix one double related bugs for post register scheduling.
> > update some documents.
> > runtime: fix one bug in BDW image.
> > Update documents.
> > runtime: refine version handling.
> > runtime: fix bug in cl_enqueue_read_buffer.
> > runtime: disable userptr due to random fail.
> > GBE: work around error reporting for unresolved symbols
> > Bump to 1.0.0.
> >
> >
> > --
> > Zhigang Gong,
> > Thanks.
> > _______________________________________________
> > Beignet mailing list
> > Beignet at lists.freedesktop.org
> > http://lists.freedesktop.org/mailman/listinfo/beignet
>
>
>
> --
> -Igor Gnatenko