[Beignet] [ANNOUNCE] Release notes for beignet version 0.3

Tue Oct 22 17:23:44 PDT 2013

Congratulations!

Homer Hsing

From: beignet-bounces+homer.xing=intel.com at lists.freedesktop.org [mailto:beignet-bounces+homer.xing=intel.com at lists.freedesktop.org] On Behalf Of Yang, Rong R
Sent: Tuesday, October 22, 2013 4:08 PM
To: beignet at lists.freedesktop.org
Subject: [Beignet] [ANNOUNCE] Release notes for beignet version 0.3

Release 0.3 (2013-10-22)
=========================

What's new?
===========

This release adds many new features, enhancements, and bug fixes.
Here is a short list:

1. Implemented all builtin functions.
2. Support long/ulong.
3. Support register spilling/filling.
4. Support events.
5. Support profiling.
6. Experimental integration with the libva driver (use OpenCL to do video post processing for libva).
7. Implement more OpenCL APIs.
   clEnqueueCopyImage/ImageToBuffer/BufferToImage/...
8. Implement loading and storing of program binaries.
9. Fix some random hang bugs.
10. Other bug fixes.
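
To make items 1 and 2 more concrete, here is a small reference model of a few
64-bit integer built-ins named in the detailed change list further down
(add_sat, sub_sat, hadd, rhadd, mul_hi). It is only a sketch of the semantics
as described in the OpenCL C specification for the signed "long" type, written
in Python for readability. It is not Beignet's implementation, and the constant
names LONG_MIN and LONG_MAX are chosen here for illustration.

```python
# Reference model of some OpenCL C integer built-ins for the signed 64-bit
# type "long". Assumption: this mirrors the spec semantics only; Beignet
# itself implements these on the GPU, not like this.

LONG_MIN = -(1 << 63)
LONG_MAX = (1 << 63) - 1


def _clamp(value):
    """Clamp an arbitrary-precision integer into the signed 64-bit range."""
    return max(LONG_MIN, min(LONG_MAX, value))


def add_sat(x, y):
    """x + y, saturating at LONG_MIN/LONG_MAX instead of wrapping."""
    return _clamp(x + y)


def sub_sat(x, y):
    """x - y, saturating at LONG_MIN/LONG_MAX instead of wrapping."""
    return _clamp(x - y)


def hadd(x, y):
    """(x + y) >> 1, where the intermediate sum does not overflow."""
    return (x + y) >> 1


def rhadd(x, y):
    """(x + y + 1) >> 1, where the intermediate sum does not overflow."""
    return (x + y + 1) >> 1


def mul_hi(x, y):
    """High 64 bits of the full 128-bit signed product x * y."""
    return (x * y) >> 64


if __name__ == "__main__":
    print(add_sat(LONG_MAX, 1) == LONG_MAX)
    print(hadd(LONG_MAX, LONG_MAX) == LONG_MAX)
    print(mul_hi(1 << 62, 4))
```

Python integers have arbitrary precision, so the intermediate results never
wrap. That makes the model handy for checking expected values in a test case,
but it says nothing about how the GPU code avoids overflow.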

Where can I get it?
===================

git address: git://anongit.freedesktop.org/beignet
git tag: Release_v0.3
tarball: http://cgit.freedesktop.org/beignet/snapshot/Release_v0.3.tar.gz
mailing list: http://lists.freedesktop.org/mailman/listinfo/beignet

What is it?
===========
Beignet is an open source implementation of the OpenCL specification, a generic
compute-oriented API. This code base contains the code to run OpenCL programs on
Intel GPUs: it defines and implements the OpenCL host functions required to
initialize the device, create the command queues, kernels, and programs, and
run them on the GPU. The code base also contains the compiler part of the
stack, which lives in `backend/`. For more specific information about the
compiler, please refer to `backend/README.md`.

Visit the wiki at http://wiki.freedesktop.org/www/Software/Beignet/ for more
information about the project.

What changed in detail since version 0.2?
=========================================
Boqun Feng (1):
      GBE: define python interpreter by cmake variable

Chuanbo Weng (1):
      Add a test case that trigger a known bug.

Homer Hsing (92):
      support built-in functions "mul24", "mad24"
      test cases for "mul24", "mad24"
      support built-in functions "degrees" and "radians"
      test built-in functions "degrees" and "radians"
      support const indexed global constant array
      test const-indexed global constant array
      support built-in function "bitselect"
      test built-in function "bitselect"
      improve clCreateContext conformance
      support clGetImageInfo
      test API function "clGetImageInfo"
      support built-in function "smoothstep"
      test function "smoothstep"
      add built-in function "mad_sat"
      test function "mad_sat"
      built-in function "sign"
      test built-in function "sign"
      fix vectorial built-in functions "min, max, clamp"
      add built-in function "frexp"
      test built-in function "frexp"
      add built-in function "nextafter"
      test built-in function "nextafter"
      add builtin function "modf"
      test builtin function "modf"
      add built-in function "remquo"
      test built-in function "remquo"
      add address_space modifier to builtin functions' pointer
      add builtin function "shuffle"
      test builtin function "shuffle"
      revise built-in function "shuffle"
      add address space qualifier to "remquo"
      add address space qualifier to "modf"
      add built-in function "islessgreater"
      add built-in function "isordered", "isunordered"
      add built-in function "shuffle2"
      test built-in function "shuffle2"
      test if register allocation and 64-bit reading are fixed
      support 64bit-integer reading(writing)
      support 64bit-integer addition, subtraction
      support 64bit-integer immediate value
      support 64bit-integer AND(&), OR(|), XOR(^) arithmetic
      test 64bit-integer immediate value, and "and", "or", "xor" arithmetic
      keep address space qualifier of pointers
      support 64bit-integer selection operator "?:"
      test 64bit-integer selection operator
      no "div by zero" in smoothstep test case
      Define temporary reg as dest reg of instruction
      support converting shorter int to 64bit int
      support 64bit-integer shifting
      test 64bit-integer shifting
      support 64bit-integer multiplication
      support 64bit-integer comparing
      test 64bit-integer comparing
      support built-in function mad_sat(int) and mad_sat(uint)
      add empty 64bit-integer version built-in functions
      add 64bit version of "upsample"
      test 64bit version of "upsample"
      fix a typo
      GBE: skip instruction pattern match for 64 bit sel_cmp.
      enable unsigned 64bit version of "abs_diff"
      enable signed 64-bit version of "abs_diff"
      improve built-in function "sinpi"
      add built-in function "tgamma"
      add built-in function "lgamma", "lgamma_r"
      add 64-bit version of "bitselect"
      fix scalar type built-in function "select"
      Add scalar version of "convert_*(*)"
      add 64-bit version of "shuffle", "shuffle2"
      fix 8-bit version of "clz"
      add 64-bit version of "clz"
      add 64-bit version of "rotate"
      fix 32-bit signed version of "sub_sat"
      add same type "convert_*(*)"
      fix GPU data type for 16-bit moving
      fix 64-bit "clz" if parameter is "long4" or "ulong4"
      add built-in function "atan2"
      support converting 64-bit integer to shorter integer
      add 64-bit version of "hadd"
      add built-in function "atan2pi"
      support converting 64-bit integer to 32-bit float
      add 64-bit version of "rhadd"
      fix scalarizing of llvm phi node
      64-bit-int: allocate flag register by RA
      fix 64bit writing
      add 64-bit version of "mul_hi"
      add 64-bit version of "mad_sat"
      support 64-bit version "add_sat"
      add 64-bit version of "sub_sat"
      support 64-bit division and remainder
      fix isnan (builtin function)
      saturated conversion of native GPU data type, larger to narrower
      support LLVM 3.4

Junyan He (16):
      Improve the clEnqueueMapBuffer and clCreateBuffer API
      Add the support for clSetMemObjectDestructorCallback API
      Improve the clGetMemObjectInfo API, add more info option
      Add the PCH support when building the source.
      Add the serialization support for backend
      Add one tool program to build and serial the program.
      Implement the clCreateProgramWithBinary to deseralize the binary.
      Add a test case for binary load.
      Add the virtual dctr function of Serialization to kill warning.
      Add the string format support for gbe_bin_generater
      Add the internal used kernels for buffer copy
      Implement the clEnqueueCopyBuffer API using internal binary kernel
      Add the test case for clEnqueueCopyBuffer
      Delete the redundant intel_batchbuffer_t init in intel_gpgpu_new
      Implement the CL api for clGetEventProfilingInfo
      Using the PIPE_CONTROL to implement get time stamp in gen backend

Lu Guanqun (9):
      fix left shift warnings in utests
      fix left shift warning
      fix warning when egl is not there
      rename ulong to ulong64 to avoid the conflicts in <sys/types.h>
      list all available utests' names
      add a space to make the error more readable
      we should check the 'err' parameter
      fix the missing assignment for offset
      refactor the api of intel_driver_share_buffer

Ruiling Song (23):
      Fix a bug in stack calculation.
      enable scratch memory allocation and read/write
      Implement spill/unspill
      Fix a re-schedule issue of scratch write
      Skip spill/unspill instruction when trying to do spill.
      GBE: Clear Flag register to fix a gpu hang.
      utests: Add a unit test for non-aligned group size.
      Fix utest compiler_group_size4 error.
      utest: memset the output buffer to fix random fail.
      GBE: Enable DWord scatter gather message for constant cache read.
      Change constant unit test to cover 4 byte data type.
      Implement constant buffer based on constant cache.
      Fix non-4byte program global constant issue.
      change constant test case to cover short/long type.
      GBE: Support composite type constant.
      utests: add more constant test cases for composite type.
      GBE: Fix a constant bug which over-write memory.
      utests: put compiler_vector_inc into known issue list.
      GBE: Support local variable inside kernel function.
      GBE: Update program binary format.
      GBE: Inline all function calls.
      GBE: Skip non-kernel functions in backend passes.
      utests: add test cases for function call.

Simon Richter (4):
      Fix OpenCL C version format
      Use access() instead of fopen() to search for PCH
      Add generated header and PCH to gitignore
      ICD dispatch table must be first

Yang Rong (44):
      Add build clang option fno-builtin to disable intrinsics.
      Add the empty functions of cl_enqueueXXX.
      Add a struct and a function to handle all implemented enqueue api.
      Add some functions to support event in intel gpgpu.
      Add function cl_command_queue_flush to flush a command
      Add openCL event support.
      Add event unit test.
      Add bool move imm support.
      Add a load bool imm test case.
      Fix event pthread_mutex_lock dead lock.
      Fix unit test compiler_load_bool_imm error.
      Implement async and prefetch built-in.
      Add async copy and async stride copy test case.
      Add pfn_notify support in clCreateContext.
      Add clEnqueueMapBuffer and clEnqueueMapImage non-blocking map support.
      Change event test case to cover clEnqueueMapBuffer.
      Correct event type' typo.
      Fix atomic_xchg float type error.
      Add clEnqueueReadBufferRect api.
      Add clEnqueueWriteBufferRect api.
      Add clEnqueueCopyBufferRect api.
      Add api clEnqueueCopyImage.
      Implement api clEnqueueTask and clEnqueueNativeKernel.
      Implement api clEnqueueCopyImageToBuffer.
      Implement api clEnqueueCopyBufferToImage.
      Fix cl_mem_kernel_copy_image typo.
      Remove non-used data in clEnqueueMapImage to fix, and fix a clGetEventInfo bug.
      Refine and fix some event bugs.
      Implement clEnqueueMarker and clEnqueueBarrier.
      Fix store undef value assert.
      Unmap the cl_mem in driver when application map a cl_mem and release without unmap.
      Fix clEnqueueMapImage error.
      Remove global offset need divide by local size restriction.
      Change optimize level to -O2, to avoid loopunswitch opt.
      Remove blocking asserts in clEnqueueXXX apis.
      Add some preprocessor macros __IMAGE_SUPPORT__ and __FAST_RELAXED_MATH__ define.
      Implement api clCreateKernelsInProgram.
      Fix a vector argument deallocate assert.
      Refine vector register deallocate.
      Change -O3 to -O2 again because my previous change's typo.
      Fix a read64/write64 schedule bug.
      Add type long/ulong/double's async copy.
      Remove newValueProxy from scalarize pass to genWriter pass.
      Add test case for newValueProxy of InsertElementInst.

Yi Sun (9):
      utest: add built-in test case for get_global_id.
      utest: Add test for built-in function get_local_size.
      utest: Add test for built-in function get_local_id.
      utests: Add a test case for built-in functions get_num_groups.
      Improve the accuracy of built-in function asin.
      Handle boundary and illegal values.
      utest: Add test case for function acos/acosh/asin/asinh.
      Utests_run: Add known issue cases support.
      utest.cpp: run the cases with issue seperately.

Zhigang Gong (56):
      CL: Refine the version string handling.
      utest: Query the device driver version and the open cl c version.
      Frexp support global memory directly
      Implement a pyton script to auto generate those builtin vector functions.
      Split the thounsands autogenerated code out from ocl_stdlib header file.
      check whether python is installed.
      Add misc builtin vector functions.
      Fix the indention handling in vector builtin function generator.
      Enable islessgreater/isordered/isunordered builtin vector functions.
      Added memory space parameters support at the autogeneration script.
      Need to define local to __local.
      GBE: refactor double support.
      GBE: enable double vector load/store support.
      GBE: fix insntruction scheduling related bugs in read64/write64.
      GBE: Fix one bug in instruction scheduling.
      Utests: enable long/ulong in vector load/store test case.
      GBE: Fixed a bug and release 2 or 3 simdWidth register space.
      Driver: Fix the incorrect size of surface 1.
      GBE: set temporary address register for read64 to U64.
      GBE: I64CMP should be treated as CMP in reg allocation and insn scheduling.
      GBE: fix an illegal instruction.
      Utests: enable long/ulong for abs_diff test case.
      GBE: disable cl_khr_fp64.
      CL: Refactor cl_mem's implementation.
      Runtime: fix the incorrect platform info size (conformance).
      GBE: don't use flag register as src 1 for xor instruction.
      GBE: add some macros for atom_xxx builtin functions.
      GBE: null register could be used as src1.
      Runtime: clEnqueueMapImage also need to maintain the mapped images.
      Runtime: vendor specified information is required for CL_DEVICE_VERSION/OPENCL_C_VERSION.
      Runtime: initialize single fp mode correctly.
      GBE: We should set no predication/mask for EOT preparation.
      Runtime: fix the max group size for GT2.
      GBE: Support builtin vector functions for select() autogeneration.
      Runtime: fix the incorrect global mem size.
      Utests: Enable bool_cross_basic_block.
      GBE: silent the compilation warning when generate the pch file.
      Runtime: Only return the format allowed in the spec.
      CL: Enalbe gl sharing with new egl extension.
      Runtime: disable some unecessary image formats.
      Runtime: fix a bug when set sampler value.
      Runtime: enable border color state support.
      GBE: check the correct register for whether coord z exists.
      GBE: fixed the broken 3d image support.
      Runtime/driver : implement 3D image support.
      Utests: refine the previous fake 3D test cases.
      Runtime: prepare for CL_MEM_USE_HOST_PTR for image support.
      Runtime: Implement CL_MEM_USE_HOST_PTR flag for image.
      GBE: fixed the store3 bug.
      Refine cmake script file.
      clCopyImage: fix up all the surface type to int type.
      GBE/Runtime: implement workaround for IVB sampler bug
      GBE: Fix the out-of-box checking for normalized coord clamping.
      GBE: refact the curbe register payload allocation.
      GBE: Refine the curbe entry allocation for sampler/image information.
      GBE: sampler_t should always be a const int.

Zou Nan hai (2):
      Flush the queue after enqueue.
      use r112 as source of EOT message

Thanks,
Yang Rong