[Beignet] [PATCH] Update documents.

Zhigang Gong zhigang.gong at intel.com
Mon Jan 20 16:19:27 PST 2014


Signed-off-by: Zhigang Gong <zhigang.gong at intel.com>
---
 docs/Beignet.mdwn              |  110 ++++++++++++++++++++++------------------
 docs/Beignet/Backend/TODO.mdwn |   35 ++++++++++---
 2 files changed, 89 insertions(+), 56 deletions(-)

diff --git a/docs/Beignet.mdwn b/docs/Beignet.mdwn
index 0830004..278441a 100644
--- a/docs/Beignet.mdwn
+++ b/docs/Beignet.mdwn
@@ -9,9 +9,44 @@ the programs and run them on the GPU. The code base also contains the compiler
 part of the stack which is included in `backend/`. For more specific information
 about the compiler, please refer to `backend/README.md`
 
-How to build
+Prerequisite
 ------------
 
+The project depends on the following external libraries:
+
+- Several X components (XLib, Xfixes, Xext)
+- libdrm libraries (libdrm and libdrm\_intel)
+- Various LLVM components
+- The compiler backend itself (libgbe)
+- Mesa git master version built with gbm enabled to support extension cl\_khr\_gl\_sharing.
+
+Note that the compiler depends on LLVM (Low-Level Virtual Machine project).
+Right now, the code has been compiled with LLVM 3.3/3.4. It will not compile
+with anything older.
+
+[http://llvm.org/releases/](http://llvm.org/releases/)
+
+LLVM 3.3 and 3.4 are supported. For now, the recommended LLVM version is 3.3.
+There are some severe OpenCL-related regressions in the current clang 3.4 version.
+
+Also note that the code was compiled on GCC 4.6 and GCC 4.7. Since the code uses
+really recent C++11 features, you may expect problems with older compilers. Last
+time I tried, the code breaks ICC 12 and Clang with internal compiler errors
+while compiling anonymous nested lambda functions.
+
+If you want to work with the standard ICD libOpenCL.so, you need
+two more packages (the following package names are for Ubuntu):
+- ocl-icd-dev
+- ocl-icd-libopencl1
+
+If you don't want to enable ICD, or your system doesn't have ICD OpenCL support,
+you can still link to the beignet OpenCL library. You can find the beignet/libcl.so
+in your system's library installation directories.
+
+
+How to build and install
+------------------------
+
 The project uses CMake with three profiles:
 
 1. Debug (-g)
@@ -26,41 +61,28 @@ Basically, from the root directory of the project
 
 `> cmake ../ # to configure`
 
-Choose whatever you want for the build.
-
-Then press 'c' to configure and 'g' to generate the code.
+CMake will check the dependencies and will complain if it does not find them.
 
 `> make`
 
-The project depends on several external libraries:
-
-- Several X components (XLib, Xfixes, Xext)
-- libdrm libraries (libdrm and libdrm\_intel)
-- Various LLVM components
-- The compiler backend itself (libgbe)
-- Mesa git master version built with gbm enabled to support extension cl\_khr\_gl\_sharing.
-
-CMake will check the dependencies and will complain if it does not find them.
-
-The cmake will also build the backend project. Please refer to:
+CMake will build the backend first. Please refer to:
 [[OpenCL Gen Backend|Beignet/Backend]] to get more dependencies.
 
 Once built, the run-time produces a shared object libcl.so which basically
 directly implements the OpenCL API. A set of tests are also produced. They may
 be found in `utests/`.
 
-Note that the compiler depends on LLVM (Low-Level Virtual Machine project).
-Right now, the code has been compiled with LLVM 3.1/3.2. It will not compile
-with any thing older.
-
-[http://llvm.org/releases/](http://llvm.org/releases/)
+Simply invoke:
+`> make install`
 
-LLVM 3.3 and 3.4 are supported.
+It installs the following files to the beignet/ directory relative to
+your library installation directory:
+- libcl.so
+- ocl_stdlib.h, ocl_stdlib.h.pch
+- beignet.bc
 
-Also note that the code was compiled on GCC 4.6 and GCC 4.7. Since the code uses
-really recent C++11 features, you may expect problems with older compilers. Last
-time I tried, the code breaks ICC 12 and Clang with internal compiler errors
-while compiling anonymous nested lambda functions.
+It also installs the OCL ICD vendor file to /etc/OpenCL/vendors if the system supports ICD:
+- intel-beignet.icd
 
 How to run
 ----------
@@ -85,44 +107,33 @@ will run all the unit tests one after the others
 
 will only run `some_unit_test0` and `some_unit_test1` tests
 
-How to install
---------------
-
-Simply invoke:
-`> make install`
-
-It installs libcl.so and the precompiled header/module files and the ocl_stdlib.h file
-into install_prefix/beignet/ direcotry. If the system support ICD, it also installs the
-intel-beignet.icd to /etc/OpenCL/vendors/.
-
-To make beignet support ICD, you need to have the following two packages installed:
-ocl-icd-dev, ocl-icd-libopencl1 (package name for the ubuntu.)
-before your build beignet.
-
 Supported Hardware
 ------------------
 
-The code was tested on IVB GT2 with ubuntu and fedora core distribution.
-Currently Only IVB is supported right now. Actually, the code was only run on IVB GT2. You
-may expect some issues with IVB GT1.
+The code was tested on IVB GT2 with the Ubuntu and Fedora Core distributions. The recommended
+kernel version is 3.11 or newer. Currently, only IVB is supported.
+The code has been run on both IVB GT2 and GT1, and both systems are well supported now.
 
 TODO
 ----
 
-The run-time is far from being complete. Most of the pieces have been put
-together to test and develop the OpenCL compiler. A partial list of things to
-do:
+In terms of the OpenCL 1.1 spec, beignet is quite complete now. We can pass almost
+all the piglit OpenCL test cases, and the pass rate for the OpenCV test suite
+is also good. Some work items remain, listed below; most of them
+are related to extension support and performance.
 
 - Complete cl\_khr\_gl\_sharing support. We lack of some APIs implementation such
   as clCreateFromGLBuffer,clCreateFromGLRenderbuffer,clGetGLObjectInfo... Currently,
-  the working APIs are clCreateFromGLTexture,clCreateFromGLTexture2D.
+  the working APIs are clCreateFromGLTexture and clCreateFromGLTexture2D. This work
+  depends heavily on mesa support. It seems that mesa will not provide this type
+  of extension, so we may have to hack the mesa source code to support it.
 
 - Check that NDRangeKernels can be pushed into _different_ queues from several
   threads.
 
 - No state tracking at all. One batch buffer is created at each "draw call"
   (i.e. for each NDRangeKernels). This is really inefficient since some
-  expensive pipe controls are issued for each batch buffer
+  expensive pipe controls are issued for each batch buffer.
 
 - Valgrind reports some leaks in libdrm. It sounds like a false positive but it
   has to be checked. Idem for LLVM. There is one leak here to check.
@@ -133,7 +144,10 @@ does not comply with the standard or it is just missing)
 
 Project repository
 ------------------
-Right now, we host our project on fdo at: git://anongit.freedesktop.org/beignet.
+Right now, we host our project on fdo at:
+[http://cgit.freedesktop.org/beignet/](http://cgit.freedesktop.org/beignet/).
+And on Intel's 01.org:
+[https://01.org/beignet](https://01.org/beignet)
 
 The team
 --------
diff --git a/docs/Beignet/Backend/TODO.mdwn b/docs/Beignet/Backend/TODO.mdwn
index adc7fd2..521b739 100644
--- a/docs/Beignet/Backend/TODO.mdwn
+++ b/docs/Beignet/Backend/TODO.mdwn
@@ -1,14 +1,15 @@
 TODO
 ====
 
-The compiler is far from complete. Even if the skeleton is now done and should
-be solid, There are a _lot_ of things to do from trivial to complex.
+The compiler is quite complete now in terms of functionality. It passes
+almost all of the piglit OCL test cases, and the pass rate for the OpenCV test
+suite is also quite good. But there are still plenty of things to do for the
+final performance tuning.
 
 OpenCL standard library
 -----------------------
 
-Today we define the OpenCL API in header file `src/ocl_stdlib.h`. This file is
-from being complete.
+Today we define the OpenCL API in header file `src/ocl_stdlib.h`.
 
 By the way, one question remains: do we want to implement
 the high-precision functions as _inline_ functions or as external functions to
@@ -19,23 +20,29 @@ do both actually.
 LLVM front-end
 --------------
 
-The code is defined in `src/llvm`.  We used the PTX ABI and the OpenCL profile
+The code is defined in `src/llvm`.  We use SPIR and the OpenCL profile
 to compile the code. Therefore, a good part of the job is already done. However,
 many things must be implemented:
 
-- Lowering down of various intrinsics like `llvm.memcpy`
-
 - Better resolving of the PHI functions. Today, we always generate MOV
   instructions at the end of each basic block . They can be easily optimized.
 
 - From LLVM 3.3, we use SPIR IR. We need to use the compiler defined type to
   represent sampler_t/image2d_t/image1d_t/....
 
+- Consider using libclc in our project to avoid using the PCH, which is not
+  compatible across different clang versions. We may also contribute what we have done in
+  ocl_stdlib.h to libclc if possible.
+
 Gen IR
 ------
 
 The code is defined in `src/ir`. Main things to do are:
 
+- Implement the llvm.memset/llvm.memcpy intrinsics more efficiently. Currently, we lower
+  them as a normal memcpy at the LLVM module level, without exploiting the fact that the
+  intrinsics all have a constant data length.
+
 - Finishing the handling of function arguments (see the [[IR
   description|gen_ir]] for more details)
 
@@ -54,6 +61,11 @@ The code is defined in `src/ir`. Main things to do are:
   This will obviously impact both instruction selection and the register
   allocation.
 
+- Implement a fast path for small local variables. When the kernel only defines
+  a small local array/variable, there is a good chance that we can allocate the local
+  array/variable in register space rather than system memory. This will eliminate a
+  lot of memory loads/stores to and from system memory.
+
 Backend
 -------
 
@@ -64,7 +76,14 @@ The code is defined in `src/backend`. Main things to do are:
 - Implementing proper instruction selection. A "simple" tree matching algorithm
   should provide good results for Gen
 
-- Improving the instruction scheduling pass
+- Improving the instruction scheduling pass. The current scheduling code has some bugs,
+  so it is currently disabled by default. We need to fix them in the future.
+
+- Some instructions are introduced in the last code generation stage. We need to
+  introduce a pass after that stage to eliminate dead instructions, duplicate MOVs, and
+  some instructions with zero operands.
+
+- Leverage the structured if/endif for branch processing?
 
 General plumbing
 ----------------
-- 
1.7.9.5
