[Beignet] [PATCH] GBE: Do topological sorting of basicblocks.
Ruiling Song
ruiling.song at intel.com
Sun Nov 9 19:31:34 PST 2014
Toplogical sorting have two big advantages:
1. Sorted basicblocks will reduce unneccesary register pressure.
2. Sorted basicblocks will make liveness analysis easier.
This patch fix opencv failures:
./opencv_test_video --gtest_filter=OCL_OCL_Video/Mog2_Update.Accuracy/1
./opencv_test_imgproc --gtest_filter=OCL_ImageProc/Filter2D.Mat/482
Signed-off-by: Ruiling Song <ruiling.song at intel.com>
---
backend/src/llvm/llvm_gen_backend.cpp | 53 ++++++++++++++++++++++++++++++++-
1 file changed, 52 insertions(+), 1 deletion(-)
diff --git a/backend/src/llvm/llvm_gen_backend.cpp b/backend/src/llvm/llvm_gen_backend.cpp
index b1b9382..cb0511a 100644
--- a/backend/src/llvm/llvm_gen_backend.cpp
+++ b/backend/src/llvm/llvm_gen_backend.cpp
@@ -112,6 +112,7 @@
#include "llvm/Target/Mangler.h"
#endif
+#include "llvm/ADT/PostOrderIterator.h"
#include "llvm/Transforms/Scalar.h"
#include "llvm/MC/MCAsmInfo.h"
#include "llvm/MC/MCContext.h"
@@ -542,6 +543,8 @@ namespace gbe
void allocateGlobalVariableRegister(Function &F);
/*! gather all the loops in the function and add them to ir::Function */
void gatherLoopInfo(ir::Function &fn);
+ /*! do topological sorting of basicblocks */
+ void sortBasicBlock(Function &F);
/*! Emit the complete function code and declaration */
void emitFunction(Function &F);
/*! Handle input and output function parameters */
@@ -1893,7 +1896,53 @@ namespace gbe
fn.addLoop(loopBBs, loopExits);
}
}
-
+/*!
+
+ Sorting Basic blocks is mainly used to solve register liveness issue, take a
+ look at below CFG:
+
+ -<--1--
+ | |
+ | ->2
+ -- 3 <--- |
+ | ^ | -->4--
+ | | | | |
+ | | -----5<-- |
+ | | |
+ | ----------6<-----
+ |
+ -->7
+
+ A register %10 defined in bb4, and used in bb5 & bb6. In normal liveness
+ analysis, %10 is not alive in bb3. But under simd execution model, after
+ executing bb4, some channel jump through bb5 to bb3, other channel may jump
+ to bb6, we must execute bb3 first, then bb6, to avoid missing instructions.
+ The physical register of %10 was assigned some value in bb4, but when
+ executing bb3, its content may be over-written as it is dead in bb3. When
+ jumping back to execute bb6, it will get polluted data. What a disaster!
+ What we do here is do a topological sorting of basic blocks, For this case
+ we can see the bb3 will be placed after bb5 & bb6. The liveness calculation
+ is just as normal and will be correct.
+
+ Another advantage of sorting basic blocks is reducing register pressure.
+ In the above CFG, a register defined in bb3 and used in bb7 will be
+ alive through 3,4,5,6,7. But in fact it should be only alive in bb3 and bb7.
+ After topological sorting, this kind of register would be only alive in bb3
+ and bb7. Register pressure in 4,5,6 is reduced.
+*/
+
+ void GenWriter::sortBasicBlock(Function &F) {
+ typedef ReversePostOrderTraversal<Function*> RPOTType;
+ RPOTType rpot(&F);
+ Function::BasicBlockListType &bbList = F.getBasicBlockList();
+
+ for (RPOTType::rpo_iterator bbI = rpot.begin(), bbE = rpot.end(); bbI != bbE; ++bbI) {
+ (*bbI)->removeFromParent();
+ }
+ for (RPOTType::rpo_iterator bbI = rpot.begin(), bbE = rpot.end(); bbI != bbE; ++bbI) {
+ bbList.push_back(*bbI);
+ }
+ }
void GenWriter::emitFunction(Function &F)
{
switch (F.getCallingConv()) {
@@ -1915,6 +1964,8 @@ namespace gbe
this->emitFunctionPrototype(F);
this->allocateGlobalVariableRegister(F);
+
+ sortBasicBlock(F);
// Visit all the instructions and emit the IR registers or the value to
// value mapping when a new register is not needed
pass = PASS_EMIT_REGISTERS;
--
1.7.10.4
More information about the Beignet
mailing list