[Beignet] [PATCH] GBE: Place loop exits after loop blocks when sorting basic blocks.

Yang, Rong R rong.r.yang at intel.com
Tue Nov 25 18:57:01 PST 2014


Implement the post-order liked traversal, LGTM.
Because you suppose always less equal than 2 successors, I think it's better to add a childNo assert.

> -----Original Message-----
> From: Beignet [mailto:beignet-bounces at lists.freedesktop.org] On Behalf Of
> Ruiling Song
> Sent: Tuesday, November 25, 2014 13:58
> To: beignet at lists.freedesktop.org
> Cc: Song, Ruiling
> Subject: [Beignet] [PATCH] GBE: Place loop exits after loop blocks when
> sorting basic blocks.
> 
> This again is to solve register liveness issue. Details see comment inline.
> 
> This could fix opencv failure under strict conformance mode:
> ./opencv_test_core --
> gtest_filter=OCL_Arithm/PolarToCart.angleInRadians/0
> 
> Signed-off-by: Ruiling Song <ruiling.song at intel.com>
> ---
>  backend/src/llvm/llvm_gen_backend.cpp |   92
> +++++++++++++++++++++++++++++----
>  1 file changed, 82 insertions(+), 10 deletions(-)
> 
> diff --git a/backend/src/llvm/llvm_gen_backend.cpp
> b/backend/src/llvm/llvm_gen_backend.cpp
> index 558491f..939dd63 100644
> --- a/backend/src/llvm/llvm_gen_backend.cpp
> +++ b/backend/src/llvm/llvm_gen_backend.cpp
> @@ -1941,6 +1941,25 @@ namespace gbe
>        fn.addLoop(loopBBs, loopExits);
>      }
>    }
> +
> +
> +  static unsigned getChildNo(BasicBlock *bb) {
> +    TerminatorInst *term = bb->getTerminator();
> +    return term->getNumSuccessors();
> +  }
> +
> +  // return NULL if index out-range of children number  static
> + BasicBlock *getChildPossible(BasicBlock *bb, unsigned index) {
> +
> +    TerminatorInst *term = bb->getTerminator();
> +    unsigned childNo = term->getNumSuccessors();
> +    BasicBlock *child = NULL;
> +    if(index < childNo) {
> +      child = term->getSuccessor(index);
> +    }
> +    return child;
> +  }
> +
>  /*!
> 
>    Sorting Basic blocks is mainly used to solve register liveness issue, take a
> @@ -1958,7 +1977,7 @@ namespace gbe
>    |
>     -->7
> 
> -  A register %10 defined in bb4, and used in bb5 & bb6. In normal liveness
> +  1.) A register %10 defined in bb4, and used in bb5 & bb6. In normal
> + liveness
>    analysis, %10 is not alive in bb3. But under simd execution model, after
>    executing bb4, some channel jump through bb5 to bb3, other channel may
> jump
>    to bb6, we must execute bb3 first, then bb6, to avoid missing instructions.
> @@ -1969,25 +1988,78 @@ namespace gbe
>    we can see the bb3 will be placed after bb5 & bb6. The liveness calculation
>    is just as normal and will be correct.
> 
> -  Another advantage of sorting basic blocks is reducing register pressure.
> +  2.) Another advantage of sorting basic blocks is reducing register pressure.
>    In the above CFG, a register defined in bb3 and used in bb7 will be
>    alive through 3,4,5,6,7. But in fact it should be only alive in bb3 and bb7.
>    After topological sorting, this kind of register would be only alive in bb3
>    and bb7. Register pressure in 4,5,6 is reduced.
> -*/
> 
> +  3.) Classical post-order traversal will automatically choose a order
> + for the  successors of a basic block, But this order may be hard to
> + handle, take a look  at below CFG:
> +
> +       1 <-----
> +      /        |
> +      2 --> 4 -
> +      |
> +      3
> +      |
> +      5
> +  In the post oder traversal, it may be: 5->4->3->2->1, as 4, 3 does
> +not have
> +  strict order. This is a serious issue, a value defined in bb3, used
> +in bb5
> +  may be overwritten in bb1. Remember the simd execution model? some
> +lanes
> +  may execute bb4 after other lanes finish bb3, and then jump to bb1,
> +but live
> +  range of the register does not cover bb1. what we done here is for a
> +loop
> +  exit (here bb3), we alwasy make sure it is visited first in the
> +post-order
> +  traversal, for the graph, that means 5->3->4->2->1. Then a definition
> +in bb3,
> +  and used in 5 will not interfere with any other values defined in the loop.
> +*/
>    void GenWriter::sortBasicBlock(Function &F) {
> -    typedef ReversePostOrderTraversal<Function*> RPOTType;
> -    RPOTType rpot(&F);
> -    Function::BasicBlockListType &bbList = F.getBasicBlockList();
> +    BasicBlock &entry = F.getEntryBlock();
> +    std::vector<BasicBlock *> visitStack;
> +    std::vector<BasicBlock *> sorted;
> +    std::set<BasicBlock *> visited;
> +
> +    visitStack.push_back(&entry);
> +    visited.insert(&entry);
> +
> +    while (!visitStack.empty()) {
> +      BasicBlock *top = visitStack.back();
> +      unsigned childNo = getChildNo(top);
> +
> +      BasicBlock *child0 = getChildPossible(top, 0);
> +      BasicBlock *child1 = getChildPossible(top, 1);
> +      if(childNo >= 2) {
> +        Loop *loop = LI->getLoopFor(top);
> +        // visit loop exit node first, so loop exit block will be placed
> +        // after blocks in loop in 'reverse post-order' list.
> +        if (loop && loop->contains(child0) && !loop->contains(child1)) {
> +          BasicBlock *tmp = child0; child0 = child1; child1 = tmp;
> +        }
> +      }
> +
> +      if (child0 != NULL && visited.find(child0) == visited.end()) {
> +        visitStack.push_back(child0);
> +        visited.insert(child0);
> +      } else if (child1 != NULL && visited.find(child1) == visited.end()) {
> +        visitStack.push_back(child1);
> +        visited.insert(child1);
> +      } else {
> +        sorted.push_back(visitStack.back());
> +        visitStack.pop_back();
> +      }
> +    }
> 
> -    for (RPOTType::rpo_iterator bbI = rpot.begin(), bbE = rpot.end(); bbI !=
> bbE; ++bbI) {
> -      (*bbI)->removeFromParent();
> +    Function::BasicBlockListType &bbList = F.getBasicBlockList();
> +    for (std::vector<BasicBlock *>::iterator iter = sorted.begin(); iter !=
> sorted.end(); ++iter) {
> +      (*iter)->removeFromParent();
>      }
> -    for (RPOTType::rpo_iterator bbI = rpot.begin(), bbE = rpot.end(); bbI !=
> bbE; ++bbI) {
> -      bbList.push_back(*bbI);
> +
> +    for (std::vector<BasicBlock *>::reverse_iterator iter = sorted.rbegin();
> iter != sorted.rend(); ++iter) {
> +      bbList.push_back(*iter);
>      }
>    }
> +
>    void GenWriter::emitFunction(Function &F)
>    {
>      switch (F.getCallingConv()) {
> --
> 1.7.10.4
> 
> _______________________________________________
> Beignet mailing list
> Beignet at lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/beignet


More information about the Beignet mailing list