[Beignet] [PATCH] GBE: Place loop exits after loop blocks when sorting basic blocks.

Zhigang Gong zhigang.gong at linux.intel.com
Tue Nov 25 20:44:53 PST 2014


Right, A normal instruction should have no more than 2 successors.
A ">=2" is a little bit confusing. I will simply remove it and
add an assertion after get the child number to make sure the
childNo is less or equal to 2 if no objection.

On Wed, Nov 26, 2014 at 02:57:01AM +0000, Yang, Rong R wrote:
> Implement the post-order liked traversal, LGTM.
> Because you suppose always less equal than 2 successors, I think it's better to add a childNo assert.
> 
> > -----Original Message-----
> > From: Beignet [mailto:beignet-bounces at lists.freedesktop.org] On Behalf Of
> > Ruiling Song
> > Sent: Tuesday, November 25, 2014 13:58
> > To: beignet at lists.freedesktop.org
> > Cc: Song, Ruiling
> > Subject: [Beignet] [PATCH] GBE: Place loop exits after loop blocks when
> > sorting basic blocks.
> > 
> > This again is to solve register liveness issue. Details see comment inline.
> > 
> > This could fix opencv failure under strict conformance mode:
> > ./opencv_test_core --
> > gtest_filter=OCL_Arithm/PolarToCart.angleInRadians/0
> > 
> > Signed-off-by: Ruiling Song <ruiling.song at intel.com>
> > ---
> >  backend/src/llvm/llvm_gen_backend.cpp |   92
> > +++++++++++++++++++++++++++++----
> >  1 file changed, 82 insertions(+), 10 deletions(-)
> > 
> > diff --git a/backend/src/llvm/llvm_gen_backend.cpp
> > b/backend/src/llvm/llvm_gen_backend.cpp
> > index 558491f..939dd63 100644
> > --- a/backend/src/llvm/llvm_gen_backend.cpp
> > +++ b/backend/src/llvm/llvm_gen_backend.cpp
> > @@ -1941,6 +1941,25 @@ namespace gbe
> >        fn.addLoop(loopBBs, loopExits);
> >      }
> >    }
> > +
> > +
> > +  static unsigned getChildNo(BasicBlock *bb) {
> > +    TerminatorInst *term = bb->getTerminator();
> > +    return term->getNumSuccessors();
> > +  }
> > +
> > +  // return NULL if index out-range of children number  static
> > + BasicBlock *getChildPossible(BasicBlock *bb, unsigned index) {
> > +
> > +    TerminatorInst *term = bb->getTerminator();
> > +    unsigned childNo = term->getNumSuccessors();
> > +    BasicBlock *child = NULL;
> > +    if(index < childNo) {
> > +      child = term->getSuccessor(index);
> > +    }
> > +    return child;
> > +  }
> > +
> >  /*!
> > 
> >    Sorting Basic blocks is mainly used to solve register liveness issue, take a
> > @@ -1958,7 +1977,7 @@ namespace gbe
> >    |
> >     -->7
> > 
> > -  A register %10 defined in bb4, and used in bb5 & bb6. In normal liveness
> > +  1.) A register %10 defined in bb4, and used in bb5 & bb6. In normal
> > + liveness
> >    analysis, %10 is not alive in bb3. But under simd execution model, after
> >    executing bb4, some channel jump through bb5 to bb3, other channel may
> > jump
> >    to bb6, we must execute bb3 first, then bb6, to avoid missing instructions.
> > @@ -1969,25 +1988,78 @@ namespace gbe
> >    we can see the bb3 will be placed after bb5 & bb6. The liveness calculation
> >    is just as normal and will be correct.
> > 
> > -  Another advantage of sorting basic blocks is reducing register pressure.
> > +  2.) Another advantage of sorting basic blocks is reducing register pressure.
> >    In the above CFG, a register defined in bb3 and used in bb7 will be
> >    alive through 3,4,5,6,7. But in fact it should be only alive in bb3 and bb7.
> >    After topological sorting, this kind of register would be only alive in bb3
> >    and bb7. Register pressure in 4,5,6 is reduced.
> > -*/
> > 
> > +  3.) Classical post-order traversal will automatically choose a order
> > + for the  successors of a basic block, But this order may be hard to
> > + handle, take a look  at below CFG:
> > +
> > +       1 <-----
> > +      /        |
> > +      2 --> 4 -
> > +      |
> > +      3
> > +      |
> > +      5
> > +  In the post oder traversal, it may be: 5->4->3->2->1, as 4, 3 does
> > +not have
> > +  strict order. This is a serious issue, a value defined in bb3, used
> > +in bb5
> > +  may be overwritten in bb1. Remember the simd execution model? some
> > +lanes
> > +  may execute bb4 after other lanes finish bb3, and then jump to bb1,
> > +but live
> > +  range of the register does not cover bb1. what we done here is for a
> > +loop
> > +  exit (here bb3), we alwasy make sure it is visited first in the
> > +post-order
> > +  traversal, for the graph, that means 5->3->4->2->1. Then a definition
> > +in bb3,
> > +  and used in 5 will not interfere with any other values defined in the loop.
> > +*/
> >    void GenWriter::sortBasicBlock(Function &F) {
> > -    typedef ReversePostOrderTraversal<Function*> RPOTType;
> > -    RPOTType rpot(&F);
> > -    Function::BasicBlockListType &bbList = F.getBasicBlockList();
> > +    BasicBlock &entry = F.getEntryBlock();
> > +    std::vector<BasicBlock *> visitStack;
> > +    std::vector<BasicBlock *> sorted;
> > +    std::set<BasicBlock *> visited;
> > +
> > +    visitStack.push_back(&entry);
> > +    visited.insert(&entry);
> > +
> > +    while (!visitStack.empty()) {
> > +      BasicBlock *top = visitStack.back();
> > +      unsigned childNo = getChildNo(top);
> > +
> > +      BasicBlock *child0 = getChildPossible(top, 0);
> > +      BasicBlock *child1 = getChildPossible(top, 1);
> > +      if(childNo >= 2) {
> > +        Loop *loop = LI->getLoopFor(top);
> > +        // visit loop exit node first, so loop exit block will be placed
> > +        // after blocks in loop in 'reverse post-order' list.
> > +        if (loop && loop->contains(child0) && !loop->contains(child1)) {
> > +          BasicBlock *tmp = child0; child0 = child1; child1 = tmp;
> > +        }
> > +      }
> > +
> > +      if (child0 != NULL && visited.find(child0) == visited.end()) {
> > +        visitStack.push_back(child0);
> > +        visited.insert(child0);
> > +      } else if (child1 != NULL && visited.find(child1) == visited.end()) {
> > +        visitStack.push_back(child1);
> > +        visited.insert(child1);
> > +      } else {
> > +        sorted.push_back(visitStack.back());
> > +        visitStack.pop_back();
> > +      }
> > +    }
> > 
> > -    for (RPOTType::rpo_iterator bbI = rpot.begin(), bbE = rpot.end(); bbI !=
> > bbE; ++bbI) {
> > -      (*bbI)->removeFromParent();
> > +    Function::BasicBlockListType &bbList = F.getBasicBlockList();
> > +    for (std::vector<BasicBlock *>::iterator iter = sorted.begin(); iter !=
> > sorted.end(); ++iter) {
> > +      (*iter)->removeFromParent();
> >      }
> > -    for (RPOTType::rpo_iterator bbI = rpot.begin(), bbE = rpot.end(); bbI !=
> > bbE; ++bbI) {
> > -      bbList.push_back(*bbI);
> > +
> > +    for (std::vector<BasicBlock *>::reverse_iterator iter = sorted.rbegin();
> > iter != sorted.rend(); ++iter) {
> > +      bbList.push_back(*iter);
> >      }
> >    }
> > +
> >    void GenWriter::emitFunction(Function &F)
> >    {
> >      switch (F.getCallingConv()) {
> > --
> > 1.7.10.4
> > 
> > _______________________________________________
> > Beignet mailing list
> > Beignet at lists.freedesktop.org
> > http://lists.freedesktop.org/mailman/listinfo/beignet
> _______________________________________________
> Beignet mailing list
> Beignet at lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/beignet


More information about the Beignet mailing list