[Mesa-dev] [PATCH 4/4] AMDGPU: New control flow for SI

Tom Stellard tom at stellard.net
Mon Dec 10 06:52:12 PST 2012


On Sun, Dec 09, 2012 at 06:02:59PM +0100, Christian König wrote:
> This patch replaces the control flow handling with a new
> pass which structurize the graph before transforming it to
> machine instruction. This has a couple of different advantages
> and currently fixes 20 piglit tests without a single regression.
> 
> It is now a general purpose transformation that could be not
> only be used for SI/R6xx, but also for other hardware
> implementations that use a form of structurized control flow.
> 

Hi Christian,

This looks very good, nice job!  There are just a few style issues with this
last patch, see comments below.

With those fixed, the whole series is:

Reviewed-by: Tom Stellard <thomas.stellard at amd.com>


>  } // End namespace llvm
> diff --git a/lib/Target/AMDGPU/AMDGPUStructurizeCFG.cpp b/lib/Target/AMDGPU/AMDGPUStructurizeCFG.cpp
> new file mode 100644
> index 0000000..6ddb5bf
> --- /dev/null
> +++ b/lib/Target/AMDGPU/AMDGPUStructurizeCFG.cpp
> @@ -0,0 +1,725 @@
> +//===-- AMDGPUStructurizeCFG.cpp -  ------------------===//
> +//
> +//                     The LLVM Compiler Infrastructure
> +//
> +// This file is distributed under the University of Illinois Open Source
> +// License. See LICENSE.TXT for details.
> +//
> +//===----------------------------------------------------------------------===//

We need to use doxygen style comments on all new files, so this summary comment needs
to be commented with three forward slashes instead of two.  Also, before the file
summary, we need to add:

/// \file


> +// The pass implemented in this file transforms the programs control flow graph
> +// into a form that's suitable for code generation on hardware that implements
> +// control flow by execution masking. This currently includes all AMD GPUs but
> +// may as well be useful for other types of hardware.
> +//===----------------------------------------------------------------------===//
> +
> +#include "AMDGPU.h"
> +#include "llvm/Module.h"
> +#include "llvm/ADT/SCCIterator.h"
> +#include "llvm/Analysis/RegionIterator.h"
> +#include "llvm/Analysis/RegionInfo.h"
> +#include "llvm/Analysis/RegionPass.h"
> +#include "llvm/Transforms/Utils/SSAUpdater.h"
> +
> +using namespace llvm;
> +
> +namespace {
> +
> +// Definition of the complex types used in this pass.
> +
> +typedef std::pair<BasicBlock *, Value *> BBValuePair;
> +typedef ArrayRef<BasicBlock*> BBVecRef;
> +
> +typedef SmallVector<RegionNode*, 8> RNVector;
> +typedef SmallVector<BasicBlock*, 8> BBVector;
> +typedef SmallVector<BBValuePair, 2> BBValueVector;
> +
> +typedef DenseMap<PHINode *, BBValueVector> PhiMap;
> +typedef DenseMap<BasicBlock *, PhiMap> BBPhiMap;
> +typedef DenseMap<BasicBlock *, Value *> BBPredicates;
> +typedef DenseMap<BasicBlock *, BBPredicates> PredMap;
> +typedef DenseMap<BasicBlock *, unsigned> VisitedMap;
> +
> +// The name for newly created blocks.
> +
> +static const char *FlowBlockName = "Flow";
> +
> +// Transforms the control flow graph on one single entry/exit region at a time.
> +//
> +// After the transform all "If"/"Then"/"Else" style control flow looks like
> +// this:
> +//
> +// 1
> +// ||
> +// | |
> +// 2 |
> +// | /
> +// |/   
> +// 3
> +// ||   Where:
> +// | |  1 = "If" block, calculates the condition
> +// 4 |  2 = "Then" subregion, runs if the condition is true
> +// | /  3 = "Flow" blocks, newly inserted flow blocks, rejoins the control flow
> +// |/   4 = "Else" optional subregion, runs if the condition is false
> +// 5    5 = "End" block, also rejoins the control flow
> +//
> +// Control flow is expressed as a branch where the true exit goes into the
> +// "Then"/"Else" region, while the false exit skips the region
> +// The condition for the optional "Else" region is expressed as a PHI node.
> +// The incomming values of the PHI node are true for the "If" edge and false
> +// for the "Then" edge.
> +//
> +// Additionally to that even complicated loops look like this:
> +//
> +// 1
> +// ||
> +// | |
> +// 2 ^  Where:
> +// | /  1 = "Entry" block
> +// |/   2 = "Loop" optional subregion, with all exits at "Flow" block
> +// 3    3 = "Flow" block, with back edge to entry block
> +// |
> +//
> +// The back edge of the "Flow" block is always on the false side of the branch
> +// while the true side continues the general flow. So the loop condition
> +// consist of a network of PHI nodes where the true incoming values expresses
> +// breaks and the false values expresses continue states.

We should use make this a doxygen comment.

> +class AMDGPUStructurizeCFG : public RegionPass {
> +
> +  static char ID;

================================================================================

> +char AMDGPUStructurizeCFG::ID = 0;
> +
> +// Initialize the types and constants used in the pass

All the functions in this file with comments should be converted to use
doxygen style comments.  Also, for simple one line descriptions like this
it is recommended to use the \brief tag.

> +bool AMDGPUStructurizeCFG::doInitialization(Region *R, RGPassManager &RGM) {
> +
> +  LLVMContext &Context = R->getEntry()->getContext();

================================================================================

> +// Collect various loop and predicate infos
> +void AMDGPUStructurizeCFG::collectInfos()
> +{

Brace needs to go on same line as function.

> +  unsigned Number = 0, LoopIdx = ~0;
> +
> +  // Reset predicate

================================================================================

> +// Create the pass
> +Pass *llvm::createAMDGPUStructurizeCFGPass()
> +{

Brace needs to go on same line as function.

> +  return new AMDGPUStructurizeCFG();
> +}

================================================================================

> diff --git a/lib/Target/AMDGPU/SIAnnotateControlFlow.cpp b/lib/Target/AMDGPU/SIAnnotateControlFlow.cpp
> new file mode 100644
> index 0000000..d1cc0b8
> --- /dev/null
> +++ b/lib/Target/AMDGPU/SIAnnotateControlFlow.cpp
> @@ -0,0 +1,333 @@
> +//===-- SIAnnotateControlFlow.cpp -  ------------------===//
> +//
> +//                     The LLVM Compiler Infrastructure
> +//
> +// This file is distributed under the University of Illinois Open Source
> +// License. See LICENSE.TXT for details.
> +//
> +//===----------------------------------------------------------------------===//
> +// Annotates the control flow with hardware specific intrinsics.

Same as AMDGPUStructurizeCFG.cpp, use doxygen comments here and on all functions
with comments.

> +//===----------------------------------------------------------------------===//
> +
> +#include "AMDGPU.h"

================================================================================

> +// Create the annotation pass
> +FunctionPass *llvm::createSIAnnotateControlFlowPass()
> +{

Brace needs to go on same line as function.

> +  return new SIAnnotateControlFlow();
> +}

================================================================================

> diff --git a/lib/Target/AMDGPU/SIISelLowering.cpp b/lib/Target/AMDGPU/SIISelLowering.cpp
> index 292ce85..599c526 100644
> --- a/lib/Target/AMDGPU/SIISelLowering.cpp
> +++ b/lib/Target/AMDGPU/SIISelLowering.cpp
> +
> +// This transforms the control flow intrinsics to get the branch destination as
> +// last parameter, also switches branch target with BR if the need arise

Need doxygen comments.

> +SDValue SITargetLowering::LowerBRCOND(SDValue BRCOND,
> +                                      SelectionDAG &DAG) const {
> +

================================================================================

> diff --git a/lib/Target/AMDGPU/SILowerControlFlow.cpp b/lib/Target/AMDGPU/SILowerControlFlow.cpp
> index 277b647..594e6f7 100644
> --- a/lib/Target/AMDGPU/SILowerControlFlow.cpp
> +++ b/lib/Target/AMDGPU/SILowerControlFlow.cpp
> +void SILowerControlFlowPass::Else(MachineInstr &MI) {
> +
> +  MachineBasicBlock &MBB = *MI.getParent();
> +  DebugLoc DL = MI.getDebugLoc();
> +  unsigned Dst = MI.getOperand(0).getReg();
> +  unsigned Src = MI.getOperand(1).getReg();
> +
> +  BuildMI(MBB, &MI, DL, TII->get(AMDGPU::S_OR_SAVEEXEC_B64), Dst)
> +          .addReg(Src); // Saved EXEC
> +
> +  BuildMI(MBB, &MI, DL, TII->get(AMDGPU::S_XOR_B64), AMDGPU::EXEC)
> +          .addReg(AMDGPU::EXEC)
> +          .addReg(Dst);
> +
> +  //BuildMI(MBB, I, MBB.findDebugLoc(I), TII->get(AMDGPU::S_CBRANCH_EXECZ))
> +  //        .addOperand(MI.getOperand(2));
> +

This commented code needs to be removed or have a comment explaining why it's
commented out.

> +  MI.eraseFromParent();
> +}
> +

================================================================================


More information about the mesa-dev mailing list