[Beignet] [PATCH V2 3/3] add local copy propagation optimization for each basic block

Guo, Yejun yejun.guo at intel.com
Thu Sep 24 04:02:53 PDT 2015



> > > > +
> > > > +  void SelBasicBlockOptimizer::changeInsideReplaceInfos(const
> > > > + SelectionInstruction& insn, GenRegister& var)  {
> > > > +    for (ReplaceInfo* info : replaceInfos) {
> > > > +      if (info->intermedia.reg() == var.reg()) {
> > A new comment here is that the above loop is a little bit too heavy.
> > For a large BB which has many MOV instructions, it will iterate too many times for instructions after those MOV instructions. A better way is to change to use a new map type of replaceInfos:
> > map <ir::Register, set<ReplaceInfo*>>
> > 
> > This will be much faster than iterate all infos for each instruction.
> > 
> > [Yejun] nice, I'll change to map.
> > 


Look into the code again, and found the following check is interesting and not simple at all :)

+        if (insn.opcode != SEL_OP_BSWAP && !insn.isWrite()) {

1. Why BSWAP is special here? The root cause may be that the src of bswap will be modified.
   But should this be handled in the BSWAP instruction already (I mean, both src and
   dst of BSWAP should be put into  src array and dst array ).
   Unfortunately, BSWAP seems have a bug here and may lead to incorrect scheduling latter.
   That should be fixed in BSWAP, then BSWAP check should be removed from here.

[yejun] agree. Actually, there are two methods here in general, the first is looking into detail and fix the prerequisite  issue first, the second is to skip the prerequisite issue (bswap/iswrite) and finish the main purpose (local copy propagation base) first, the issue will be reviewed again when we do optimization for a specific kernel. I choose the second one as my direction for this patch set.

2. Why !insn.isWrite() is here?
   This is more difficult than the first one. My understanding is that a register in selection
   vector should not be replaced. Because, that will bring extra complexity to manipulate the
   selection vectors. But the isWrite() checking doesn't cover all the cases. Actually,
   nearly all read instructions have one selection source vector, although many of them have only
   1 element. For example, the SAMPLE instruction has a source selection vector.

[yejun] I agree !insn.isWrite() is not enough. Actually, I make a very loose condition here according to the result of utest. I'm expecting there might be more conditions, there might be regression issues with the current condition. I'm unable to find all the conditions now, so want the regression issue to educate me.

[Yejun] Btw, is there a way to check if it is 'a register in selection vector', instead of the opcode check? Thanks.



> > [Yejun] Let's use the following SEL IR as an example, %42 is the same register but with two different stride in the two IRs. 
> > MOV(1)              %42<2>:D	:	%41<0,1,0>:D
> > MOV(16)           %43<1>:D   	:     	%42<0,1,0>:D
> This doesn't address my comment. As the comment suggests to compare the GenRegisters' attribute not the virtual register. You can easily found the above two GenRegisters are different.
> 
> [Yejun] My expectation is that %42 should be considered as the same, and to optimize the IR as "mov(16) %43<1>:D, %41<0,1,0>:D". The optimization chance is lost if we compare the width/stride of %42 and fount they are not the same.

The clear logic here is that the assignment at intermedia instruction should cover the current replacement's usage. So the clear way to check that is to implement the following function:

bool IsSrcCoveredByIntermedia(GenRegister src, int curExecWidth, GenRegister intermedia, int intermediaExecWidth) { }

Before we do the above check, another thing we need to take into consideration which is the predication and mask of the current instruction and the intermedia instruction.

intermedia                src                       replacable
1)noMask = 1               all                        yes
2)GEN_PREDICATE_NORMAL     GEN_PREDICATE_NORMAL       yes
3)GEN_PREDICATE_NORMAL     !GEN_PREDICATE_NORMAL      yes
4)!GEN_PREDICATE_NORMAL    all                        no (could check pred flag further, but not worth to do)
[Yejun] thanks, these info are what I expected regression issues will educate me. Btw, is there a typo for item 2) and 3), looks that they can be merged into one case, while you are listing them as two cases.


So we may change the above function to:
bool IsSrcCoveredByIntermedia(GenRegister src, SelectionInstruction *curInsn, GenRegister intermedia, SelectionInstruction *intermediaInsn) { }

Do all the check in this function and give a result whether it's replacable.

[Yejun] agree to encapsulate all the logic into a new function such as CanBeReplaced(). I do not use 'cover' here, for the case that src is part of intermedia, extra logic is needed for the replacement. It can be added as an extended feature in the real optimization work, if necessary.

BTW, as a reviewer, I really love more clear comment which could reduce tons of time to get to the point :).

[Yejun] and thanks for your time reviewing and discussing, it really helps the code improvement.

Thanks,
Zhigang Gong.

> 
> > 
> > 
> > Thanks,
> > Zhigang Gong.
> > 
> > > 
> > > 
> > > 
> > > 
> > > The other parts LGTM,
> > > 
> > > Thanks,
> > > Zhigang Gong


More information about the Beignet mailing list