[Beignet] [PATCH V2 3/3] add local copy propagation optimization for each basic block

Zhigang Gong zhigang.gong at linux.intel.com
Wed Sep 23 22:12:17 PDT 2015


On Thu, Sep 24, 2015 at 06:05:31AM +0000, Guo, Yejun wrote:
> 
> 
> -----Original Message-----
> From: Zhigang Gong [mailto:zhigang.gong at linux.intel.com] 
> Sent: Thursday, September 24, 2015 12:32 PM
> To: Guo, Yejun
> Cc: beignet at lists.freedesktop.org
> Subject: Re: [Beignet] [PATCH V2 3/3] add local copy propagation optimization for each basic block
> 
> On Thu, Sep 24, 2015 at 02:58:22AM +0000, Guo, Yejun wrote:
> > 
> > > +
> > > +  void SelBasicBlockOptimizer::changeInsideReplaceInfos(const
> > > + SelectionInstruction& insn, GenRegister& var)  {
> > > +    for (ReplaceInfo* info : replaceInfos) {
> > > +      if (info->intermedia.reg() == var.reg()) {
> A new comment here is that the above loop is a little bit too heavy.
> For a large BB which has many MOV instructions, it will iterate too many times for instructions after those MOV instructions. A better way is to change to use a new map type of replaceInfos:
> map <ir::Register, set<ReplaceInfo*>>
> 
> This will be much faster than iterate all infos for each instruction.
> 
> [Yejun] nice, I'll change to map.
> 
> > > +        bool replaceable = false;
> > It's better to add a comment here to describe the cases which can't be replaced and why.
> > 
> > [yejun] actually, I think the code itself explains something, it is much complex to explain every detail in human words. I consider it as a nice-to-have since the basic logic is simple.
> I still think it is not that simple. A case just came into my mind which we can't do replacement is:
> 
> MOV %r0, %r1
> ADD %r1, %r1, 1
> ...
> ADD %r10, %r0, 1
> ...
> 
> I'm afraid that your current code can't deal it correctly, right?
> 
> [yejun] current code does nothing for these instructions. It also relatives to constant, maybe add another optimization path to keep every path clear and simple. Or we can consider current code as a basic, and to extend it when find such optimization is necessary.
If my understanding is correct, current code will replace %r0 to %r1 in
the second ADD instruction which breaks the code. This is not an optimize
opportunity, but a bug and must be fixed. The root cause is %r1 has been
modified after copy its value to another register, then we should not
propagate it to the destination register after that modification. For example,
for all instruction after
ADD %r1, %r1, 1,
We could not do the 's/r0/r1'.

> 
> > 
> > > +            uint32_t elements = CalculateElements(var, insn.state.execWidth);
> > > +            if (info->elements == elements) {
> > Why not compare each attribute value, such as vstride_size,hstride_size,.....?
> > 
> > [Yejun] the execWidth is not recorded in the GenRegister, and I once saw instructions such as:
> > mov(1)  dst (vstride_size or hstride_size is 1/2/... or something 
> > else), src
> > mov(16) anotherdst, dst (with stride as 0)
> > 
> > the dst here is the same, but the strides are different, to handle this case, I add the function CalculateElements.
> The example you gave here has two different GenRegisters as GenRegister has vstride and hstride members. Right? My previous comment suggests to compare two GenRegistes' attributs and I can't understand your point here.
> 
> [Yejun] Let's use the following SEL IR as an example, %42 is the same register but with two different stride in the two IRs. 
> MOV(1)              %42<2>:D	:	%41<0,1,0>:D
> MOV(16)           %43<1>:D   	:     	%42<0,1,0>:D
This doesn't address my comment. As the comment suggests to compare the GenRegisters' attribute
not the virtual register. You can easily found the above two GenRegisters are different.

> 
> 
> Thanks,
> Zhigang Gong.
> 
> > 
> > 
> > 
> > 
> > The other parts LGTM,
> > 
> > Thanks,
> > Zhigang Gong


More information about the Beignet mailing list