[Beignet] [PATCH] fix a long relative regreesion issue on BSW caused by local copy propagation

Yang, Rong R rong.r.yang at intel.com
Wed Oct 21 01:59:40 PDT 2015


Yes, if one instruction span several registers, the other registers' visit pattern is same as first register if the vstride is normal(width * hstride).
And I notify you don't take care of flag register, does it safe to ignore the flag register?
Anyway, I will push it first.

> -----Original Message-----
> From: Guo, Yejun
> Sent: Monday, October 19, 2015 20:19
> To: Guo, Yejun; Yang, Rong R; beignet at lists.freedesktop.org
> Subject: RE: [Beignet] [PATCH] fix a long relative regreesion issue on BSW
> caused by local copy propagation
> 
> I have another idea,  since we allocate the registers in a regular manner, I
> guess the 32bit element can represent the whole register, it just repeats
> regularly. Is there any case not meet my assumption? If no, we can keep the
> current code but add the comments.
> 
> -----Original Message-----
> From: Beignet [mailto:beignet-bounces at lists.freedesktop.org] On Behalf Of
> Guo, Yejun
> Sent: Monday, October 19, 2015 4:48 PM
> To: Yang, Rong R; beignet at lists.freedesktop.org
> Subject: Re: [Beignet] [PATCH] fix a long relative regreesion issue on BSW
> caused by local copy propagation
> 
> Yes, simd16 + hstrid 4 is not covered, and just get a worse case for vstrid 128,
> 64bit element is not enough too.  Will we allocate such register? If yes, I need
> to wrap it in a structure instead of a uint32/uint64.
> 
> -----Original Message-----
> From: Yang, Rong R
> Sent: Monday, October 19, 2015 4:38 PM
> To: Guo, Yejun; beignet at lists.freedesktop.org
> Cc: Guo, Yejun
> Subject: RE: [Beignet] [PATCH] fix a long relative regreesion issue on BSW
> caused by local copy propagation
> 
> One comment about CalculateElements.
> 
> > -----Original Message-----
> > From: Beignet [mailto:beignet-bounces at lists.freedesktop.org] On Behalf
> > Of Guo Yejun
> > Sent: Thursday, October 15, 2015 9:48
> > To: beignet at lists.freedesktop.org
> > Cc: Guo, Yejun
> > Subject: [Beignet] [PATCH] fix a long relative regreesion issue on BSW
> > caused by local copy propagation
> >
> > on bsw, for long data type, src and dst hstride must be aligned to the
> > same qword, so disable the copy propagation for following IRs.
> >     MOV(8)              %62<2>:UD	:	%34<8,8,1>:UD
> >     MOV(8)              %42<1>:L	:	%62<8,4,2>:UD
> >
> > and also fix a typo issue in function CalculateElements
> >
> > Signed-off-by: Guo Yejun <yejun.guo at intel.com>
> > ---
> >  backend/src/backend/gen_insn_selection.cpp          | 2 +-
> >  backend/src/backend/gen_insn_selection.hpp          | 2 ++
> >  backend/src/backend/gen_insn_selection_optimize.cpp | 8 +++++++-
> >  3 files changed, 10 insertions(+), 2 deletions(-)
> >
> > diff --git a/backend/src/backend/gen_insn_selection.cpp
> > b/backend/src/backend/gen_insn_selection.cpp
> > index 9714e3e..5125eb3 100644
> > --- a/backend/src/backend/gen_insn_selection.cpp
> > +++ b/backend/src/backend/gen_insn_selection.cpp
> > @@ -2140,7 +2140,7 @@ namespace gbe
> >      this->opaque->setLongRegRestrict(true);
> >      this->opaque->setSlowByteGather(true);
> >      this->opaque->setHasHalfType(true);
> > -    opt_features = SIOF_OP_AND_LOGICAL_SRCMOD;
> > +    opt_features = SIOF_OP_AND_LOGICAL_SRCMOD |
> > + SIOF_OP_MOV_LONG_REG_RESTRICT;
> >    }
> >
> >    Selection9::Selection9(GenContext &ctx) : Selection(ctx) { diff
> > --git a/backend/src/backend/gen_insn_selection.hpp
> > b/backend/src/backend/gen_insn_selection.hpp
> > index f313786..4efb80b 100644
> > --- a/backend/src/backend/gen_insn_selection.hpp
> > +++ b/backend/src/backend/gen_insn_selection.hpp
> > @@ -230,6 +230,8 @@ namespace gbe
> >      //for OP_AND, on BDW+, SrcMod value indicates a logical source
> modifier
> >      //            on PRE-BDW, SrcMod value indicates a numeric source modifier
> >      SIOF_OP_AND_LOGICAL_SRCMOD = 1 << 0,
> > +    //for OP_MOV, on BSW, for long data type, src and dst hstride
> > + must be
> > aligned to the same qword
> > +    SIOF_OP_MOV_LONG_REG_RESTRICT = 1 << 1,
> >    };
> >
> >    /*! Owns the selection engine */
> > diff --git a/backend/src/backend/gen_insn_selection_optimize.cpp
> > b/backend/src/backend/gen_insn_selection_optimize.cpp
> > index 939dd18..fffc8b0 100644
> > --- a/backend/src/backend/gen_insn_selection_optimize.cpp
> > +++ b/backend/src/backend/gen_insn_selection_optimize.cpp
> > @@ -30,7 +30,7 @@ namespace gbe
> >          elements |= (1 << offsetInType);
> Is 32bits for elements enough? For example simd16 and hstride is 4.
> 
> >          offsetInByte += hstride * elementSize;
> >        }
> > -      offsetInByte += vstride * elementSize;
> > +      base += vstride * elementSize;
> >      }
> >      return elements;
> >    }
> > @@ -174,6 +174,12 @@ namespace gbe
> >        if (insn.opcode == SEL_OP_AND && (info->replacement.absolute ||
> > info-
> > >replacement.negation))
> >          return false;
> >
> > +    if (features & SIOF_OP_MOV_LONG_REG_RESTRICT && insn.opcode ==
> > SEL_OP_MOV) {
> > +      const GenRegister& dst = insn.dst(0);
> > +      if (dst.isint64() && !info->replacement.isint64() &&
> > + info->elements !=
> > CalculateElements(info->replacement, insn.state.execWidth))
> > +        return false;
> > +    }
> > +
> >      if (info->replacementOverwritten)
> >        return false;
> >
> > --
> > 1.9.1
> >
> > _______________________________________________
> > Beignet mailing list
> > Beignet at lists.freedesktop.org
> > http://lists.freedesktop.org/mailman/listinfo/beignet
> _______________________________________________
> Beignet mailing list
> Beignet at lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/beignet


More information about the Beignet mailing list