[Beignet] [PATCH] fix a long relative regreesion issue on BSW caused by local copy propagation
Yang, Rong R
rong.r.yang at intel.com
Wed Oct 21 01:59:40 PDT 2015
Yes, if one instruction span several registers, the other registers' visit pattern is same as first register if the vstride is normal(width * hstride).
And I notify you don't take care of flag register, does it safe to ignore the flag register?
Anyway, I will push it first.
> -----Original Message-----
> From: Guo, Yejun
> Sent: Monday, October 19, 2015 20:19
> To: Guo, Yejun; Yang, Rong R; beignet at lists.freedesktop.org
> Subject: RE: [Beignet] [PATCH] fix a long relative regreesion issue on BSW
> caused by local copy propagation
>
> I have another idea, since we allocate the registers in a regular manner, I
> guess the 32bit element can represent the whole register, it just repeats
> regularly. Is there any case not meet my assumption? If no, we can keep the
> current code but add the comments.
>
> -----Original Message-----
> From: Beignet [mailto:beignet-bounces at lists.freedesktop.org] On Behalf Of
> Guo, Yejun
> Sent: Monday, October 19, 2015 4:48 PM
> To: Yang, Rong R; beignet at lists.freedesktop.org
> Subject: Re: [Beignet] [PATCH] fix a long relative regreesion issue on BSW
> caused by local copy propagation
>
> Yes, simd16 + hstrid 4 is not covered, and just get a worse case for vstrid 128,
> 64bit element is not enough too. Will we allocate such register? If yes, I need
> to wrap it in a structure instead of a uint32/uint64.
>
> -----Original Message-----
> From: Yang, Rong R
> Sent: Monday, October 19, 2015 4:38 PM
> To: Guo, Yejun; beignet at lists.freedesktop.org
> Cc: Guo, Yejun
> Subject: RE: [Beignet] [PATCH] fix a long relative regreesion issue on BSW
> caused by local copy propagation
>
> One comment about CalculateElements.
>
> > -----Original Message-----
> > From: Beignet [mailto:beignet-bounces at lists.freedesktop.org] On Behalf
> > Of Guo Yejun
> > Sent: Thursday, October 15, 2015 9:48
> > To: beignet at lists.freedesktop.org
> > Cc: Guo, Yejun
> > Subject: [Beignet] [PATCH] fix a long relative regreesion issue on BSW
> > caused by local copy propagation
> >
> > on bsw, for long data type, src and dst hstride must be aligned to the
> > same qword, so disable the copy propagation for following IRs.
> > MOV(8) %62<2>:UD : %34<8,8,1>:UD
> > MOV(8) %42<1>:L : %62<8,4,2>:UD
> >
> > and also fix a typo issue in function CalculateElements
> >
> > Signed-off-by: Guo Yejun <yejun.guo at intel.com>
> > ---
> > backend/src/backend/gen_insn_selection.cpp | 2 +-
> > backend/src/backend/gen_insn_selection.hpp | 2 ++
> > backend/src/backend/gen_insn_selection_optimize.cpp | 8 +++++++-
> > 3 files changed, 10 insertions(+), 2 deletions(-)
> >
> > diff --git a/backend/src/backend/gen_insn_selection.cpp
> > b/backend/src/backend/gen_insn_selection.cpp
> > index 9714e3e..5125eb3 100644
> > --- a/backend/src/backend/gen_insn_selection.cpp
> > +++ b/backend/src/backend/gen_insn_selection.cpp
> > @@ -2140,7 +2140,7 @@ namespace gbe
> > this->opaque->setLongRegRestrict(true);
> > this->opaque->setSlowByteGather(true);
> > this->opaque->setHasHalfType(true);
> > - opt_features = SIOF_OP_AND_LOGICAL_SRCMOD;
> > + opt_features = SIOF_OP_AND_LOGICAL_SRCMOD |
> > + SIOF_OP_MOV_LONG_REG_RESTRICT;
> > }
> >
> > Selection9::Selection9(GenContext &ctx) : Selection(ctx) { diff
> > --git a/backend/src/backend/gen_insn_selection.hpp
> > b/backend/src/backend/gen_insn_selection.hpp
> > index f313786..4efb80b 100644
> > --- a/backend/src/backend/gen_insn_selection.hpp
> > +++ b/backend/src/backend/gen_insn_selection.hpp
> > @@ -230,6 +230,8 @@ namespace gbe
> > //for OP_AND, on BDW+, SrcMod value indicates a logical source
> modifier
> > // on PRE-BDW, SrcMod value indicates a numeric source modifier
> > SIOF_OP_AND_LOGICAL_SRCMOD = 1 << 0,
> > + //for OP_MOV, on BSW, for long data type, src and dst hstride
> > + must be
> > aligned to the same qword
> > + SIOF_OP_MOV_LONG_REG_RESTRICT = 1 << 1,
> > };
> >
> > /*! Owns the selection engine */
> > diff --git a/backend/src/backend/gen_insn_selection_optimize.cpp
> > b/backend/src/backend/gen_insn_selection_optimize.cpp
> > index 939dd18..fffc8b0 100644
> > --- a/backend/src/backend/gen_insn_selection_optimize.cpp
> > +++ b/backend/src/backend/gen_insn_selection_optimize.cpp
> > @@ -30,7 +30,7 @@ namespace gbe
> > elements |= (1 << offsetInType);
> Is 32bits for elements enough? For example simd16 and hstride is 4.
>
> > offsetInByte += hstride * elementSize;
> > }
> > - offsetInByte += vstride * elementSize;
> > + base += vstride * elementSize;
> > }
> > return elements;
> > }
> > @@ -174,6 +174,12 @@ namespace gbe
> > if (insn.opcode == SEL_OP_AND && (info->replacement.absolute ||
> > info-
> > >replacement.negation))
> > return false;
> >
> > + if (features & SIOF_OP_MOV_LONG_REG_RESTRICT && insn.opcode ==
> > SEL_OP_MOV) {
> > + const GenRegister& dst = insn.dst(0);
> > + if (dst.isint64() && !info->replacement.isint64() &&
> > + info->elements !=
> > CalculateElements(info->replacement, insn.state.execWidth))
> > + return false;
> > + }
> > +
> > if (info->replacementOverwritten)
> > return false;
> >
> > --
> > 1.9.1
> >
> > _______________________________________________
> > Beignet mailing list
> > Beignet at lists.freedesktop.org
> > http://lists.freedesktop.org/mailman/listinfo/beignet
> _______________________________________________
> Beignet mailing list
> Beignet at lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/beignet
More information about the Beignet
mailing list