[Beignet] [PATCH 2/3] support 64bit-integer AND(&), OR(|), XOR(^) arithmetic

Xing, Homer homer.xing at intel.com
Tue Aug 6 17:26:54 PDT 2013


Bspec says, nibControl + quarterControl can be used when the real execSize is four. For example, when execSize is eight and stride is two, the real execSize is four.

Concentrating different function calls into one place is difficult. But I can change the four times repeated code inside each "case SEL_OP_XX" into a "for-loop".

From: zhigang gong [mailto:zhigang.gong at gmail.com]
Sent: Tuesday, August 6, 2013 4:38 PM
To: Xing, Homer
Cc: beignet at lists.freedesktop.org
Subject: Re: [Beignet] [PATCH 2/3] support 64bit-integer AND(&), OR(|), XOR(^) arithmetic

Sorry that, the mutt crashed and sent out an incomplete email. Here is the complete version:

Homer,

The logical of handling Long/ULong for many operations are very similar to each other. If you can concentrate
all of them into one place, then it will avoid to write duplicate code and it will reduce the maintenance effort.

And even in the same function, the logical seems repeat four times to handle each 4 elements. The different
is the register's offset and the quarterControl and nibCotnrol.

Another interesting issue is which I ignored in your first patch which is about the nibControl usage.
It seems you are using quarterControl + nibControl to handle predication for  a DW type with a
2-element-horizontal. I haven't found anything in the Gen spec which support this type of usage.
>From the spec, the only use case for nibControl is when the destination is a DF type. Right?
Any comments here?

On Tue, Aug 6, 2013 at 4:16 PM, Zhigang Gong <zhigang.gong at gmail.com<mailto:zhigang.gong at gmail.com>> wrote:
Homer,

The logical of handling Long/ULong for many operations are
very similar to each other.

handle
On Tue, Aug 06, 2013 at 04:01:30PM +0800, Homer Hsing wrote:
>
> Signed-off-by: Homer Hsing <homer.xing at intel.com<mailto:homer.xing at intel.com>>
> ---
>  backend/src/backend/gen_context.cpp        | 102 +++++++++++++++++++++++++++++
>  backend/src/backend/gen_insn_selection.cpp |  24 ++++++-
>  backend/src/backend/gen_insn_selection.hxx |   3 +
>  backend/src/ir/instruction.cpp             |   1 +
>  4 files changed, 127 insertions(+), 3 deletions(-)
>
> diff --git a/backend/src/backend/gen_context.cpp b/backend/src/backend/gen_context.cpp
> index 69dab85..bbe16d0 100644
> --- a/backend/src/backend/gen_context.cpp
> +++ b/backend/src/backend/gen_context.cpp
> @@ -162,6 +162,108 @@ namespace gbe
>        case SEL_OP_AND:  p->AND(dst, src0, src1); break;
>        case SEL_OP_OR:   p->OR (dst, src0, src1);  break;
>        case SEL_OP_XOR:  p->XOR(dst, src0, src1); break;
> +      case SEL_OP_I64AND:
> +        {
> +          GenRegister xdst = GenRegister::retype(dst, GEN_TYPE_UL),
> +                      xsrc0 = GenRegister::retype(src0, GEN_TYPE_UL),
> +                      xsrc1 = GenRegister::retype(src1, GEN_TYPE_UL);
> +          int execWidth = p->curr.execWidth;
> +          p->push();
> +          p->curr.execWidth = 8;
> +          p->AND(xdst.bottom_half(), xsrc0.bottom_half(), xsrc1.bottom_half());
> +          p->AND(xdst.top_half(), xsrc0.top_half(), xsrc1.top_half());
> +          p->curr.nibControl = 1;
> +          xdst = GenRegister::suboffset(xdst, 4),
> +          xsrc0 = GenRegister::suboffset(xsrc0, 4),
> +          xsrc1 = GenRegister::suboffset(xsrc1, 4);
> +          p->AND(xdst.bottom_half(), xsrc0.bottom_half(), xsrc1.bottom_half());
> +          p->AND(xdst.top_half(), xsrc0.top_half(), xsrc1.top_half());
> +          if (execWidth == 16) {
> +            p->curr.quarterControl = 1;
> +            p->curr.nibControl = 0;
> +            xdst = GenRegister::suboffset(xdst, 4),
> +            xsrc0 = GenRegister::suboffset(xsrc0, 4),
> +            xsrc1 = GenRegister::suboffset(xsrc1, 4);
> +            p->AND(xdst.bottom_half(), xsrc0.bottom_half(), xsrc1.bottom_half());
> +            p->AND(xdst.top_half(), xsrc0.top_half(), xsrc1.top_half());
> +            p->curr.nibControl = 1;
> +            xdst = GenRegister::suboffset(xdst, 4),
> +            xsrc0 = GenRegister::suboffset(xsrc0, 4),
> +            xsrc1 = GenRegister::suboffset(xsrc1, 4);
> +            p->AND(xdst.bottom_half(), xsrc0.bottom_half(), xsrc1.bottom_half());
> +            p->AND(xdst.top_half(), xsrc0.top_half(), xsrc1.top_half());
> +          }
> +          p->pop();
> +        }
> +        break;
> +      case SEL_OP_I64OR:
> +        {
> +          GenRegister xdst = GenRegister::retype(dst, GEN_TYPE_UL),
> +                      xsrc0 = GenRegister::retype(src0, GEN_TYPE_UL),
> +                      xsrc1 = GenRegister::retype(src1, GEN_TYPE_UL);
> +          int execWidth = p->curr.execWidth;
> +          p->push();
> +          p->curr.execWidth = 8;
> +          p->OR(xdst.bottom_half(), xsrc0.bottom_half(), xsrc1.bottom_half());
> +          p->OR(xdst.top_half(), xsrc0.top_half(), xsrc1.top_half());
> +          p->curr.nibControl = 1;
> +          xdst = GenRegister::suboffset(xdst, 4),
> +          xsrc0 = GenRegister::suboffset(xsrc0, 4),
> +          xsrc1 = GenRegister::suboffset(xsrc1, 4);
> +          p->OR(xdst.bottom_half(), xsrc0.bottom_half(), xsrc1.bottom_half());
> +          p->OR(xdst.top_half(), xsrc0.top_half(), xsrc1.top_half());
> +          if (execWidth == 16) {
> +            p->curr.quarterControl = 1;
> +            p->curr.nibControl = 0;
> +            xdst = GenRegister::suboffset(xdst, 4),
> +            xsrc0 = GenRegister::suboffset(xsrc0, 4),
> +            xsrc1 = GenRegister::suboffset(xsrc1, 4);
> +            p->OR(xdst.bottom_half(), xsrc0.bottom_half(), xsrc1.bottom_half());
> +            p->OR(xdst.top_half(), xsrc0.top_half(), xsrc1.top_half());
> +            p->curr.nibControl = 1;
> +            xdst = GenRegister::suboffset(xdst, 4),
> +            xsrc0 = GenRegister::suboffset(xsrc0, 4),
> +            xsrc1 = GenRegister::suboffset(xsrc1, 4);
> +            p->OR(xdst.bottom_half(), xsrc0.bottom_half(), xsrc1.bottom_half());
> +            p->OR(xdst.top_half(), xsrc0.top_half(), xsrc1.top_half());
> +          }
> +          p->pop();
> +        }
> +        break;
> +      case SEL_OP_I64XOR:
> +        {
> +          GenRegister xdst = GenRegister::retype(dst, GEN_TYPE_UL),
> +                      xsrc0 = GenRegister::retype(src0, GEN_TYPE_UL),
> +                      xsrc1 = GenRegister::retype(src1, GEN_TYPE_UL);
> +          int execWidth = p->curr.execWidth;
> +          p->push();
> +          p->curr.execWidth = 8;
> +          p->XOR(xdst.bottom_half(), xsrc0.bottom_half(), xsrc1.bottom_half());
> +          p->XOR(xdst.top_half(), xsrc0.top_half(), xsrc1.top_half());
> +          p->curr.nibControl = 1;
> +          xdst = GenRegister::suboffset(xdst, 4),
> +          xsrc0 = GenRegister::suboffset(xsrc0, 4),
> +          xsrc1 = GenRegister::suboffset(xsrc1, 4);
> +          p->XOR(xdst.bottom_half(), xsrc0.bottom_half(), xsrc1.bottom_half());
> +          p->XOR(xdst.top_half(), xsrc0.top_half(), xsrc1.top_half());
> +          if (execWidth == 16) {
> +            p->curr.quarterControl = 1;
> +            p->curr.nibControl = 0;
> +            xdst = GenRegister::suboffset(xdst, 4),
> +            xsrc0 = GenRegister::suboffset(xsrc0, 4),
> +            xsrc1 = GenRegister::suboffset(xsrc1, 4);
> +            p->XOR(xdst.bottom_half(), xsrc0.bottom_half(), xsrc1.bottom_half());
> +            p->XOR(xdst.top_half(), xsrc0.top_half(), xsrc1.top_half());
> +            p->curr.nibControl = 1;
> +            xdst = GenRegister::suboffset(xdst, 4),
> +            xsrc0 = GenRegister::suboffset(xsrc0, 4),
> +            xsrc1 = GenRegister::suboffset(xsrc1, 4);
> +            p->XOR(xdst.bottom_half(), xsrc0.bottom_half(), xsrc1.bottom_half());
> +            p->XOR(xdst.top_half(), xsrc0.top_half(), xsrc1.top_half());
> +          }
> +          p->pop();
> +        }
> +        break;
>        case SEL_OP_SHR:  p->SHR(dst, src0, src1); break;
>        case SEL_OP_SHL:  p->SHL(dst, src0, src1); break;
>        case SEL_OP_RSR:  p->RSR(dst, src0, src1); break;
> diff --git a/backend/src/backend/gen_insn_selection.cpp b/backend/src/backend/gen_insn_selection.cpp
> index 66cfa31..7e9402d 100644
> --- a/backend/src/backend/gen_insn_selection.cpp
> +++ b/backend/src/backend/gen_insn_selection.cpp
> @@ -423,6 +423,9 @@ namespace gbe
>      ALU2(AND)
>      ALU2(OR)
>      ALU2(XOR)
> +    ALU2(I64AND)
> +    ALU2(I64OR)
> +    ALU2(I64XOR)
>      ALU2(SHR)
>      ALU2(SHL)
>      ALU2(RSR)
> @@ -1434,9 +1437,24 @@ namespace gbe
>              sel.ADD(dst, src0, src1);
>            sel.pop();
>            break;
> -        case OP_XOR: sel.XOR(dst, src0, src1); break;
> -        case OP_OR:  sel.OR(dst, src0,  src1); break;
> -        case OP_AND: sel.AND(dst, src0, src1); break;
> +        case OP_XOR:
> +          if (type == Type::TYPE_U64 || type == Type::TYPE_S64)
> +            sel.I64XOR(dst, src0, src1);
> +          else
> +            sel.XOR(dst, src0, src1);
> +          break;
> +        case OP_OR:
> +          if (type == Type::TYPE_U64 || type == Type::TYPE_S64)
> +            sel.I64OR(dst, src0, src1);
> +          else
> +            sel.OR(dst, src0, src1);
> +          break;
> +        case OP_AND:
> +          if (type == Type::TYPE_U64 || type == Type::TYPE_S64)
> +            sel.I64AND(dst, src0, src1);
> +          else
> +            sel.AND(dst, src0, src1);
> +          break;
>          case OP_SUB:
>            if (type == Type::TYPE_U64 || type == Type::TYPE_S64) {
>              GenRegister t = sel.selReg(sel.reg(RegisterFamily::FAMILY_QWORD), Type::TYPE_S64);
> diff --git a/backend/src/backend/gen_insn_selection.hxx b/backend/src/backend/gen_insn_selection.hxx
> index 8eeb19f..7664c8f 100644
> --- a/backend/src/backend/gen_insn_selection.hxx
> +++ b/backend/src/backend/gen_insn_selection.hxx
> @@ -14,6 +14,9 @@ DECL_SELECTION_IR(SEL, BinaryInstruction)
>  DECL_SELECTION_IR(AND, BinaryInstruction)
>  DECL_SELECTION_IR(OR, BinaryInstruction)
>  DECL_SELECTION_IR(XOR, BinaryInstruction)
> +DECL_SELECTION_IR(I64AND, BinaryInstruction)
> +DECL_SELECTION_IR(I64OR, BinaryInstruction)
> +DECL_SELECTION_IR(I64XOR, BinaryInstruction)
>  DECL_SELECTION_IR(SHR, BinaryInstruction)
>  DECL_SELECTION_IR(SHL, BinaryInstruction)
>  DECL_SELECTION_IR(RSR, BinaryInstruction)
> diff --git a/backend/src/ir/instruction.cpp b/backend/src/ir/instruction.cpp
> index 2589848..f58757b 100644
> --- a/backend/src/ir/instruction.cpp
> +++ b/backend/src/ir/instruction.cpp
> @@ -672,6 +672,7 @@ namespace ir {
>      static const Type logicalType[] = {TYPE_S8,  TYPE_U8,
>                                         TYPE_S16, TYPE_U16,
>                                         TYPE_S32, TYPE_U32,
> +                                       TYPE_S64, TYPE_U64,
>                                         TYPE_BOOL};
>      static const uint32_t logicalTypeNum = ARRAY_ELEM_NUM(logicalType);
>
> --
> 1.8.1.2
>
> _______________________________________________
> Beignet mailing list
> Beignet at lists.freedesktop.org<mailto:Beignet at lists.freedesktop.org>
> http://lists.freedesktop.org/mailman/listinfo/beignet

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.freedesktop.org/archives/beignet/attachments/20130807/cb26631b/attachment-0001.html>


More information about the Beignet mailing list