<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=us-ascii">
<meta name="Generator" content="Microsoft Word 15 (filtered medium)">
<style><!--
/* Font Definitions */
@font-face
        {font-family:SimSun;
        panose-1:2 1 6 0 3 1 1 1 1 1;}
@font-face
        {font-family:"Cambria Math";
        panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
        {font-family:Calibri;
        panose-1:2 15 5 2 2 2 4 3 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
        {margin:0cm;
        margin-bottom:.0001pt;
        font-size:12.0pt;
        font-family:"Times New Roman","serif";}
a:link, span.MsoHyperlink
        {mso-style-priority:99;
        color:blue;
        text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
        {mso-style-priority:99;
        color:purple;
        text-decoration:underline;}
span.EmailStyle17
        {mso-style-type:personal-reply;
        font-family:"Calibri","sans-serif";
        color:#1F497D;}
.MsoChpDefault
        {mso-style-type:export-only;
        font-family:"Calibri","sans-serif";}
@page WordSection1
        {size:612.0pt 792.0pt;
        margin:72.0pt 90.0pt 72.0pt 90.0pt;}
div.WordSection1
        {page:WordSection1;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]-->
</head>
<body lang="ZH-CN" link="blue" vlink="purple">
<div class="WordSection1">
<p class="MsoNormal"><span lang="EN-US" style="font-size:10.5pt;font-family:"Calibri","sans-serif";color:#1F497D">The Bspec says that nibControl + quarterControl can be used when the real execSize is four. For example, when execSize is eight and the stride is two, the
 real execSize is four.<o:p></o:p></span></p>
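<p class="MsoNormal"><span lang="EN-US" style="font-size:10.5pt;font-family:"Calibri","sans-serif";color:#1F497D">A minimal sketch of the arithmetic in that example. It assumes the real execSize is simply execSize divided by the stride, which is generalized from this one example and is not a quote from the Bspec; realExecSize is a hypothetical helper name, not a Beignet function.<o:p></o:p></span></p>

```cpp
// Hypothetical helper, not a Beignet or Bspec definition.
// Assumption: the "real" execSize is execSize divided by the stride,
// generalized from the single example given in the text.
constexpr int realExecSize(int execSize, int stride) {
  return execSize / stride;
}

// The example from the text: execSize 8 with stride 2 gives 4.
static_assert(realExecSize(8, 2) == 4, "execSize 8, stride 2");
```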
<p class="MsoNormal"><a name="_MailEndCompose"><span lang="EN-US" style="font-size:10.5pt;font-family:"Calibri","sans-serif";color:#1F497D"><o:p> </o:p></span></a></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:10.5pt;font-family:"Calibri","sans-serif";color:#1F497D">Consolidating the different function calls into one place is difficult. But I can replace the code that is repeated four times inside each “case SEL_OP_XX”
 with a “for-loop”.<o:p></o:p></span></p>
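<p class="MsoNormal"><span lang="EN-US" style="font-size:10.5pt;font-family:"Calibri","sans-serif";color:#1F497D">A rough sketch of what such a loop could compute per iteration, as a standalone model rather than Beignet code. Chunk, chunkAt and chunkCount are hypothetical names; the values are read off the repeated blocks in the patch, where each block advances the registers by 4 elements and steps nibControl and quarterControl.<o:p></o:p></span></p>

```cpp
// Standalone model of the proposed loop. Chunk, chunkAt and chunkCount are
// hypothetical names for illustration only, not Beignet APIs.
struct Chunk {
  int quarterControl;  // value assigned to p->curr.quarterControl
  int nibControl;      // value assigned to p->curr.nibControl
  int subOffset;       // total element offset from the original register
};

// Settings for the i-th group of four 64-bit elements, matching the
// sequence in the patch: (0,0,0), (0,1,4), (1,0,8), (1,1,12).
constexpr Chunk chunkAt(int i) {
  return Chunk{i / 2, i % 2, 4 * i};
}

// Two groups for execWidth 8, four groups for execWidth 16.
constexpr int chunkCount(int execWidth) {
  return execWidth / 4;
}

// Inside each "case SEL_OP_I64xx" the body might then look like this
// (untested; shown for AND only):
//   for (int i = 0; i < chunkCount(execWidth); i++) {
//     Chunk c = chunkAt(i);
//     p->curr.quarterControl = c.quarterControl;
//     p->curr.nibControl = c.nibControl;
//     GenRegister d  = GenRegister::suboffset(xdst,  c.subOffset);
//     GenRegister s0 = GenRegister::suboffset(xsrc0, c.subOffset);
//     GenRegister s1 = GenRegister::suboffset(xsrc1, c.subOffset);
//     p->AND(d.bottom_half(), s0.bottom_half(), s1.bottom_half());
//     p->AND(d.top_half(), s0.top_half(), s1.top_half());
//   }
```

<p class="MsoNormal"><span lang="EN-US" style="font-size:10.5pt;font-family:"Calibri","sans-serif";color:#1F497D">Note that the patch calls suboffset cumulatively (4 more each time), while this sketch applies the total offset to the original registers; the two should address the same elements.<o:p></o:p></span></p>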
<p class="MsoNormal"><span lang="EN-US" style="font-size:10.5pt;font-family:"Calibri","sans-serif";color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><b><span lang="EN-US" style="font-size:11.0pt;font-family:"Calibri","sans-serif"">From:</span></b><span lang="EN-US" style="font-size:11.0pt;font-family:"Calibri","sans-serif""> zhigang gong [mailto:zhigang.gong@gmail.com]
<br>
<b>Sent:</b> Tuesday, August 6, 2013 4:38 PM<br>
<b>To:</b> Xing, Homer<br>
<b>Cc:</b> beignet@lists.freedesktop.org<br>
<b>Subject:</b> Re: [Beignet] [PATCH 2/3] support 64bit-integer AND(&), OR(|), XOR(^) arithmetic<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US"><o:p> </o:p></span></p>
<div>
<p class="MsoNormal"><span lang="EN-US">Sorry, mutt crashed and sent out an incomplete email. Here is the complete version:<o:p></o:p></span></p>
<div>
<p class="MsoNormal"><span lang="EN-US"><o:p> </o:p></span></p>
</div>
<div>
<p class="MsoNormal"><span lang="EN-US">Homer,<o:p></o:p></span></p>
</div>
<div>
<p class="MsoNormal"><span lang="EN-US"><o:p> </o:p></span></p>
</div>
<div>
<p class="MsoNormal"><span lang="EN-US">The logic for handling Long/ULong is very similar across many operations. If you can consolidate<o:p></o:p></span></p>
</div>
<div>
<p class="MsoNormal"><span lang="EN-US">all of it into one place, you will avoid writing duplicate code and reduce the maintenance effort.<o:p></o:p></span></p>
</div>
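<div>
<p class="MsoNormal"><span lang="EN-US">A minimal sketch of the consolidation idea, modeled on plain integers rather than on GenRegister. It assumes that bottom_half() and top_half() in the patch select the low and high 32 bits of each 64-bit element; Op32 and logical64 are hypothetical names, not Beignet APIs. The point is only that AND, OR and XOR can share one helper that takes the 32-bit operation as a parameter.<o:p></o:p></span></p>
</div>

```cpp
// Standalone model, not Beignet code: all names here are hypothetical.
typedef unsigned int u32;
typedef unsigned long long u64;

// The three 32-bit operations that differ between the cases.
constexpr u32 and32(u32 a, u32 b) { return a & b; }
constexpr u32 or32(u32 a, u32 b) { return a | b; }
constexpr u32 xor32(u32 a, u32 b) { return a ^ b; }

typedef u32 (*Op32)(u32, u32);

// Shared helper: apply the same 32-bit op to the high and low halves
// of the 64-bit operands and recombine the two results.
constexpr u64 logical64(Op32 op, u64 a, u64 b) {
  return ((u64)op((u32)(a >> 32), (u32)(b >> 32)) << 32) |
         (u64)op((u32)a, (u32)b);
}

// Each result equals the native 64-bit operation on the same inputs.
static_assert(logical64(and32, 0xFFFF0000FFFF0000ULL, 0x0F0F0F0F0F0F0F0FULL)
                  == 0x0F0F00000F0F0000ULL, "and");
static_assert(logical64(xor32, 0xFFFF0000FFFF0000ULL, 0x0F0F0F0F0F0F0F0FULL)
                  == 0xF0F00F0FF0F00F0FULL, "xor");
```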
<div>
<p class="MsoNormal"><span lang="EN-US"><o:p> </o:p></span></p>
</div>
<div>
<p class="MsoNormal"><span lang="EN-US">And even within the same function, the logic seems to be repeated four times, once for each group of 4 elements. The only differences<o:p></o:p></span></p>
</div>
<div>
<p class="MsoNormal"><span lang="EN-US">are the register offset, the quarterControl, and the nibControl.<o:p></o:p></span></p>
</div>
<div>
<p class="MsoNormal"><span lang="EN-US"><o:p> </o:p></span></p>
</div>
<div>
<p class="MsoNormal"><span lang="EN-US">Another interesting issue, which I overlooked in your first patch, is the nibControl usage.<o:p></o:p></span></p>
</div>
<div>
<p class="MsoNormal"><span lang="EN-US">It seems you are using quarterControl + nibControl to handle predication for a DW type with a<o:p></o:p></span></p>
</div>
<div>
<p class="MsoNormal"><span lang="EN-US">2-element horizontal stride. I haven't found anything in the Gen spec that supports this type of usage.<o:p></o:p></span></p>
</div>
<div>
<p class="MsoNormal"><span lang="EN-US">From the spec, the only use case for nibControl is when the destination is a DF type. Right?<o:p></o:p></span></p>
</div>
<div>
<p class="MsoNormal"><span lang="EN-US">Any comments here? <o:p></o:p></span></p>
</div>
</div>
<div>
<p class="MsoNormal" style="margin-bottom:12.0pt"><span lang="EN-US"><o:p> </o:p></span></p>
<div>
<p class="MsoNormal"><span lang="EN-US">On Tue, Aug 6, 2013 at 4:16 PM, Zhigang Gong <<a href="mailto:zhigang.gong@gmail.com" target="_blank">zhigang.gong@gmail.com</a>> wrote:<o:p></o:p></span></p>
<blockquote style="border:none;border-left:solid #CCCCCC 1.0pt;padding:0cm 0cm 0cm 6.0pt;margin-left:4.8pt;margin-right:0cm">
<p class="MsoNormal"><span lang="EN-US">Homer,<br>
<br>
The logical of handling Long/ULong for many operations are<br>
very similar to each other.<br>
<br>
handle<o:p></o:p></span></p>
<div>
<div>
<p class="MsoNormal"><span lang="EN-US">On Tue, Aug 06, 2013 at 04:01:30PM +0800, Homer Hsing wrote:<br>
><br>
> Signed-off-by: Homer Hsing <<a href="mailto:homer.xing@intel.com">homer.xing@intel.com</a>><br>
> ---<br>
>  backend/src/backend/gen_context.cpp        | 102 +++++++++++++++++++++++++++++<br>
>  backend/src/backend/gen_insn_selection.cpp |  24 ++++++-<br>
>  backend/src/backend/gen_insn_selection.hxx |   3 +<br>
>  backend/src/ir/instruction.cpp             |   1 +<br>
>  4 files changed, 127 insertions(+), 3 deletions(-)<br>
><br>
> diff --git a/backend/src/backend/gen_context.cpp b/backend/src/backend/gen_context.cpp<br>
> index 69dab85..bbe16d0 100644<br>
> --- a/backend/src/backend/gen_context.cpp<br>
> +++ b/backend/src/backend/gen_context.cpp<br>
> @@ -162,6 +162,108 @@ namespace gbe<br>
>        case SEL_OP_AND:  p->AND(dst, src0, src1); break;<br>
>        case SEL_OP_OR:   p->OR (dst, src0, src1);  break;<br>
>        case SEL_OP_XOR:  p->XOR(dst, src0, src1); break;<br>
> +      case SEL_OP_I64AND:<br>
> +        {<br>
> +          GenRegister xdst = GenRegister::retype(dst, GEN_TYPE_UL),<br>
> +                      xsrc0 = GenRegister::retype(src0, GEN_TYPE_UL),<br>
> +                      xsrc1 = GenRegister::retype(src1, GEN_TYPE_UL);<br>
> +          int execWidth = p->curr.execWidth;<br>
> +          p->push();<br>
> +          p->curr.execWidth = 8;<br>
> +          p->AND(xdst.bottom_half(), xsrc0.bottom_half(), xsrc1.bottom_half());<br>
> +          p->AND(xdst.top_half(), xsrc0.top_half(), xsrc1.top_half());<br>
> +          p->curr.nibControl = 1;<br>
> +          xdst = GenRegister::suboffset(xdst, 4),<br>
> +          xsrc0 = GenRegister::suboffset(xsrc0, 4),<br>
> +          xsrc1 = GenRegister::suboffset(xsrc1, 4);<br>
> +          p->AND(xdst.bottom_half(), xsrc0.bottom_half(), xsrc1.bottom_half());<br>
> +          p->AND(xdst.top_half(), xsrc0.top_half(), xsrc1.top_half());<br>
> +          if (execWidth == 16) {<br>
> +            p->curr.quarterControl = 1;<br>
> +            p->curr.nibControl = 0;<br>
> +            xdst = GenRegister::suboffset(xdst, 4),<br>
> +            xsrc0 = GenRegister::suboffset(xsrc0, 4),<br>
> +            xsrc1 = GenRegister::suboffset(xsrc1, 4);<br>
> +            p->AND(xdst.bottom_half(), xsrc0.bottom_half(), xsrc1.bottom_half());<br>
> +            p->AND(xdst.top_half(), xsrc0.top_half(), xsrc1.top_half());<br>
> +            p->curr.nibControl = 1;<br>
> +            xdst = GenRegister::suboffset(xdst, 4),<br>
> +            xsrc0 = GenRegister::suboffset(xsrc0, 4),<br>
> +            xsrc1 = GenRegister::suboffset(xsrc1, 4);<br>
> +            p->AND(xdst.bottom_half(), xsrc0.bottom_half(), xsrc1.bottom_half());<br>
> +            p->AND(xdst.top_half(), xsrc0.top_half(), xsrc1.top_half());<br>
> +          }<br>
> +          p->pop();<br>
> +        }<br>
> +        break;<br>
> +      case SEL_OP_I64OR:<br>
> +        {<br>
> +          GenRegister xdst = GenRegister::retype(dst, GEN_TYPE_UL),<br>
> +                      xsrc0 = GenRegister::retype(src0, GEN_TYPE_UL),<br>
> +                      xsrc1 = GenRegister::retype(src1, GEN_TYPE_UL);<br>
> +          int execWidth = p->curr.execWidth;<br>
> +          p->push();<br>
> +          p->curr.execWidth = 8;<br>
> +          p->OR(xdst.bottom_half(), xsrc0.bottom_half(), xsrc1.bottom_half());<br>
> +          p->OR(xdst.top_half(), xsrc0.top_half(), xsrc1.top_half());<br>
> +          p->curr.nibControl = 1;<br>
> +          xdst = GenRegister::suboffset(xdst, 4),<br>
> +          xsrc0 = GenRegister::suboffset(xsrc0, 4),<br>
> +          xsrc1 = GenRegister::suboffset(xsrc1, 4);<br>
> +          p->OR(xdst.bottom_half(), xsrc0.bottom_half(), xsrc1.bottom_half());<br>
> +          p->OR(xdst.top_half(), xsrc0.top_half(), xsrc1.top_half());<br>
> +          if (execWidth == 16) {<br>
> +            p->curr.quarterControl = 1;<br>
> +            p->curr.nibControl = 0;<br>
> +            xdst = GenRegister::suboffset(xdst, 4),<br>
> +            xsrc0 = GenRegister::suboffset(xsrc0, 4),<br>
> +            xsrc1 = GenRegister::suboffset(xsrc1, 4);<br>
> +            p->OR(xdst.bottom_half(), xsrc0.bottom_half(), xsrc1.bottom_half());<br>
> +            p->OR(xdst.top_half(), xsrc0.top_half(), xsrc1.top_half());<br>
> +            p->curr.nibControl = 1;<br>
> +            xdst = GenRegister::suboffset(xdst, 4),<br>
> +            xsrc0 = GenRegister::suboffset(xsrc0, 4),<br>
> +            xsrc1 = GenRegister::suboffset(xsrc1, 4);<br>
> +            p->OR(xdst.bottom_half(), xsrc0.bottom_half(), xsrc1.bottom_half());<br>
> +            p->OR(xdst.top_half(), xsrc0.top_half(), xsrc1.top_half());<br>
> +          }<br>
> +          p->pop();<br>
> +        }<br>
> +        break;<br>
> +      case SEL_OP_I64XOR:<br>
> +        {<br>
> +          GenRegister xdst = GenRegister::retype(dst, GEN_TYPE_UL),<br>
> +                      xsrc0 = GenRegister::retype(src0, GEN_TYPE_UL),<br>
> +                      xsrc1 = GenRegister::retype(src1, GEN_TYPE_UL);<br>
> +          int execWidth = p->curr.execWidth;<br>
> +          p->push();<br>
> +          p->curr.execWidth = 8;<br>
> +          p->XOR(xdst.bottom_half(), xsrc0.bottom_half(), xsrc1.bottom_half());<br>
> +          p->XOR(xdst.top_half(), xsrc0.top_half(), xsrc1.top_half());<br>
> +          p->curr.nibControl = 1;<br>
> +          xdst = GenRegister::suboffset(xdst, 4),<br>
> +          xsrc0 = GenRegister::suboffset(xsrc0, 4),<br>
> +          xsrc1 = GenRegister::suboffset(xsrc1, 4);<br>
> +          p->XOR(xdst.bottom_half(), xsrc0.bottom_half(), xsrc1.bottom_half());<br>
> +          p->XOR(xdst.top_half(), xsrc0.top_half(), xsrc1.top_half());<br>
> +          if (execWidth == 16) {<br>
> +            p->curr.quarterControl = 1;<br>
> +            p->curr.nibControl = 0;<br>
> +            xdst = GenRegister::suboffset(xdst, 4),<br>
> +            xsrc0 = GenRegister::suboffset(xsrc0, 4),<br>
> +            xsrc1 = GenRegister::suboffset(xsrc1, 4);<br>
> +            p->XOR(xdst.bottom_half(), xsrc0.bottom_half(), xsrc1.bottom_half());<br>
> +            p->XOR(xdst.top_half(), xsrc0.top_half(), xsrc1.top_half());<br>
> +            p->curr.nibControl = 1;<br>
> +            xdst = GenRegister::suboffset(xdst, 4),<br>
> +            xsrc0 = GenRegister::suboffset(xsrc0, 4),<br>
> +            xsrc1 = GenRegister::suboffset(xsrc1, 4);<br>
> +            p->XOR(xdst.bottom_half(), xsrc0.bottom_half(), xsrc1.bottom_half());<br>
> +            p->XOR(xdst.top_half(), xsrc0.top_half(), xsrc1.top_half());<br>
> +          }<br>
> +          p->pop();<br>
> +        }<br>
> +        break;<br>
>        case SEL_OP_SHR:  p->SHR(dst, src0, src1); break;<br>
>        case SEL_OP_SHL:  p->SHL(dst, src0, src1); break;<br>
>        case SEL_OP_RSR:  p->RSR(dst, src0, src1); break;<br>
> diff --git a/backend/src/backend/gen_insn_selection.cpp b/backend/src/backend/gen_insn_selection.cpp<br>
> index 66cfa31..7e9402d 100644<br>
> --- a/backend/src/backend/gen_insn_selection.cpp<br>
> +++ b/backend/src/backend/gen_insn_selection.cpp<br>
> @@ -423,6 +423,9 @@ namespace gbe<br>
>      ALU2(AND)<br>
>      ALU2(OR)<br>
>      ALU2(XOR)<br>
> +    ALU2(I64AND)<br>
> +    ALU2(I64OR)<br>
> +    ALU2(I64XOR)<br>
>      ALU2(SHR)<br>
>      ALU2(SHL)<br>
>      ALU2(RSR)<br>
> @@ -1434,9 +1437,24 @@ namespace gbe<br>
>              sel.ADD(dst, src0, src1);<br>
>            sel.pop();<br>
>            break;<br>
> -        case OP_XOR: sel.XOR(dst, src0, src1); break;<br>
> -        case OP_OR:  sel.OR(dst, src0,  src1); break;<br>
> -        case OP_AND: sel.AND(dst, src0, src1); break;<br>
> +        case OP_XOR:<br>
> +          if (type == Type::TYPE_U64 || type == Type::TYPE_S64)<br>
> +            sel.I64XOR(dst, src0, src1);<br>
> +          else<br>
> +            sel.XOR(dst, src0, src1);<br>
> +          break;<br>
> +        case OP_OR:<br>
> +          if (type == Type::TYPE_U64 || type == Type::TYPE_S64)<br>
> +            sel.I64OR(dst, src0, src1);<br>
> +          else<br>
> +            sel.OR(dst, src0, src1);<br>
> +          break;<br>
> +        case OP_AND:<br>
> +          if (type == Type::TYPE_U64 || type == Type::TYPE_S64)<br>
> +            sel.I64AND(dst, src0, src1);<br>
> +          else<br>
> +            sel.AND(dst, src0, src1);<br>
> +          break;<br>
>          case OP_SUB:<br>
>            if (type == Type::TYPE_U64 || type == Type::TYPE_S64) {<br>
>              GenRegister t = sel.selReg(sel.reg(RegisterFamily::FAMILY_QWORD), Type::TYPE_S64);<br>
> diff --git a/backend/src/backend/gen_insn_selection.hxx b/backend/src/backend/gen_insn_selection.hxx<br>
> index 8eeb19f..7664c8f 100644<br>
> --- a/backend/src/backend/gen_insn_selection.hxx<br>
> +++ b/backend/src/backend/gen_insn_selection.hxx<br>
> @@ -14,6 +14,9 @@ DECL_SELECTION_IR(SEL, BinaryInstruction)<br>
>  DECL_SELECTION_IR(AND, BinaryInstruction)<br>
>  DECL_SELECTION_IR(OR, BinaryInstruction)<br>
>  DECL_SELECTION_IR(XOR, BinaryInstruction)<br>
> +DECL_SELECTION_IR(I64AND, BinaryInstruction)<br>
> +DECL_SELECTION_IR(I64OR, BinaryInstruction)<br>
> +DECL_SELECTION_IR(I64XOR, BinaryInstruction)<br>
>  DECL_SELECTION_IR(SHR, BinaryInstruction)<br>
>  DECL_SELECTION_IR(SHL, BinaryInstruction)<br>
>  DECL_SELECTION_IR(RSR, BinaryInstruction)<br>
> diff --git a/backend/src/ir/instruction.cpp b/backend/src/ir/instruction.cpp<br>
> index 2589848..f58757b 100644<br>
> --- a/backend/src/ir/instruction.cpp<br>
> +++ b/backend/src/ir/instruction.cpp<br>
> @@ -672,6 +672,7 @@ namespace ir {<br>
>      static const Type logicalType[] = {TYPE_S8,  TYPE_U8,<br>
>                                         TYPE_S16, TYPE_U16,<br>
>                                         TYPE_S32, TYPE_U32,<br>
> +                                       TYPE_S64, TYPE_U64,<br>
>                                         TYPE_BOOL};<br>
>      static const uint32_t logicalTypeNum = ARRAY_ELEM_NUM(logicalType);<br>
><br>
> --<br>
> 1.8.1.2<br>
><br>
> _______________________________________________<br>
> Beignet mailing list<br>
> <a href="mailto:Beignet@lists.freedesktop.org">Beignet@lists.freedesktop.org</a><br>
> <a href="http://lists.freedesktop.org/mailman/listinfo/beignet" target="_blank">
http://lists.freedesktop.org/mailman/listinfo/beignet</a><o:p></o:p></span></p>
</div>
</div>
</blockquote>
</div>
<p class="MsoNormal"><span lang="EN-US"><o:p> </o:p></span></p>
</div>
</div>
</body>
</html>