<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=us-ascii">
<meta name="Generator" content="Microsoft Word 14 (filtered medium)">
<style><!--
/* Font Definitions */
@font-face
{font-family:SimSun;
panose-1:2 1 6 0 3 1 1 1 1 1;}
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
{font-family:Tahoma;
panose-1:2 11 6 4 3 5 4 4 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0cm;
margin-bottom:.0001pt;
font-size:12.0pt;
font-family:"Times New Roman","serif";}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:blue;
text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
{mso-style-priority:99;
color:purple;
text-decoration:underline;}
p.MsoAcetate, li.MsoAcetate, div.MsoAcetate
{mso-style-priority:99;
mso-style-link:"Balloon Text Char";
margin:0cm;
margin-bottom:.0001pt;
font-size:8.0pt;
font-family:"Times New Roman","serif";}
span.hoenzb
{mso-style-name:hoenzb;}
span.BalloonTextChar
{mso-style-name:"Balloon Text Char";
mso-style-priority:99;
mso-style-link:"Balloon Text";
font-family:"Times New Roman","serif";}
span.EmailStyle20
{mso-style-type:personal-reply;
font-family:"Calibri","sans-serif";
color:#1F497D;}
.MsoChpDefault
{mso-style-type:export-only;
font-family:"Calibri","sans-serif";}
@page WordSection1
{size:612.0pt 792.0pt;
margin:72.0pt 90.0pt 72.0pt 90.0pt;}
div.WordSection1
{page:WordSection1;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]-->
</head>
<body lang="ZH-CN" link="blue" vlink="purple">
<div class="WordSection1">
<p class="MsoNormal"><span lang="EN-US" style="font-size:10.5pt;font-family:"Calibri","sans-serif";color:#1F497D">Yes, reproduced. Previously I had probably only tried 70*32.<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:10.5pt;font-family:"Calibri","sans-serif";color:#1F497D">compiler_box_blur seems to fail randomly on my side.<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:10.5pt;font-family:"Calibri","sans-serif";color:#1F497D">compiler_box_blur_float seems to fail every time.<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:10.5pt;font-family:"Calibri","sans-serif";color:#1F497D">I will check it now.<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:10.5pt;font-family:"Calibri","sans-serif";color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><b><span lang="EN-US" style="font-size:10.0pt;font-family:"Tahoma","sans-serif"">From:</span></b><span lang="EN-US" style="font-size:10.0pt;font-family:"Tahoma","sans-serif""> beignet-bounces+ruiling.song=intel.com@lists.freedesktop.org
[mailto:beignet-bounces+ruiling.song=intel.com@lists.freedesktop.org] <b>On Behalf Of
</b>zhigang gong<br>
<b>Sent:</b> Thursday, August 08, 2013 7:46 AM<br>
<b>To:</b> Song, Ruiling<br>
<b>Cc:</b> beignet@lists.freedesktop.org<br>
<b>Subject:</b> Re: [Beignet] [PATCH V2 2/2] Implement spill/unspill<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US"><o:p> </o:p></span></p>
<div>
<div>
<p class="MsoNormal"><span lang="EN-US">Ruiling,<o:p></o:p></span></p>
</div>
<div>
<p class="MsoNormal"><span lang="EN-US"><o:p> </o:p></span></p>
</div>
<p class="MsoNormal"><span lang="EN-US">I just ran the utests with a register file size of 4KB - 80*32, and got the following failures:<o:p></o:p></span></p>
<div>
<div>
<p class="MsoNormal"><span lang="EN-US">compiler_box_blur_float:<o:p></o:p></span></p>
</div>
<div>
<p class="MsoNormal"><span lang="EN-US"> compiler_box_blur_float() [FAILED]<o:p></o:p></span></p>
</div>
<div>
<p class="MsoNormal"><span lang="EN-US"> Error: image mismatch<o:p></o:p></span></p>
</div>
<div>
<p class="MsoNormal"><span lang="EN-US"> at file /home/gongzg/git/fdo/beignet/utests/compiler_box_blur_float.cpp, function compiler_box_blur_float, line 60<o:p></o:p></span></p>
</div>
<div>
<p class="MsoNormal"><span lang="EN-US"><o:p> </o:p></span></p>
</div>
<div>
<p class="MsoNormal"><span lang="EN-US">compiler_box_blur:<o:p></o:p></span></p>
</div>
<div>
<p class="MsoNormal"><span lang="EN-US"> compiler_box_blur() [FAILED]<o:p></o:p></span></p>
</div>
<div>
<p class="MsoNormal"><span lang="EN-US"> Error: image mismatch<o:p></o:p></span></p>
</div>
<div>
<p class="MsoNormal"><span lang="EN-US"> at file /home/gongzg/git/fdo/beignet/utests/compiler_box_blur.cpp, function compiler_box_blur, line 39<o:p></o:p></span></p>
</div>
</div>
<div>
<p class="MsoNormal"><span lang="EN-US"><o:p> </o:p></span></p>
</div>
<div>
<p class="MsoNormal"><span lang="EN-US">And as the 64-bit data type is not supported, I can't finish all the unit test cases. There may be some other failures that were not triggered.<o:p></o:p></span></p>
</div>
<div>
<p class="MsoNormal"><span lang="EN-US">Could you reproduce the same result in your environment?<o:p></o:p></span></p>
</div>
<div>
<p class="MsoNormal"><span lang="EN-US"><o:p> </o:p></span></p>
</div>
<div>
<p class="MsoNormal"><span lang="EN-US"><o:p> </o:p></span></p>
</div>
</div>
<div>
<p class="MsoNormal" style="margin-bottom:12.0pt"><span lang="EN-US"><o:p> </o:p></span></p>
<div>
<p class="MsoNormal"><span lang="EN-US">On Wed, Aug 7, 2013 at 3:53 PM, Song, Ruiling <<a href="mailto:ruiling.song@intel.com" target="_blank">ruiling.song@intel.com</a>> wrote:<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">Yes, normally we need a complex kernel to trigger spill/unspill, and I have no idea how to write a test case that covers this.<br>
But it is easy to test the spill logic: change the line below in backend/context.cpp<br>
static const int16_t RegisterFileSize = 4*KB;<br>
to something like:<br>
static const int16_t RegisterFileSize = 4*KB - 80*32;<br>
then run all the unit tests. I have not tried to spill everything, because there are curbe entries, and as you know there are still some limitations.<br>
I will try some more aggressive values to spill more registers.<br>
<br>
Thanks!<br>
<span class="hoenzb"><span style="color:#888888">Ruiling</span></span><o:p></o:p></span></p>
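<p class="MsoNormal"><span lang="EN-US">For reference, a minimal sketch of the arithmetic behind the test setup described above. It assumes a 32-byte GRF register (suggested by the 80*32 term, but not stated in the thread); the names KB, kGrfSizeBytes, shrunkenFileSize and usableRegs are illustrative and are not Beignet identifiers.</span></p>

```cpp
#include <cassert>
#include <cstdint>

// Assumed constants: neither value is spelled out in the thread.
constexpr int32_t KB = 1024;
constexpr int32_t kGrfSizeBytes = 32;  // assumption: one GRF register is 32 bytes

// Size in bytes of the register file after hiding `hiddenRegs` registers
// from the allocator, mirroring "4*KB - 80*32" in the message above.
constexpr int32_t shrunkenFileSize(int32_t hiddenRegs) {
  return 4 * KB - hiddenRegs * kGrfSizeBytes;
}

// Number of whole registers the allocator can still hand out.
constexpr int32_t usableRegs(int32_t hiddenRegs) {
  return shrunkenFileSize(hiddenRegs) / kGrfSizeBytes;
}
```

<p class="MsoNormal"><span lang="EN-US">Under these assumptions, hiding 80 registers leaves 1536 bytes, i.e. 48 registers, so a larger multiplier forces more spilling.</span></p>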
<div>
<div>
<p class="MsoNormal"><span lang="EN-US">-----Original Message-----<br>
From: beignet-bounces+ruiling.song=intel.com@lists.freedesktop.org [mailto:beignet-bounces+ruiling.song=intel.com@lists.freedesktop.org]
On Behalf Of Zhigang Gong<br>
Sent: Wednesday, August 07, 2013 3:32 PM<br>
To: Song, Ruiling<br>
Cc: <a href="mailto:beignet@lists.freedesktop.org">beignet@lists.freedesktop.org</a><br>
Subject: Re: [Beignet] [PATCH V2 2/2] Implement spill/unspill<br>
<br>
This version LGTM, will push it later. Really nice work bringing spill/unspill to Beignet, thanks.<br>
<br>
Could you add some unit tests here to test both the spill/unspill and the scratch read/write?<br>
<br>
You know we do need solid test cases for spill/unspill. If there are bugs hiding here, it will be extremely hard to find them when we meet them in a real, complicated kernel.<br>
<br>
On Wed, Aug 07, 2013 at 03:15:50PM +0800, Ruiling Song wrote:<br>
> The current implementation works as follows:<br>
> I reserve a pool of registers for spill/reload. Currently 6 registers<br>
> are reserved: 5 handle a SelectionVector with at most 5 elements, and<br>
> the remaining one is used as the scratch message header register. The<br>
> register after the header register is used as the payload for scratch writes.<br>
><br>
> To spill, just iterate over the instructions. If the virtual register<br>
> is used as a src, insert a reload instruction before it. If the virtual<br>
> register is used as a dst, insert a spill instruction after it to write the<br>
> register content to scratch memory.<br>
><br>
> Current limitations:<br>
> 64-bit types are not supported.<br>
> SelectionVector with more than 5 elements is not handled.<br>
><br>
> Signed-off-by: Ruiling Song <<a href="mailto:ruiling.song@intel.com">ruiling.song@intel.com</a>><br>
> ---<br>
> backend/src/backend/gen_context.cpp | 34 ++++++++++++<br>
> backend/src/backend/gen_context.hpp | 2 +<br>
> .../src/backend/gen_insn_gen7_schedule_info.hxx | 2 +<br>
> backend/src/backend/gen_insn_scheduling.cpp | 39 +++++++++-----<br>
> backend/src/backend/gen_insn_selection.cpp | 57 +++++++++++++++++++-<br>
> backend/src/backend/gen_insn_selection.hpp | 6 +++<br>
> backend/src/backend/gen_insn_selection.hxx | 2 +<br>
> backend/src/backend/gen_reg_allocation.cpp | 56 +++++++++++++++++--<br>
> 8 files changed, 180 insertions(+), 18 deletions(-)<br>
><br>
> diff --git a/backend/src/backend/gen_context.cpp<br>
> b/backend/src/backend/gen_context.cpp<br>
> index 29fa1c3..ce83923 100644<br>
> --- a/backend/src/backend/gen_context.cpp<br>
> +++ b/backend/src/backend/gen_context.cpp<br>
> @@ -542,6 +542,40 @@ namespace gbe<br>
> p->pop();<br>
> }<br>
><br>
> + void GenContext::emitSpillRegInstruction(const SelectionInstruction &insn) {<br>
> + uint32_t simdWidth = p->curr.execWidth;<br>
> + uint32_t scratchOffset = insn.extra.scratchOffset;<br>
> + const uint32_t header = insn.extra.scratchMsgHeader;<br>
> + p->push();<br>
> +<br>
> + const GenRegister msg = GenRegister::ud8grf(header, 0);<br>
> + const GenRegister src = ra->genReg(insn.src(0));<br>
> + GenRegister payload = src;<br>
> + payload.nr = header + 1;<br>
> + payload.subnr = 0;<br>
> +<br>
> + p->MOV(payload, src);<br>
> + uint32_t regType = insn.src(0).type;<br>
> + uint32_t size = typeSize(regType);<br>
> + assert(size <= 4);<br>
> + uint32_t regNum = (stride(src.hstride)*size*simdWidth) > 32 ? 2 : 1;<br>
> + this->scratchWrite(msg, scratchOffset, regNum, regType, GEN_SCRATCH_CHANNEL_MODE_DWORD);<br>
> + p->pop();<br>
> + }<br>
> +<br>
> + void GenContext::emitUnSpillRegInstruction(const SelectionInstruction &insn) {<br>
> + uint32_t scratchOffset = insn.extra.scratchOffset;<br>
> + const GenRegister dst = insn.dst(0);<br>
> + uint32_t regType = dst.type;<br>
> + uint32_t simdWidth = p->curr.execWidth;<br>
> + const uint32_t header = insn.extra.scratchMsgHeader;<br>
> + uint32_t size = typeSize(regType);<br>
> + assert(size <= 4);<br>
> + uint32_t regNum = (stride(dst.hstride)*size*simdWidth) > 32 ? 2 : 1;<br>
> + const GenRegister msg = GenRegister::ud8grf(header, 0);<br>
> + this->scratchRead(GenRegister::retype(dst, GEN_TYPE_UD), msg, scratchOffset, regNum, regType, GEN_SCRATCH_CHANNEL_MODE_DWORD);<br>
> + }<br>
> +<o:p></o:p></span></p>
</div>
</div>
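<p class="MsoNormal"><span lang="EN-US">The register count passed to scratchWrite/scratchRead in the hunk above follows from the value's footprint in bytes. A minimal sketch of that calculation, assuming a 32-byte GRF register; scratchRegCount and kGrfSizeBytes are hypothetical names used only for illustration.</span></p>

```cpp
#include <cassert>
#include <cstdint>

constexpr uint32_t kGrfSizeBytes = 32;  // assumption: one GRF register is 32 bytes

// Number of GRF registers one scratch message has to move for a spilled value.
// elemStride: horizontal stride in elements, elemSize: element size in bytes,
// simdWidth: number of lanes. Mirrors "(stride*size*simdWidth) > 32 ? 2 : 1".
constexpr uint32_t scratchRegCount(uint32_t elemStride, uint32_t elemSize,
                                   uint32_t simdWidth) {
  return (elemStride * elemSize * simdWidth) > kGrfSizeBytes ? 2u : 1u;
}
```

<p class="MsoNormal"><span lang="EN-US">With SIMD8 and a 4-byte type at stride 1 the value occupies exactly 32 bytes, so one register is enough; doubling either the stride or the SIMD width pushes it to two.</span></p>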
<div>
<div>
<p class="MsoNormal"><span lang="EN-US">> // For SIMD8, we allocate 2*elemNum temporary registers from dst(0), and<br>
> // then follow the real destination registers.<br>
> // For SIMD16, we allocate elemNum temporary registers from dst(0).<br>
> diff --git a/backend/src/backend/gen_context.hpp<br>
> b/backend/src/backend/gen_context.hpp<br>
> index bcf0dc4..694ae98 100644<br>
> --- a/backend/src/backend/gen_context.hpp<br>
> +++ b/backend/src/backend/gen_context.hpp<br>
> @@ -108,6 +108,8 @@ namespace gbe<br>
> void emitByteScatterInstruction(const SelectionInstruction &insn);<br>
> void emitSampleInstruction(const SelectionInstruction &insn);<br>
> void emitTypedWriteInstruction(const SelectionInstruction &insn);<br>
> + void emitSpillRegInstruction(const SelectionInstruction &insn);<br>
> + void emitUnSpillRegInstruction(const SelectionInstruction &insn);<br>
> void emitGetImageInfoInstruction(const SelectionInstruction &insn);<br>
> void scratchWrite(const GenRegister header, uint32_t offset, uint32_t reg_num, uint32_t reg_type, uint32_t channel_mode);<br>
> void scratchRead(const GenRegister dst, const GenRegister header, uint32_t offset, uint32_t reg_num, uint32_t reg_type, uint32_t channel_mode);<br>
> diff --git a/backend/src/backend/gen_insn_gen7_schedule_info.hxx<br>
> b/backend/src/backend/gen_insn_gen7_schedule_info.hxx<br>
> index 6f37c3d..da8f2a2 100644<br>
> --- a/backend/src/backend/gen_insn_gen7_schedule_info.hxx<br>
> +++ b/backend/src/backend/gen_insn_gen7_schedule_info.hxx<br>
> @@ -20,5 +20,7 @@ DECL_GEN7_SCHEDULE(ByteGather, 80, 1, 1)<br>
> DECL_GEN7_SCHEDULE(ByteScatter, 80, 1, 1)<br>
> DECL_GEN7_SCHEDULE(Sample, 80, 1, 1)<br>
> DECL_GEN7_SCHEDULE(TypedWrite, 80, 1, 1)<br>
> +DECL_GEN7_SCHEDULE(SpillReg, 80, 1, 1)<br>
> +DECL_GEN7_SCHEDULE(UnSpillReg, 80, 1, 1)<br>
> DECL_GEN7_SCHEDULE(GetImageInfo, 20, 4, 2)<br>
> DECL_GEN7_SCHEDULE(Atomic, 80, 1, 1)<br>
> diff --git a/backend/src/backend/gen_insn_scheduling.cpp<br>
> b/backend/src/backend/gen_insn_scheduling.cpp<br>
> index cb990be..0b720b7 100644<br>
> --- a/backend/src/backend/gen_insn_scheduling.cpp<br>
> +++ b/backend/src/backend/gen_insn_scheduling.cpp<br>
> @@ -283,19 +283,24 @@ namespace gbe<br>
> uint32_t DependencyTracker::getIndex(GenRegister reg) const {<br>
> // Non GRF physical register<br>
> if (reg.physical) {<br>
> - GBE_ASSERT (reg.file == GEN_ARCHITECTURE_REGISTER_FILE);<br>
> - const uint32_t file = reg.nr & 0xf0;<br>
> - const uint32_t nr = reg.nr & 0x0f;<br>
> - if (file == GEN_ARF_FLAG) {<br>
> - const uint32_t subnr = reg.subnr / sizeof(uint16_t);<br>
> - GBE_ASSERT(nr < MAX_FLAG_REGISTER && (subnr == 0 || subnr == 1));<br>
> - return grfNum + 2*nr + subnr;<br>
> - } else if (file == GEN_ARF_ACCUMULATOR) {<br>
> - GBE_ASSERT(nr < MAX_ACC_REGISTER);<br>
> - return grfNum + MAX_FLAG_REGISTER + nr;<br>
> + //GBE_ASSERT (reg.file == GEN_ARCHITECTURE_REGISTER_FILE);<br>
> + if(reg.file == GEN_ARCHITECTURE_REGISTER_FILE) {<br>
> + const uint32_t file = reg.nr & 0xf0;<br>
> + const uint32_t nr = reg.nr & 0x0f;<br>
> + if (file == GEN_ARF_FLAG) {<br>
> + const uint32_t subnr = reg.subnr / sizeof(uint16_t);<br>
> + GBE_ASSERT(nr < MAX_FLAG_REGISTER && (subnr == 0 || subnr == 1));<br>
> + return grfNum + 2*nr + subnr;<br>
> + } else if (file == GEN_ARF_ACCUMULATOR) {<br>
> + GBE_ASSERT(nr < MAX_ACC_REGISTER);<br>
> + return grfNum + MAX_FLAG_REGISTER + nr;<br>
> + } else {<br>
> + NOT_SUPPORTED;<br>
> + return 0;<br>
> + }<br>
> } else {<br>
> - NOT_SUPPORTED;<br>
> - return 0;<br>
> + const uint32_t simdWidth = scheduler.ctx.getSimdWidth();<br>
> + return simdWidth == 8 ? reg.nr : reg.nr / 2;<br>
> }<br>
> }<br>
> // We directly manipulate physical GRFs here<br>
> @@ -344,6 +349,10 @@ namespace gbe<br>
> this->nodes[index] = node;<br>
> }<br>
><br>
> + if(insn.opcode == SEL_OP_SPILL_REG) {<br>
> + const uint32_t index = this->getIndex(0xff);<br>
> + this->nodes[index] = node;<br>
> + }<br>
> // Consider barriers and wait write to memory<br>
> if (insn.opcode == SEL_OP_BARRIER ||<br>
> insn.opcode == SEL_OP_FENCE ||<br>
> @@ -424,6 +433,11 @@ namespace gbe<br>
> const uint32_t index = tracker.getIndex(insn.extra.function);<br>
> tracker.addDependency(node, index);<br>
> }<br>
> + //read-after-write of scratch memory<br>
> + if (insn.opcode == SEL_OP_UNSPILL_REG) {<br>
> + const uint32_t index = tracker.getIndex(0xff);<br>
> + tracker.addDependency(node, index);<br>
> + }<br>
><br>
> // Consider barriers and wait are reading memory (local and global)<br>
> if (insn.opcode == SEL_OP_BARRIER ||<br>
> @@ -453,6 +467,7 @@ namespace gbe<br>
> tracker.addDependency(node, index);<br>
> }<br>
><br>
> +<br>
> // Consider barriers and wait are writing memory (local and global)<br>
> if (insn.opcode == SEL_OP_BARRIER ||<br>
> insn.opcode == SEL_OP_FENCE ||<br>
> diff --git a/backend/src/backend/gen_insn_selection.cpp<br>
> b/backend/src/backend/gen_insn_selection.cpp<br>
> index 7e9402d..3610051 100644<br>
> --- a/backend/src/backend/gen_insn_selection.cpp<br>
> +++ b/backend/src/backend/gen_insn_selection.cpp<br>
> @@ -315,6 +315,8 @@ namespace gbe<br>
> INLINE ir::Register replaceSrc(SelectionInstruction *insn, uint32_t regID);<br>
> /*! Implement public class */<br>
> INLINE ir::Register replaceDst(SelectionInstruction *insn, uint32_t regID);<br>
> + /*! spill a register (insert spill/unspill instructions) */<br>
> + INLINE void spillReg(ir::Register reg, uint32_t registerPool);<br>
> /*! Implement public class */<br>
> INLINE uint32_t getRegNum(void) const { return file.regNum(); }<br>
> /*! Implements public interface */<br>
> @@ -617,6 +619,57 @@ namespace gbe<br>
> return vector;<br>
> }<br>
><br>
> + void Selection::Opaque::spillReg(ir::Register spilledReg, uint32_t registerPool) {<br>
> + assert(registerPool != 0);<br>
> + const uint32_t simdWidth = ctx.getSimdWidth();<br>
> + const uint32_t dstStart = registerPool + 1;<br>
> + const uint32_t srcStart = registerPool + 1;<br>
> + uint32_t ptr =<br>
> + ctx.allocateScratchMem(typeSize(GEN_TYPE_D)*simdWidth);<br>
> +<br>
> + for (auto &block : blockList)<br>
> + for (auto &insn : block.insnList) {<br>
> + const uint32_t srcNum = insn.srcNum, dstNum = insn.dstNum;<br>
> +<br>
> + for (uint32_t srcID = 0; srcID < srcNum; ++srcID) {<br>
> + const GenRegister selReg = insn.src(srcID);<br>
> + const ir::Register reg = selReg.reg();<br>
> + if(selReg.file == GEN_GENERAL_REGISTER_FILE && reg == spilledReg) {<br>
> + GBE_ASSERT(srcID < 5);<br>
> + SelectionInstruction *unspill = this->create(SEL_OP_UNSPILL_REG, 1, 0);<br>
> + unspill->state = GenInstructionState(simdWidth);<br>
> + unspill->dst(0) = GenRegister(GEN_GENERAL_REGISTER_FILE, srcStart+srcID, 0,<br>
> + selReg.type, selReg.vstride, selReg.width, selReg.hstride);<br>
> + GenRegister src = insn.src(srcID);<br>
> + // change nr/subnr, keep other register settings<br>
> + src.nr = srcStart+srcID; src.subnr=0; src.physical=1;<br>
> + insn.src(srcID) = src;<br>
> + unspill->extra.scratchOffset = ptr;<br>
> + unspill->extra.scratchMsgHeader = registerPool;<br>
> + insn.prepend(*unspill);<br>
> + }<br>
> + }<br>
> +<br>
> + for (uint32_t dstID = 0; dstID < dstNum; ++dstID) {<br>
> + const GenRegister selReg = insn.dst(dstID);<br>
> + const ir::Register reg = selReg.reg();<br>
> + if(selReg.file == GEN_GENERAL_REGISTER_FILE && reg == spilledReg) {<br>
> + GBE_ASSERT(dstID < 5);<br>
> + SelectionInstruction *spill = this->create(SEL_OP_SPILL_REG, 0, 1);<br>
> + spill->state = GenInstructionState(simdWidth);<br>
> + spill->src(0) =GenRegister(GEN_GENERAL_REGISTER_FILE, dstStart + dstID, 0,<br>
> + selReg.type, selReg.vstride, selReg.width, selReg.hstride);<br>
> + GenRegister dst = insn.dst(dstID);<br>
> + // change nr/subnr, keep other register settings<br>
> + dst.physical =1; dst.nr = dstStart+dstID; dst.subnr = 0;<br>
> + insn.dst(dstID)= dst;<br>
> + spill->extra.scratchOffset = ptr;<br>
> + spill->extra.scratchMsgHeader = registerPool;<br>
> + insn.append(*spill);<br>
> + }<br>
> + }<br>
> + }<br>
> + }<br>
> +<br>
> ir::Register Selection::Opaque::replaceSrc(SelectionInstruction *insn, uint32_t regID) {<br>
> SelectionBlock *block = insn->parent;<br>
> const uint32_t simdWidth = ctx.getSimdWidth();<br>
> @@ -820,7 +873,6 @@ namespace gbe<br>
> dstVector->regNum = elemNum;<br>
> dstVector->isSrc = 0;<br>
> dstVector->reg = &insn->dst(0);<br>
> -<br>
> // Source cannot be scalar (yet)<br>
> srcVector->regNum = 1;<br>
> srcVector->isSrc = 1;<br>
> @@ -1188,6 +1240,9 @@ namespace gbe<br>
> ir::Register Selection::replaceDst(SelectionInstruction *insn, uint32_t regID) {<br>
> return this->opaque->replaceDst(insn, regID);<br>
> }<br>
> + void Selection::spillReg(ir::Register reg, uint32_t registerPool) {<br>
> + this->opaque->spillReg(reg, registerPool);<br>
> + }<br>
><o:p></o:p></span></p>
</div>
</div>
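<p class="MsoNormal"><span lang="EN-US">The rewriting done by spillReg() in the hunk above can be illustrated on a toy instruction list. This is only a sketch under simplifying assumptions: ToyInsn and spillRegister are hypothetical and are not Beignet types, registers are plain integers, and a single temporary stands in for the per-operand registers taken from the reserved pool.</span></p>

```cpp
#include <cassert>
#include <iterator>
#include <list>
#include <string>
#include <vector>

// Hypothetical toy instruction; not Beignet's SelectionInstruction.
struct ToyInsn {
  std::string op;
  std::vector<int> dst;
  std::vector<int> src;
};

// Rewrite `insns` so that virtual register `victim` lives in scratch memory.
// Every instruction that reads it gets an UNSPILL inserted in front of it,
// and every instruction that writes it gets a SPILL inserted right after it.
// `tmp` is the temporary that replaces `victim` in the rewritten operands
// (assumption: it differs from every virtual register number in use).
inline void spillRegister(std::list<ToyInsn>& insns, int victim, int tmp) {
  for (auto it = insns.begin(); it != insns.end(); ++it) {
    bool used = false;
    bool defined = false;
    for (int& s : it->src)
      if (s == victim) { s = tmp; used = true; }
    for (int& d : it->dst)
      if (d == victim) { d = tmp; defined = true; }
    if (used)  // reload before the use; `it` stays valid for std::list
      insns.insert(it, ToyInsn{"UNSPILL", {tmp}, {}});
    if (defined)  // store after the definition and step over the new SPILL
      it = insns.insert(std::next(it), ToyInsn{"SPILL", {}, {tmp}});
  }
}
```

<p class="MsoNormal"><span lang="EN-US">Spilling register 1 in the sequence MOV 1 = 0; ADD 2 = 1, 1 yields MOV, SPILL, UNSPILL, ADD, with both ADD sources redirected to the temporary.</span></p>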
<div>
<div>
<p class="MsoNormal"><span lang="EN-US">> SelectionInstruction *Selection::create(SelectionOpcode opcode, uint32_t dstNum, uint32_t srcNum) {<br>
> return this->opaque->create(opcode, dstNum, srcNum);<br>
> diff --git a/backend/src/backend/gen_insn_selection.hpp<br>
> b/backend/src/backend/gen_insn_selection.hpp<br>
> index 5ae6e42..79b73e2 100644<br>
> --- a/backend/src/backend/gen_insn_selection.hpp<br>
> +++ b/backend/src/backend/gen_insn_selection.hpp<br>
> @@ -107,6 +107,10 @@ namespace gbe<br>
> /*! offset (0 to 7) */<br>
> uint16_t offset:5;<br>
> };<br>
> + struct {<br>
> + uint16_t scratchOffset;<br>
> + uint16_t scratchMsgHeader;<br>
> + };<br>
> } extra;<br>
> /*! Gen opcode */<br>
> uint8_t opcode;<br>
> @@ -197,6 +201,8 @@ namespace gbe<br>
> ir::Register replaceSrc(SelectionInstruction *insn, uint32_t regID);<br>
> /*! Replace a destination to the returned temporary register */<br>
> ir::Register replaceDst(SelectionInstruction *insn, uint32_t regID);<br>
> + /*! spill a register (insert spill/unspill instructions) */<br>
> + void spillReg(ir::Register reg, uint32_t registerPool);<br>
> /*! Create a new selection instruction */<br>
> SelectionInstruction *create(SelectionOpcode, uint32_t dstNum, uint32_t srcNum);<br>
> /*! List of emitted blocks */<br>
> diff --git a/backend/src/backend/gen_insn_selection.hxx<br>
> b/backend/src/backend/gen_insn_selection.hxx<br>
> index 7664c8f..495978f 100644<br>
> --- a/backend/src/backend/gen_insn_selection.hxx<br>
> +++ b/backend/src/backend/gen_insn_selection.hxx<br>
> @@ -48,6 +48,8 @@ DECL_SELECTION_IR(BYTE_SCATTER, ByteScatterInstruction)<br>
> DECL_SELECTION_IR(SAMPLE, SampleInstruction)<br>
> DECL_SELECTION_IR(TYPED_WRITE, TypedWriteInstruction)<br>
> DECL_SELECTION_IR(GET_IMAGE_INFO, GetImageInfoInstruction)<br>
> +DECL_SELECTION_IR(SPILL_REG, SpillRegInstruction)<br>
> +DECL_SELECTION_IR(UNSPILL_REG, UnSpillRegInstruction)<br>
> DECL_SELECTION_IR(MUL_HI, TernaryInstruction)<br>
> DECL_SELECTION_IR(FBH, UnaryInstruction)<br>
> DECL_SELECTION_IR(FBL, UnaryInstruction)<br>
> diff --git a/backend/src/backend/gen_reg_allocation.cpp<br>
> b/backend/src/backend/gen_reg_allocation.cpp<br>
> index 4ba03ea..ccbc0da 100644<br>
> --- a/backend/src/backend/gen_reg_allocation.cpp<br>
> +++ b/backend/src/backend/gen_reg_allocation.cpp<br>
> @@ -31,6 +31,8 @@<br>
> #include <algorithm><br>
> #include <climits><br>
><br>
> +#define RESERVED_REG_NUM_FOR_SPILL 6<br>
> +<br>
> namespace gbe<br>
> {<br>
><br>
> //////////////////////////////////////////////////////////////////////<br>
> ///////<br>
> @@ -94,6 +96,10 @@ namespace gbe<br>
> vector<GenRegInterval*> starting;<br>
> /*! Intervals sorting based on ending point positions */<br>
> vector<GenRegInterval*> ending;<br>
> + /*! registers that are spilled */<br>
> + set<ir::Register> spilled;<br>
> + /* reserved registers for register spill/reload */<br>
> + uint32_t reservedReg;<br>
> /*! Current vector to expire */<br>
> uint32_t expiringID;<br>
> /*! Use custom allocator */<br>
> @@ -259,6 +265,11 @@ namespace gbe<br>
> continue;<br>
> }<br>
><br>
> + //ignore register that already spilled<br>
> + if(spilled.contains(reg)) {<br>
> + this->expiringID++;<br>
> + continue;<br>
> + }<br>
> // Ignore booleans that were allocated with flags<br>
> // if (ctx.getRegisterFamily(reg) == ir::FAMILY_BOOL && !grfBooleans.contains(reg)) {<br>
> if (ctx.sel->getRegisterFamily(reg) == ir::FAMILY_BOOL) {<br>
> @@ -473,6 +484,9 @@ namespace gbe<br>
> auto it = vectorMap.find(reg);<br>
> if (it != vectorMap.end()) {<br>
> const SelectionVector *vector = it->second.first;<br>
> + // all the reg in the SelectionVector are spilled<br>
> + if(spilled.contains(vector->reg[0].reg()))<br>
> + continue;<br>
> const uint32_t simdWidth = ctx.getSimdWidth();<br>
><br>
> const ir::RegisterData regData = ctx.sel->getRegisterData(reg);<br>
> @@ -481,10 +495,28 @@ namespace gbe<br>
> const uint32_t alignment = simdWidth*typeSize;<br>
><br>
> const uint32_t size = vector->regNum * alignment;<br>
> +<br>
> uint32_t grfOffset;<br>
> while ((grfOffset = ctx.allocate(size, alignment)) == 0) {<br>
> const bool success = this->expireGRF(interval);<br>
> - if (success == false) return false;<br>
> + if (success == false) {<br>
> + // if no spill support, just return false, else simply spill the register<br>
> + if(reservedReg == 0) return false;<br>
> + break;<br>
> + }<br>
> + }<br>
> + if(grfOffset == 0) {<br>
> + // spill all the registers in the SelectionVector<br>
> + // the tricky here is I need to use reservedReg+1 as scratch write payload.<br>
> + // so, i need to write the first register to scratch memory first.<br>
> + // the spillReg() will just append scratch write insn after the def. To spill<br>
> + // the first register, need to call spillReg() last for the vector->reg[0]<br>
> + GBE_ASSERT(vector->regNum < RESERVED_REG_NUM_FOR_SPILL);<br>
> + for(int i = vector->regNum-1; i >= 0; i--) {<br>
> + spilled.insert(vector->reg[i].reg());<br>
> + selection.spillReg(vector->reg[i].reg(), reservedReg);<br>
> + }<br>
> + continue;<br>
> }<br>
> for (uint32_t regID = 0; regID < vector->regNum; ++regID, grfOffset += alignment) {<br>
> const ir::Register reg = vector->reg[regID].reg();<br>
> @@ -494,18 +526,25 @@ namespace gbe<br>
> }<br>
> }<br>
> // Case 2: This is a regular scalar register, allocate it alone<br>
> - else if (this->createGenReg(interval) == false)<br>
> - return false;<br>
> + else if (this->createGenReg(interval) == false) {<br>
> + if(reservedReg == 0) return false;<br>
> + spilled.insert(reg);<br>
> + selection.spillReg(reg, reservedReg);<br>
> + }<br>
> }<br>
> return true;<br>
> }<br>
> -<br>
> INLINE bool GenRegAllocator::Opaque::allocate(Selection &selection) {<br>
> using namespace ir;<br>
> const Kernel *kernel = ctx.getKernel();<br>
> const Function &fn = ctx.getFunction();<br>
> GBE_ASSERT(fn.getProfile() == PROFILE_OCL);<br>
> -<br>
> + if (ctx.getSimdWidth() == 8) {<br>
> + reservedReg = ctx.allocate(RESERVED_REG_NUM_FOR_SPILL * GEN_REG_SIZE, GEN_REG_SIZE);<br>
> + reservedReg /= GEN_REG_SIZE;<br>
> + } else {<br>
> + reservedReg = 0;<br>
> + }<br>
> // Allocate all the vectors first since they need to be contiguous<br>
> this->allocateVector(selection);<br>
> // schedulePreRegAllocation(ctx, selection);<br>
> @@ -690,6 +729,10 @@ namespace gbe<br>
> int subreg = offst % 8;<br>
> std::cout << "%" << vReg << " g" << reg << "." << subreg << "D" << std::endl;<br>
> }<br>
> + std::set<ir::Register>::iterator is;<br>
> + std::cout << "## spilled registers:" << std::endl;<br>
> + for(is = spilled.begin(); is != spilled.end(); is++)<br>
> + std::cout << (int)*is << std::endl;<br>
> std::cout << std::endl;<br>
> }<br>
><br>
> @@ -704,6 +747,9 @@ namespace gbe<br>
><br>
> INLINE GenRegister GenRegAllocator::Opaque::genReg(const GenRegister &amp;reg) {<br>
> if (reg.file == GEN_GENERAL_REGISTER_FILE) {<br>
> + if(reg.physical == 1) {<br>
> + return reg;<br>
> + }<br>
> GBE_ASSERT(RA.contains(reg.reg()) != false);<br>
> const uint32_t grfOffset = RA.find(reg.reg())->second;<br>
> const GenRegister dst = setGenReg(reg, grfOffset);<br>
> --<br>
> 1.7.9.5<br>
><br>
> _______________________________________________<br>
> Beignet mailing list<br>
> <a href="mailto:Beignet@lists.freedesktop.org">Beignet@lists.freedesktop.org</a><br>
> <a href="http://lists.freedesktop.org/mailman/listinfo/beignet" target="_blank">
http://lists.freedesktop.org/mailman/listinfo/beignet</a><br>
<o:p></o:p></span></p>
</div>
</div>
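<p class="MsoNormal"><span lang="EN-US">The comment in allocateVector() about spilling vector-&gt;reg[0] last is easier to follow with the pool layout written out. A minimal sketch, assuming that each spill instruction is inserted directly after the defining instruction (as the patch comment describes); SpillPool and spillOrderAfterDef are hypothetical helpers and do not exist in Beignet.</span></p>

```cpp
#include <cassert>
#include <cstdint>
#include <list>
#include <vector>

constexpr uint32_t kReservedRegs = 6;  // RESERVED_REG_NUM_FOR_SPILL in the patch

// Hypothetical view of how the reserved pool is carved up.
struct SpillPool {
  uint32_t base;  // first reserved GRF number (reservedReg in the patch)
  uint32_t header() const { return base; }  // scratch message header
  uint32_t temp(uint32_t i) const {         // temporary for vector element i
    assert(i + 1 < kReservedRegs);
    return base + 1 + i;
  }
  uint32_t payload() const { return temp(0); }  // payload shares element 0's slot
};

// Simulates the order of spill instructions after the defining instruction
// when elements are processed from last to first and each new spill is
// placed directly after the definition, ahead of the earlier ones.
inline std::vector<uint32_t> spillOrderAfterDef(uint32_t regNum) {
  assert(regNum < kReservedRegs);
  std::list<uint32_t> afterDef;
  for (uint32_t i = regNum; i-- > 0;) afterDef.push_front(i);
  return std::vector<uint32_t>(afterDef.begin(), afterDef.end());
}
```

<p class="MsoNormal"><span lang="EN-US">Because the payload register is also the slot of element 0, element 0 has to reach scratch memory before any later spill overwrites the payload; processing the vector in reverse gives the order 0, 1, 2, ... after the definition.</span></p>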
</div>
<p class="MsoNormal"><span lang="EN-US"><o:p> </o:p></span></p>
</div>
</div>
</body>
</html>