<div dir="ltr"><div style>Ruiling,</div><div style><br></div>I just ran the utests with the register file size set to 4KB - 80*32 and got the following failures:<div><div>compiler_box_blur_float:</div><div> compiler_box_blur_float() [FAILED]</div>
<div> Error: image mismatch</div>
<div> at file /home/gongzg/git/fdo/beignet/utests/compiler_box_blur_float.cpp, function compiler_box_blur_float, line 60</div><div><br></div><div>compiler_box_blur:</div><div> compiler_box_blur() [FAILED]</div><div>
Error: image mismatch</div>
<div> at file /home/gongzg/git/fdo/beignet/utests/compiler_box_blur.cpp, function compiler_box_blur, line 39</div></div><div><br></div><div>Also, since 64-bit data types are not supported, I can't finish all the unit test cases. There may be other failures that were not triggered.</div>
<div style>Could you reproduce the same result in your environment?</div><div style><br></div>
<div><br></div></div><div class="gmail_extra"><br><br><div class="gmail_quote">On Wed, Aug 7, 2013 at 3:53 PM, Song, Ruiling <span dir="ltr"><<a href="mailto:ruiling.song@intel.com" target="_blank">ruiling.song@intel.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Yes, normally we need a complex kernel to trigger spill/unspill. I don't have a good idea for a test case that covers this.<br>
But it is easy to test the spill logic: change the line below in backend/context.cpp<br>
static const int16_t RegisterFileSize = 4*KB;<br>
to something like:<br>
static const int16_t RegisterFileSize = 4*KB - 80*32;<br>
then run all the unit tests. I have not tried to spill everything, because there are curbe entries, and as you know there are still some limitations.<br>
I will try a more aggressive value to spill more registers.<br>
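For reference, here is a rough sketch of what the suggested value means in terms of registers. It assumes KB is 1024 and that one GRF register is 32 bytes; neither value is taken from context.cpp, and kGrfBytes and registerCount are made-up names used only for illustration.<br>

```cpp
#include <cassert>

// Assumptions for this sketch (not taken from context.cpp):
constexpr int KB = 1024;        // bytes per KB
constexpr int kGrfBytes = 32;   // hypothetical name: bytes per GRF register

// Number of whole registers that fit in a register file of `bytes` bytes.
constexpr int registerCount(int bytes) {
  return bytes / kGrfBytes;
}

// The original value and the reduced value suggested for testing.
constexpr int kFullFile = 4 * KB;
constexpr int kShrunkFile = 4 * KB - 80 * 32;

// Registers taken away from the allocator by the reduced setting.
inline int registersRemoved() {
  assert(kShrunkFile > 0 && kShrunkFile < kFullFile);
  return registerCount(kFullFile) - registerCount(kShrunkFile);
}
```

Under those assumptions the change takes 80 registers away from the allocator, which should make spills much more likely even for the small kernels in utests.<br>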
<br>
Thanks!<br>
<span class="HOEnZb"><font color="#888888">Ruiling<br>
</font></span><div class="HOEnZb"><div class="h5">-----Original Message-----<br>
From: beignet-bounces+ruiling.song=intel.com@lists.freedesktop.org [mailto:beignet-bounces+ruiling.song=intel.com@lists.freedesktop.org] On Behalf Of Zhigang Gong<br>
Sent: Wednesday, August 07, 2013 3:32 PM<br>
To: Song, Ruiling<br>
Cc: <a href="mailto:beignet@lists.freedesktop.org">beignet@lists.freedesktop.org</a><br>
Subject: Re: [Beignet] [PATCH V2 2/2] Implement spill/unspill<br>
<br>
This version LGTM, will push it later. Really nice work bringing spill/unspill to Beignet, thanks.<br>
<br>
Could you add some unit tests here to test both the spill/unspill and the scratch read/write?<br>
<br>
You know we do need solid test cases for spill/unspill. If there are bugs hiding here, they will be extremely hard to find when we hit them in a real, complicated kernel.<br>
<br>
On Wed, Aug 07, 2013 at 03:15:50PM +0800, Ruiling Song wrote:<br>
> The current implementation works as follows:<br>
> I reserve a pool of registers for spill/reload. Currently 6 registers<br>
> are reserved: 5 to handle a SelectionVector with at most 5 elements,<br>
> and one used as the scratch message header register. The register<br>
> after the header register is used as the payload for scratch writes.<br>
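The pool layout described above can be sketched roughly as below. This is only an illustration based on how the patch computes registerPool + 1 + srcID (or dstID); SpillPool and its member names are hypothetical and do not exist in the patch.<br>

```cpp
#include <cassert>
#include <cstdint>

// Matches RESERVED_REG_NUM_FOR_SPILL in the patch.
constexpr uint32_t kReservedRegs = 6;

// Hypothetical helper describing the reserved pool of physical registers.
struct SpillPool {
  uint32_t base;  // number of the first reserved register

  // First register of the pool: the scratch message header.
  uint32_t header() const { return base; }

  // Register right after the header: the payload for a scratch write.
  uint32_t writePayload() const { return base + 1; }

  // Temporary register for element `elem` of a SelectionVector (0..4).
  uint32_t slot(uint32_t elem) const {
    assert(elem < kReservedRegs - 1);
    return base + 1 + elem;
  }
};
```

Note that slot(0) and writePayload() are the same register in this sketch, which is why the order of the scratch writes matters when a whole SelectionVector is spilled.<br>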
><br>
> To spill, just iterate over the instructions. If the virtual register<br>
> is used as a src, insert a reload instruction before it. If the virtual<br>
> register is used as a dst, insert a spill instruction after it to write the<br>
> register content to scratch memory.<br>
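A minimal sketch of that rewrite on a toy instruction list follows. It is not the Beignet code: Insn and spillRegister are made up for illustration, and a single temporary register stands in for the per-operand reserved registers that the patch uses.<br>

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Toy instruction: opcode name, destination registers, source registers.
struct Insn {
  std::string op;
  std::vector<uint32_t> dst;
  std::vector<uint32_t> src;
};

// Returns a copy of `insns` in which virtual register `spilled` lives in
// scratch memory. `tmp` is a hypothetical reserved physical register.
std::vector<Insn> spillRegister(const std::vector<Insn> &insns,
                                uint32_t spilled, uint32_t tmp) {
  std::vector<Insn> out;
  for (Insn insn : insns) {  // work on a copy so operands can be rewritten
    bool reads = false, writes = false;
    for (uint32_t &r : insn.src)
      if (r == spilled) { r = tmp; reads = true; }
    for (uint32_t &r : insn.dst)
      if (r == spilled) { r = tmp; writes = true; }
    // Reload from scratch before an instruction that reads the register.
    if (reads) out.push_back({"UNSPILL", {tmp}, {}});
    out.push_back(insn);
    // Write back to scratch after an instruction that defines the register.
    if (writes) out.push_back({"SPILL", {}, {tmp}});
  }
  return out;
}
```

For a two-instruction program where the first instruction defines the spilled register and the second reads it, this yields the definition, a SPILL, an UNSPILL, and then the use.<br>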
><br>
> Current limitations:<br>
> 64-bit types are not supported.<br>
> SelectionVector with more than 5 elements is not handled.<br>
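To illustrate the 64-bit limitation, here is a sketch of the register-count expression used by the emit functions in this patch. It assumes a 32-byte GRF register; scratchRegCount and kGrfBytes are hypothetical names, and only the expression and the size assertion mirror the patch.<br>

```cpp
#include <cassert>
#include <cstdint>

constexpr uint32_t kGrfBytes = 32;  // assumed bytes per GRF register

// Registers needed for one spilled value, following the expression
// (stride * size * simdWidth) > 32 ? 2 : 1 from the patch.
inline uint32_t scratchRegCount(uint32_t hstride, uint32_t typeBytes,
                                uint32_t simdWidth) {
  // The patch asserts the same bound, so 8-byte types are rejected.
  assert(typeBytes <= 4);
  return (hstride * typeBytes * simdWidth) > kGrfBytes ? 2 : 1;
}
```

With SIMD8, a 4-byte type at stride 1 fits in one register, while a stride of 2 needs two. An 8-byte type would trip the assertion before any count is computed.<br>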
><br>
> Signed-off-by: Ruiling Song <<a href="mailto:ruiling.song@intel.com">ruiling.song@intel.com</a>><br>
> ---<br>
> backend/src/backend/gen_context.cpp | 34 ++++++++++++<br>
> backend/src/backend/gen_context.hpp | 2 +<br>
> .../src/backend/gen_insn_gen7_schedule_info.hxx | 2 +<br>
> backend/src/backend/gen_insn_scheduling.cpp | 39 +++++++++-----<br>
> backend/src/backend/gen_insn_selection.cpp | 57 +++++++++++++++++++-<br>
> backend/src/backend/gen_insn_selection.hpp | 6 +++<br>
> backend/src/backend/gen_insn_selection.hxx | 2 +<br>
> backend/src/backend/gen_reg_allocation.cpp | 56 +++++++++++++++++--<br>
> 8 files changed, 180 insertions(+), 18 deletions(-)<br>
><br>
> diff --git a/backend/src/backend/gen_context.cpp<br>
> b/backend/src/backend/gen_context.cpp<br>
> index 29fa1c3..ce83923 100644<br>
> --- a/backend/src/backend/gen_context.cpp<br>
> +++ b/backend/src/backend/gen_context.cpp<br>
> @@ -542,6 +542,40 @@ namespace gbe<br>
> p->pop();<br>
> }<br>
><br>
> + void GenContext::emitSpillRegInstruction(const SelectionInstruction &insn) {<br>
> + uint32_t simdWidth = p->curr.execWidth;<br>
> + uint32_t scratchOffset = insn.extra.scratchOffset;<br>
> + const uint32_t header = insn.extra.scratchMsgHeader;<br>
> + p->push();<br>
> +<br>
> + const GenRegister msg = GenRegister::ud8grf(header, 0);<br>
> + const GenRegister src = ra->genReg(insn.src(0));<br>
> + GenRegister payload = src;<br>
> + payload.nr = header + 1;<br>
> + payload.subnr = 0;<br>
> +<br>
> + p->MOV(payload, src);<br>
> + uint32_t regType = insn.src(0).type;<br>
> + uint32_t size = typeSize(regType);<br>
> + assert(size <= 4);<br>
> + uint32_t regNum = (stride(src.hstride)*size*simdWidth) > 32 ? 2 : 1;<br>
> + this->scratchWrite(msg, scratchOffset, regNum, regType, GEN_SCRATCH_CHANNEL_MODE_DWORD);<br>
> + p->pop();<br>
> + }<br>
> +<br>
> + void GenContext::emitUnSpillRegInstruction(const SelectionInstruction &insn) {<br>
> + uint32_t scratchOffset = insn.extra.scratchOffset;<br>
> + const GenRegister dst = insn.dst(0);<br>
> + uint32_t regType = dst.type;<br>
> + uint32_t simdWidth = p->curr.execWidth;<br>
> + const uint32_t header = insn.extra.scratchMsgHeader;<br>
> + uint32_t size = typeSize(regType);<br>
> + assert(size <= 4);<br>
> + uint32_t regNum = (stride(dst.hstride)*size*simdWidth) > 32 ? 2 : 1;<br>
> + const GenRegister msg = GenRegister::ud8grf(header, 0);<br>
> + this->scratchRead(GenRegister::retype(dst, GEN_TYPE_UD), msg, scratchOffset, regNum, regType, GEN_SCRATCH_CHANNEL_MODE_DWORD);<br>
> + }<br>
> +<br>
</div></div><div class="HOEnZb"><div class="h5">> // For SIMD8, we allocate 2*elemNum temporary registers from dst(0), and<br>
> // then follow the real destination registers.<br>
> // For SIMD16, we allocate elemNum temporary registers from dst(0).<br>
> diff --git a/backend/src/backend/gen_context.hpp<br>
> b/backend/src/backend/gen_context.hpp<br>
> index bcf0dc4..694ae98 100644<br>
> --- a/backend/src/backend/gen_context.hpp<br>
> +++ b/backend/src/backend/gen_context.hpp<br>
> @@ -108,6 +108,8 @@ namespace gbe<br>
> void emitByteScatterInstruction(const SelectionInstruction &insn);<br>
> void emitSampleInstruction(const SelectionInstruction &insn);<br>
> void emitTypedWriteInstruction(const SelectionInstruction &insn);<br>
> + void emitSpillRegInstruction(const SelectionInstruction &insn);<br>
> + void emitUnSpillRegInstruction(const SelectionInstruction &insn);<br>
> void emitGetImageInfoInstruction(const SelectionInstruction &insn);<br>
> void scratchWrite(const GenRegister header, uint32_t offset, uint32_t reg_num, uint32_t reg_type, uint32_t channel_mode);<br>
> void scratchRead(const GenRegister dst, const GenRegister header, uint32_t offset, uint32_t reg_num, uint32_t reg_type, uint32_t channel_mode);<br>
> diff --git a/backend/src/backend/gen_insn_gen7_schedule_info.hxx<br>
> b/backend/src/backend/gen_insn_gen7_schedule_info.hxx<br>
> index 6f37c3d..da8f2a2 100644<br>
> --- a/backend/src/backend/gen_insn_gen7_schedule_info.hxx<br>
> +++ b/backend/src/backend/gen_insn_gen7_schedule_info.hxx<br>
> @@ -20,5 +20,7 @@ DECL_GEN7_SCHEDULE(ByteGather, 80, 1, 1)<br>
> DECL_GEN7_SCHEDULE(ByteScatter, 80, 1, 1)<br>
> DECL_GEN7_SCHEDULE(Sample, 80, 1, 1)<br>
> DECL_GEN7_SCHEDULE(TypedWrite, 80, 1, 1)<br>
> +DECL_GEN7_SCHEDULE(SpillReg, 80, 1, 1)<br>
> +DECL_GEN7_SCHEDULE(UnSpillReg, 80, 1, 1)<br>
> DECL_GEN7_SCHEDULE(GetImageInfo, 20, 4, 2)<br>
> DECL_GEN7_SCHEDULE(Atomic, 80, 1, 1)<br>
> diff --git a/backend/src/backend/gen_insn_scheduling.cpp<br>
> b/backend/src/backend/gen_insn_scheduling.cpp<br>
> index cb990be..0b720b7 100644<br>
> --- a/backend/src/backend/gen_insn_scheduling.cpp<br>
> +++ b/backend/src/backend/gen_insn_scheduling.cpp<br>
> @@ -283,19 +283,24 @@ namespace gbe<br>
> uint32_t DependencyTracker::getIndex(GenRegister reg) const {<br>
> // Non GRF physical register<br>
> if (reg.physical) {<br>
> - GBE_ASSERT (reg.file == GEN_ARCHITECTURE_REGISTER_FILE);<br>
> - const uint32_t file = reg.nr & 0xf0;<br>
> - const uint32_t nr = reg.nr & 0x0f;<br>
> - if (file == GEN_ARF_FLAG) {<br>
> - const uint32_t subnr = reg.subnr / sizeof(uint16_t);<br>
> - GBE_ASSERT(nr < MAX_FLAG_REGISTER && (subnr == 0 || subnr == 1));<br>
> - return grfNum + 2*nr + subnr;<br>
> - } else if (file == GEN_ARF_ACCUMULATOR) {<br>
> - GBE_ASSERT(nr < MAX_ACC_REGISTER);<br>
> - return grfNum + MAX_FLAG_REGISTER + nr;<br>
> + //GBE_ASSERT (reg.file == GEN_ARCHITECTURE_REGISTER_FILE);<br>
> + if(reg.file == GEN_ARCHITECTURE_REGISTER_FILE) {<br>
> + const uint32_t file = reg.nr & 0xf0;<br>
> + const uint32_t nr = reg.nr & 0x0f;<br>
> + if (file == GEN_ARF_FLAG) {<br>
> + const uint32_t subnr = reg.subnr / sizeof(uint16_t);<br>
> + GBE_ASSERT(nr < MAX_FLAG_REGISTER && (subnr == 0 || subnr == 1));<br>
> + return grfNum + 2*nr + subnr;<br>
> + } else if (file == GEN_ARF_ACCUMULATOR) {<br>
> + GBE_ASSERT(nr < MAX_ACC_REGISTER);<br>
> + return grfNum + MAX_FLAG_REGISTER + nr;<br>
> + } else {<br>
> + NOT_SUPPORTED;<br>
> + return 0;<br>
> + }<br>
> } else {<br>
> - NOT_SUPPORTED;<br>
> - return 0;<br>
> + const uint32_t simdWidth = scheduler.ctx.getSimdWidth();<br>
> + return simdWidth == 8 ? reg.nr : reg.nr / 2;<br>
> }<br>
> }<br>
> // We directly manipulate physical GRFs here<br>
> @@ -344,6 +349,10 @@ namespace gbe<br>
> this->nodes[index] = node;<br>
> }<br>
><br>
> + if(insn.opcode == SEL_OP_SPILL_REG) {<br>
> + const uint32_t index = this->getIndex(0xff);<br>
> + this->nodes[index] = node;<br>
> + }<br>
> // Consider barriers and wait write to memory<br>
> if (insn.opcode == SEL_OP_BARRIER ||<br>
> insn.opcode == SEL_OP_FENCE ||<br>
> @@ -424,6 +433,11 @@ namespace gbe<br>
> const uint32_t index = tracker.getIndex(insn.extra.function);<br>
> tracker.addDependency(node, index);<br>
> }<br>
> + //read-after-write of scratch memory<br>
> + if (insn.opcode == SEL_OP_UNSPILL_REG) {<br>
> + const uint32_t index = tracker.getIndex(0xff);<br>
> + tracker.addDependency(node, index);<br>
> + }<br>
><br>
> // Consider barriers and wait are reading memory (local and global)<br>
> if (insn.opcode == SEL_OP_BARRIER ||<br>
> @@ -453,6 +467,7 @@ namespace gbe<br>
> tracker.addDependency(node, index);<br>
> }<br>
><br>
> +<br>
> // Consider barriers and wait are writing memory (local and global)<br>
> if (insn.opcode == SEL_OP_BARRIER ||<br>
> insn.opcode == SEL_OP_FENCE ||<br>
> diff --git a/backend/src/backend/gen_insn_selection.cpp<br>
> b/backend/src/backend/gen_insn_selection.cpp<br>
> index 7e9402d..3610051 100644<br>
> --- a/backend/src/backend/gen_insn_selection.cpp<br>
> +++ b/backend/src/backend/gen_insn_selection.cpp<br>
> @@ -315,6 +315,8 @@ namespace gbe<br>
> INLINE ir::Register replaceSrc(SelectionInstruction *insn, uint32_t regID);<br>
> /*! Implement public class */<br>
> INLINE ir::Register replaceDst(SelectionInstruction *insn,<br>
> uint32_t regID);<br>
> + /*! spill a register (insert spill/unspill instructions) */<br>
> + INLINE void spillReg(ir::Register reg, uint32_t registerPool);<br>
> /*! Implement public class */<br>
> INLINE uint32_t getRegNum(void) const { return file.regNum(); }<br>
> /*! Implements public interface */<br>
> @@ -617,6 +619,57 @@ namespace gbe<br>
> return vector;<br>
> }<br>
><br>
> + void Selection::Opaque::spillReg(ir::Register spilledReg, uint32_t registerPool) {<br>
> + assert(registerPool != 0);<br>
> + const uint32_t simdWidth = ctx.getSimdWidth();<br>
> + const uint32_t dstStart = registerPool + 1;<br>
> + const uint32_t srcStart = registerPool + 1;<br>
> + uint32_t ptr =<br>
> + ctx.allocateScratchMem(typeSize(GEN_TYPE_D)*simdWidth);<br>
> +<br>
> + for (auto &block : blockList)<br>
> + for (auto &insn : block.insnList) {<br>
> + const uint32_t srcNum = insn.srcNum, dstNum = insn.dstNum;<br>
> +<br>
> + for (uint32_t srcID = 0; srcID < srcNum; ++srcID) {<br>
> + const GenRegister selReg = insn.src(srcID);<br>
> + const ir::Register reg = selReg.reg();<br>
> + if(selReg.file == GEN_GENERAL_REGISTER_FILE && reg == spilledReg) {<br>
> + GBE_ASSERT(srcID < 5);<br>
> + SelectionInstruction *unspill = this->create(SEL_OP_UNSPILL_REG, 1, 0);<br>
> + unspill->state = GenInstructionState(simdWidth);<br>
> + unspill->dst(0) = GenRegister(GEN_GENERAL_REGISTER_FILE, srcStart+srcID, 0,<br>
> + selReg.type, selReg.vstride, selReg.width, selReg.hstride);<br>
> + GenRegister src = insn.src(srcID);<br>
> + // change nr/subnr, keep other register settings<br>
> + src.nr = srcStart+srcID; src.subnr=0; src.physical=1;<br>
> + insn.src(srcID) = src;<br>
> + unspill->extra.scratchOffset = ptr;<br>
> + unspill->extra.scratchMsgHeader = registerPool;<br>
> + insn.prepend(*unspill);<br>
> + }<br>
> + }<br>
> +<br>
> + for (uint32_t dstID = 0; dstID < dstNum; ++dstID) {<br>
> + const GenRegister selReg = insn.dst(dstID);<br>
> + const ir::Register reg = selReg.reg();<br>
> + if(selReg.file == GEN_GENERAL_REGISTER_FILE && reg == spilledReg) {<br>
> + GBE_ASSERT(dstID < 5);<br>
> + SelectionInstruction *spill = this->create(SEL_OP_SPILL_REG, 0, 1);<br>
> + spill->state = GenInstructionState(simdWidth);<br>
> + spill->src(0) =GenRegister(GEN_GENERAL_REGISTER_FILE, dstStart + dstID, 0,<br>
> + selReg.type, selReg.vstride, selReg.width, selReg.hstride);<br>
> + GenRegister dst = insn.dst(dstID);<br>
> + // change nr/subnr, keep other register settings<br>
> + dst.physical =1; dst.nr = dstStart+dstID; dst.subnr = 0;<br>
> + insn.dst(dstID)= dst;<br>
> + spill->extra.scratchOffset = ptr;<br>
> + spill->extra.scratchMsgHeader = registerPool;<br>
> + insn.append(*spill);<br>
> + }<br>
> + }<br>
> + }<br>
> + }<br>
> +<br>
> ir::Register Selection::Opaque::replaceSrc(SelectionInstruction *insn, uint32_t regID) {<br>
> SelectionBlock *block = insn->parent;<br>
> const uint32_t simdWidth = ctx.getSimdWidth();<br>
> @@ -820,7 +873,6 @@ namespace gbe<br>
> dstVector->regNum = elemNum;<br>
> dstVector->isSrc = 0;<br>
> dstVector->reg = &insn->dst(0);<br>
> -<br>
> // Source cannot be scalar (yet)<br>
> srcVector->regNum = 1;<br>
> srcVector->isSrc = 1;<br>
> @@ -1188,6 +1240,9 @@ namespace gbe<br>
> ir::Register Selection::replaceDst(SelectionInstruction *insn, uint32_t regID) {<br>
> return this->opaque->replaceDst(insn, regID);<br>
> }<br>
> + void Selection::spillReg(ir::Register reg, uint32_t registerPool) {<br>
> + this->opaque->spillReg(reg, registerPool);<br>
> + }<br>
><br>
</div></div><div class="HOEnZb"><div class="h5">> SelectionInstruction *Selection::create(SelectionOpcode opcode, uint32_t dstNum, uint32_t srcNum) {<br>
> return this->opaque->create(opcode, dstNum, srcNum);<br>
> diff --git a/backend/src/backend/gen_insn_selection.hpp<br>
> b/backend/src/backend/gen_insn_selection.hpp<br>
> index 5ae6e42..79b73e2 100644<br>
> --- a/backend/src/backend/gen_insn_selection.hpp<br>
> +++ b/backend/src/backend/gen_insn_selection.hpp<br>
> @@ -107,6 +107,10 @@ namespace gbe<br>
> /*! offset (0 to 7) */<br>
> uint16_t offset:5;<br>
> };<br>
> + struct {<br>
> + uint16_t scratchOffset;<br>
> + uint16_t scratchMsgHeader;<br>
> + };<br>
> } extra;<br>
> /*! Gen opcode */<br>
> uint8_t opcode;<br>
> @@ -197,6 +201,8 @@ namespace gbe<br>
> ir::Register replaceSrc(SelectionInstruction *insn, uint32_t regID);<br>
> /*! Replace a destination to the returned temporary register */<br>
> ir::Register replaceDst(SelectionInstruction *insn, uint32_t<br>
> regID);<br>
> + /*! spill a register (insert spill/unspill instructions) */<br>
> + void spillReg(ir::Register reg, uint32_t registerPool);<br>
> /*! Create a new selection instruction */<br>
> SelectionInstruction *create(SelectionOpcode, uint32_t dstNum, uint32_t srcNum);<br>
> /*! List of emitted blocks */<br>
> diff --git a/backend/src/backend/gen_insn_selection.hxx<br>
> b/backend/src/backend/gen_insn_selection.hxx<br>
> index 7664c8f..495978f 100644<br>
> --- a/backend/src/backend/gen_insn_selection.hxx<br>
> +++ b/backend/src/backend/gen_insn_selection.hxx<br>
> @@ -48,6 +48,8 @@ DECL_SELECTION_IR(BYTE_SCATTER, ByteScatterInstruction)<br>
> DECL_SELECTION_IR(SAMPLE, SampleInstruction)<br>
> DECL_SELECTION_IR(TYPED_WRITE, TypedWriteInstruction)<br>
> DECL_SELECTION_IR(GET_IMAGE_INFO, GetImageInfoInstruction)<br>
> +DECL_SELECTION_IR(SPILL_REG, SpillRegInstruction)<br>
> +DECL_SELECTION_IR(UNSPILL_REG, UnSpillRegInstruction)<br>
> DECL_SELECTION_IR(MUL_HI, TernaryInstruction)<br>
> DECL_SELECTION_IR(FBH, UnaryInstruction)<br>
> DECL_SELECTION_IR(FBL, UnaryInstruction)<br>
> diff --git a/backend/src/backend/gen_reg_allocation.cpp<br>
> b/backend/src/backend/gen_reg_allocation.cpp<br>
> index 4ba03ea..ccbc0da 100644<br>
> --- a/backend/src/backend/gen_reg_allocation.cpp<br>
> +++ b/backend/src/backend/gen_reg_allocation.cpp<br>
> @@ -31,6 +31,8 @@<br>
> #include <algorithm><br>
> #include <climits><br>
><br>
> +#define RESERVED_REG_NUM_FOR_SPILL 6<br>
> +<br>
> namespace gbe<br>
> {<br>
><br>
> //////////////////////////////////////////////////////////////////////<br>
> ///////<br>
> @@ -94,6 +96,10 @@ namespace gbe<br>
> vector<GenRegInterval*> starting;<br>
> /*! Intervals sorting based on ending point positions */<br>
> vector<GenRegInterval*> ending;<br>
> + /*! registers that are spilled */<br>
> + set<ir::Register> spilled;<br>
> + /* reserved registers for register spill/reload */<br>
> + uint32_t reservedReg;<br>
> /*! Current vector to expire */<br>
> uint32_t expiringID;<br>
> /*! Use custom allocator */<br>
> @@ -259,6 +265,11 @@ namespace gbe<br>
> continue;<br>
> }<br>
><br>
> + //ignore register that already spilled<br>
> + if(spilled.contains(reg)) {<br>
> + this->expiringID++;<br>
> + continue;<br>
> + }<br>
> // Ignore booleans that were allocated with flags<br>
> // if (ctx.getRegisterFamily(reg) == ir::FAMILY_BOOL && !grfBooleans.contains(reg)) {<br>
> if (ctx.sel->getRegisterFamily(reg) == ir::FAMILY_BOOL) {<br>
> @@ -473,6 +484,9 @@ namespace gbe<br>
> auto it = vectorMap.find(reg);<br>
> if (it != vectorMap.end()) {<br>
> const SelectionVector *vector = it->second.first;<br>
> + // all the reg in the SelectionVector are spilled<br>
> + if(spilled.contains(vector->reg[0].reg()))<br>
> + continue;<br>
> const uint32_t simdWidth = ctx.getSimdWidth();<br>
><br>
> const ir::RegisterData regData = ctx.sel->getRegisterData(reg);<br>
> @@ -481,10 +495,28 @@ namespace gbe<br>
> const uint32_t alignment = simdWidth*typeSize;<br>
><br>
> const uint32_t size = vector->regNum * alignment;<br>
> +<br>
> uint32_t grfOffset;<br>
> while ((grfOffset = ctx.allocate(size, alignment)) == 0) {<br>
> const bool success = this->expireGRF(interval);<br>
> - if (success == false) return false;<br>
> + if (success == false) {<br>
> + // if no spill support, just return false, else simply spill the register<br>
> + if(reservedReg == 0) return false;<br>
> + break;<br>
> + }<br>
> + }<br>
> + if(grfOffset == 0) {<br>
> + // spill all the registers in the SelectionVector<br>
> + // the tricky here is I need to use reservedReg+1 as scratch write payload.<br>
> + // so, i need to write the first register to scratch memory first.<br>
> + // the spillReg() will just append scratch write insn after the def. To spill<br>
> + // the first register, need to call spillReg() last for the vector->reg[0]<br>
> + GBE_ASSERT(vector->regNum < RESERVED_REG_NUM_FOR_SPILL);<br>
> + for(int i = vector->regNum-1; i >= 0; i--) {<br>
> + spilled.insert(vector->reg[i].reg());<br>
> + selection.spillReg(vector->reg[i].reg(), reservedReg);<br>
> + }<br>
> + continue;<br>
> }<br>
> for (uint32_t regID = 0; regID < vector->regNum; ++regID, grfOffset += alignment) {<br>
> const ir::Register reg = vector->reg[regID].reg();<br>
> @@ -494,18 +526,25 @@ namespace gbe<br>
> }<br>
> }<br>
> // Case 2: This is a regular scalar register, allocate it alone<br>
> - else if (this->createGenReg(interval) == false)<br>
> - return false;<br>
> + else if (this->createGenReg(interval) == false) {<br>
> + if(reservedReg == 0) return false;<br>
> + spilled.insert(reg);<br>
> + selection.spillReg(reg, reservedReg);<br>
> + }<br>
> }<br>
> return true;<br>
> }<br>
> -<br>
> INLINE bool GenRegAllocator::Opaque::allocate(Selection &selection) {<br>
> using namespace ir;<br>
> const Kernel *kernel = ctx.getKernel();<br>
> const Function &fn = ctx.getFunction();<br>
> GBE_ASSERT(fn.getProfile() == PROFILE_OCL);<br>
> -<br>
> + if (ctx.getSimdWidth() == 8) {<br>
> + reservedReg = ctx.allocate(RESERVED_REG_NUM_FOR_SPILL * GEN_REG_SIZE, GEN_REG_SIZE);<br>
> + reservedReg /= GEN_REG_SIZE;<br>
> + } else {<br>
> + reservedReg = 0;<br>
> + }<br>
> // Allocate all the vectors first since they need to be contiguous<br>
> this->allocateVector(selection);<br>
> // schedulePreRegAllocation(ctx, selection);<br>
> @@ -690,6 +729,10 @@ namespace gbe<br>
> int subreg = offst % 8;<br>
> std::cout << "%" << vReg << " g" << reg << "." << subreg << "D" << std::endl;<br>
> }<br>
> + std::set<ir::Register>::iterator is;<br>
> + std::cout << "## spilled registers:" << std::endl;<br>
> + for(is = spilled.begin(); is != spilled.end(); is++)<br>
> + std::cout << (int)*is << std::endl;<br>
> std::cout << std::endl;<br>
> }<br>
><br>
> @@ -704,6 +747,9 @@ namespace gbe<br>
><br>
> INLINE GenRegister GenRegAllocator::Opaque::genReg(const GenRegister &amp;reg) {<br>
> if (reg.file == GEN_GENERAL_REGISTER_FILE) {<br>
> + if(reg.physical == 1) {<br>
> + return reg;<br>
> + }<br>
> GBE_ASSERT(RA.contains(reg.reg()) != false);<br>
> const uint32_t grfOffset = RA.find(reg.reg())->second;<br>
> const GenRegister dst = setGenReg(reg, grfOffset);<br>
> --<br>
> 1.7.9.5<br>
><br>
> _______________________________________________<br>
> Beignet mailing list<br>
> <a href="mailto:Beignet@lists.freedesktop.org">Beignet@lists.freedesktop.org</a><br>
> <a href="http://lists.freedesktop.org/mailman/listinfo/beignet" target="_blank">http://lists.freedesktop.org/mailman/listinfo/beignet</a><br>
</div></div></blockquote></div><br></div>