[Beignet] [PATCH 1/2] GBE: fixed a barrier related bug.
Sun, Yi
yi.sun at intel.com
Tue Jul 2 19:32:41 PDT 2013
The patch works well. The test case compiler_global_memory_barrier passes with it.
Thanks
--Sun, Yi
On Tue, 2013-07-02 at 18:49 +0800, Zhigang Gong wrote:
> Actually, this commit fixes two barrier-related bugs.
> 1. We should set useSLM to true if we use a barrier.
> 2. We need to copy the barrier id from r0.2 into the barrierMsg
> payload, and we don't need to reprogram the barrierCount.
>
> After this fix, we no longer need the workaround for the local
> memory barrier, so the memory fence is no longer needed for the
> local memory barrier.
>
> Signed-off-by: Zhigang Gong <zhigang.gong at linux.intel.com>
> ---
> backend/src/backend/gen_insn_selection.cpp | 14 ++++++--------
> backend/src/llvm/llvm_gen_backend.cpp | 1 +
> 2 files changed, 7 insertions(+), 8 deletions(-)
>
> diff --git a/backend/src/backend/gen_insn_selection.cpp b/backend/src/backend/gen_insn_selection.cpp
> index bbe392d..bfe1e28 100644
> --- a/backend/src/backend/gen_insn_selection.cpp
> +++ b/backend/src/backend/gen_insn_selection.cpp
> @@ -1792,24 +1792,22 @@ namespace gbe
> const ir::Register reg = sel.reg(FAMILY_DWORD);
>
> const uint32_t params = insn.getParameters();
> - //XXX TODO need to double check local barrier whether need fence or not
> - if(params == syncGlobalBarrier || params == syncLocalBarrier) {
> + if(params == syncGlobalBarrier) {
> const ir::Register fenceDst = sel.reg(FAMILY_DWORD);
> sel.FENCE(sel.selReg(fenceDst, ir::TYPE_U32));
> }
>
> sel.push();
> sel.curr.predicate = GEN_PREDICATE_NONE;
> +
> + // As only the payload.2 is used and all the other regions are ignored
> + // SIMD8 mode here is safe.
> sel.curr.execWidth = 8;
> sel.curr.physicalFlag = 0;
> sel.curr.noMask = 1;
> + // Copy barrier id from r0.
> + sel.AND(GenRegister::ud8grf(reg), GenRegister::ud1grf(ir::ocl::barrierid), GenRegister::immud(0x0f000000));
>
> - sel.SHL(GenRegister::ud8grf(reg),
> - GenRegister::ud1grf(ocl::threadn),
> - GenRegister::immud(0x9));
> - sel.OR(GenRegister::ud8grf(reg),
> - GenRegister::ud8grf(reg),
> - GenRegister::immud(0x00088000));
> // A barrier is OK to start the thread synchronization *and* SLM fence
> sel.BARRIER(GenRegister::f8grf(reg));
> // Now we wait for the other threads
> diff --git a/backend/src/llvm/llvm_gen_backend.cpp b/backend/src/llvm/llvm_gen_backend.cpp
> index 8385e21..db34296 100644
> --- a/backend/src/llvm/llvm_gen_backend.cpp
> +++ b/backend/src/llvm/llvm_gen_backend.cpp
> @@ -1741,6 +1741,7 @@ namespace gbe
> case GEN_OCL_LBARRIER:
> case GEN_OCL_GBARRIER:
> case GEN_OCL_LGBARRIER:
> + ctx.getFunction().setUseSLM(true);
> break;
> case GEN_OCL_WRITE_IMAGE0:
> case GEN_OCL_WRITE_IMAGE1:
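
For readers following the payload change in gen_insn_selection.cpp, here is a small standalone sketch of the arithmetic only. It is not part of the patch and does not use the GBE API: the function names are hypothetical, and only the constants (the shift by 9, 0x00088000, and the mask 0x0f000000) are taken from the diff above. It assumes the old code's input is the value held in ocl::threadn and the new code's input is the value of r0.2.

```cpp
#include <cstdint>

// Mask taken from the patch; it keeps bits 27:24 of the source dword.
constexpr uint32_t kBarrierIdMask = 0x0f000000u;

// Hypothetical helper: the payload dword as the removed SHL/OR pair built it,
// i.e. (threadn << 9) | 0x00088000.
constexpr uint32_t oldBarrierPayload(uint32_t threadn) {
  return (threadn << 9) | 0x00088000u;
}

// Hypothetical helper: the payload dword as the added AND builds it,
// i.e. r0.2 with everything except the masked bits cleared.
constexpr uint32_t newBarrierPayload(uint32_t r0_2) {
  return r0_2 & kBarrierIdMask;
}
```

The new form no longer depends on the thread count at all, which matches the commit message's note that barrierCount does not need to be reprogrammed.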