[Beignet] [PATCH] Remove global barrier assert.

Zou, Nanhai nanhai.zou at intel.com
Thu Jun 13 22:48:12 PDT 2013


Yes, I have recheck the OpenCL Spec,
The Mem Fence (mem fence built-in and mem fence in barrier)
in OCL spec only reference to r/w order in work-item, that is guaranteed by our IVB+ hardware.



-----Original Message-----
From: Yang, Rong R 
Sent: Friday, June 14, 2013 1:38 PM
To: Zou, Nanhai; Zhigang Gong
Cc: beignet at lists.freedesktop.org
Subject: RE: [Beignet] [PATCH] Remove global barrier assert.

Be sure that mem fence mentioned in bspec is not the same mem fence mentioned in openCL spec.

Mem fence in bspec:
A memory fence message issued by a thread causes further messages issued by the thread to be blocked until all previous messages issued by the thread to that data port (data cache or render cache) have been globally observed from the point of view of other threads in the system.

Mem fence in openCL spec:
Orders loads and stores of a work-item executing a kernel.

Mem fence in openCL is only for a work-item semantic, and is guaranteed by hardware.

-----Original Message-----
From: Zou, Nanhai
Sent: Friday, June 14, 2013 1:20 PM
To: Zhigang Gong; Yang, Rong R
Cc: beignet at lists.freedesktop.org
Subject: RE: [Beignet] [PATCH] Remove global barrier assert.

Rong and I have discuss the mem fence topic result the mem fence and global barrier patch.

R/W order is ensured in one thread.
So no need to put this in mem fence built-in funtion, 

Mem Fence semantic with in a work group is included in barrier according to our test, so no need send an additional message to slow the performance.


Thanks
Zou Nanhai


-----Original Message-----
From: Zhigang Gong [mailto:zhigang.gong at linux.intel.com]
Sent: Friday, June 14, 2013 1:13 PM
To: Zou, Nanhai; Yang, Rong R
Cc: beignet at lists.freedesktop.org
Subject: RE: [Beignet] [PATCH] Remove global barrier assert.

I found something in bspec as below. You can check the last sentence. As Rong is doing the related work items, I would ask rong to follow the memory fence for global memory. Rong, is it ok for you?

Programming Note:
[DevIVB, DevVLV, DevVLV-T, DevHSW] The memory fence operation is not required to guarantee SLM memory access ordering between multiple threads in a thread group for the sequence of a write message, a barrier message, and then a read message. (This optimization is due to implementation details of the organization of threads in a thread group, SLM memory, data port messages and gateway barrier messages.) Beware that the memory fence is still required for non-SLM memory ordering and observability.

> -----Original Message-----
> From: Zou, Nanhai [mailto:nanhai.zou at intel.com]
> Sent: Friday, June 14, 2013 1:04 PM
> To: Zhigang Gong; Yang, Rong R
> Cc: beignet at lists.freedesktop.org
> Subject: RE: [Beignet] [PATCH] Remove global barrier assert.
> 
> Per my understanding,
> barrier has included memory fence semantic, no need to insert 
> additional
mem
> fence message.
> I think Rong has done some experiment on that.
> 
> 
> -----Original Message-----
> From: Zhigang Gong [mailto:zhigang.gong at linux.intel.com]
> Sent: Friday, June 14, 2013 1:00 PM
> To: Zou, Nanhai; Yang, Rong R
> Cc: beignet at lists.freedesktop.org
> Subject: RE: [Beignet] [PATCH] Remove global barrier assert.
> 
> Nanhai,
> 
> I'm not worry about the barrier itself. I'm worry about the memory 
> fence
part.
> Please refer the following statement from Open CL spec.
> 
> The barrier CLK_LOCAL_MEM_FENCE -
> function will either flush any variables stored in local memory or 
> queue a memory fence to ensure correct ordering of memory operations 
> to local memory.
> 
> CLK_GLOBAL_MEM_FENCE - The barrier
> function will queue a memory fence to ensure correct ordering of 
> memory operations to global memory. This can be useful when 
> work-items, for
example,
> write to buffer or image objects and then want to read the updated data.
> 
> And current implementation does nothing for the global memory operations.
> 
> And IMHO, if we can't make sure the implementation comply with OpenCL 
> spec for the specified memory fence function, we can't just push the code now.
> It may cause weird problem in the future which is extremely hard to debug.
> Any comments?
> 
> > -----Original Message-----
> > From: Zou, Nanhai [mailto:nanhai.zou at intel.com]
> > Sent: Friday, June 14, 2013 12:13 PM
> > To: Zhigang Gong; Yang, Rong R
> > Cc: beignet at lists.freedesktop.org
> > Subject: RE: [Beignet] [PATCH] Remove global barrier assert.
> >
> > From the spec, it seems that the logic is correct, the barrier is 
> > not
> limited to
> > SHM.
> > Let's push the code till we hit bug.
> >
> > Thanks
> > Zou Nanhai
> >
> > -----Original Message-----
> > From: beignet-bounces+nanhai.zou=intel.com at lists.freedesktop.org
> > [mailto:beignet-bounces+nanhai.zou=intel.com at lists.freedesktop.org]
> > On Behalf Of Zhigang Gong
> > Sent: Friday, June 14, 2013 11:50 AM
> > To: Yang, Rong R
> > Cc: beignet at lists.freedesktop.org
> > Subject: Re: [Beignet] [PATCH] Remove global barrier assert.
> >
> > With this patch, you treat a global memory fence the same as a local
> memory
> > fence.
> > I haven't check the bspec details, but I am doubt it. Need some time 
> > to investigate the spec.
> >
> > On Fri, Jun 14, 2013 at 11:01:09AM +0800, Yang Rong wrote:
> > > Per openCL spec, Global memory barrier is consistent across 
> > > work-items in a single work group, which is match the bspec's 
> > > barrier. So remove global barrier assert.
> > >
> > > Signed-off-by: Yang Rong <rong.r.yang at intel.com>
> > > ---
> > >  backend/src/backend/gen_insn_selection.cpp |    4 ----
> > >  1 file changed, 4 deletions(-)
> > >
> > > diff --git a/backend/src/backend/gen_insn_selection.cpp
> > > b/backend/src/backend/gen_insn_selection.cpp
> > > index 88f9e94..3a139ea 100644
> > > --- a/backend/src/backend/gen_insn_selection.cpp
> > > +++ b/backend/src/backend/gen_insn_selection.cpp
> > > @@ -1607,10 +1607,6 @@ namespace gbe
> > >      INLINE bool emitOne(Selection::Opaque &sel, const
> ir::SyncInstruction
> > &insn) const
> > >      {
> > >        using namespace ir;
> > > -      const uint32_t params = insn.getParameters();
> > > -      GBE_ASSERTM(params == syncLocalBarrier,
> > > -                  "Only barrier(CLK_LOCAL_MEM_FENCE) is
> supported
> > right now "
> > > -                  "for the synchronization primitives");
> > >        const ir::Register reg = sel.reg(FAMILY_DWORD);
> > >
> > >        sel.push();
> > > --
> > > 1.7.10.4
> > >
> > > _______________________________________________
> > > Beignet mailing list
> > > Beignet at lists.freedesktop.org
> > > http://lists.freedesktop.org/mailman/listinfo/beignet
> > _______________________________________________
> > Beignet mailing list
> > Beignet at lists.freedesktop.org
> > http://lists.freedesktop.org/mailman/listinfo/beignet



More information about the Beignet mailing list