[Beignet] [PATCH] Remove global barrier assert.

Zou, Nanhai nanhai.zou at intel.com
Thu Jun 13 23:44:22 PDT 2013


Please check  
Shared Functions > Data Port [Pre-DevSKL] > Read/Write Ordering

Reads and writes issued from the same thread are guaranteed to be processed in the same order as issued. Software mechanisms must still ensure any needed ordering of accesses issued from different threads.


-----Original Message-----
From: Zhigang Gong [mailto:zhigang.gong at linux.intel.com] 
Sent: Friday, June 14, 2013 2:13 PM
To: Yang, Rong R; Zou, Nanhai
Cc: beignet at lists.freedesktop.org
Subject: RE: [Beignet] [PATCH] Remove global barrier assert.

Rong,

Please see my comments:

> -----Original Message-----
> From: Yang, Rong R [mailto:rong.r.yang at intel.com]
> Sent: Friday, June 14, 2013 1:38 PM
> To: Zou, Nanhai; Zhigang Gong
> Cc: beignet at lists.freedesktop.org
> Subject: RE: [Beignet] [PATCH] Remove global barrier assert.
> 
> Be sure that mem fence mentioned in bspec is not the same mem fence 
> mentioned in openCL spec.
> 
> Mem fence in bspec:
> A memory fence message issued by a thread causes further messages 
> issued by the thread to be blocked until all previous messages issued 
> by the
thread to
> that data port (data cache or render cache) have been globally 
> observed
from
> the point of view of other threads in the system.
[Gong, Zhigang] As on IVB, each memory access is done by sending message to data port.
So the above statement is exactly the low level mechanism to support the memory fence defined in the OpenCL spec. 
> 
> Mem fence in openCL spec:
> Orders loads and stores of a work-item executing a kernel.
[Gong, Zhigang] Right, the load and store on the same kernel are two send message instruction.
Let's assume the first send message is to write to src, and the second message is to read from.
With barrier, we can only make sure the second send instruction will be executed after the first one, But by default, the send message is executed asynchronous, and the barrier can't guarantee the second first send message already complete before the second send message start to execute. That's the reason why GEN provide the memory fence.

Nanhai and Rong, I really can't find the statement in BPSEC to support your point which means the IVB hardware will guarantee the in order read/write (send message). Could you give me a pointer to It?

Thanks.

> 
> Mem fence in openCL is only for a work-item semantic, and is 
> guaranteed by hardware.
> 
> -----Original Message-----
> From: Zou, Nanhai
> Sent: Friday, June 14, 2013 1:20 PM
> To: Zhigang Gong; Yang, Rong R
> Cc: beignet at lists.freedesktop.org
> Subject: RE: [Beignet] [PATCH] Remove global barrier assert.
> 
> Rong and I have discuss the mem fence topic result the mem fence and
global
> barrier patch.
> 
> R/W order is ensured in one thread.
> So no need to put this in mem fence built-in funtion,
> 
> Mem Fence semantic with in a work group is included in barrier 
> according
to
> our test, so no need send an additional message to slow the performance.
> 
> 
> Thanks
> Zou Nanhai
> 
> 
> -----Original Message-----
> From: Zhigang Gong [mailto:zhigang.gong at linux.intel.com]
> Sent: Friday, June 14, 2013 1:13 PM
> To: Zou, Nanhai; Yang, Rong R
> Cc: beignet at lists.freedesktop.org
> Subject: RE: [Beignet] [PATCH] Remove global barrier assert.
> 
> I found something in bspec as below. You can check the last sentence. 
> As
Rong
> is doing the related work items, I would ask rong to follow the memory
fence
> for global memory. Rong, is it ok for you?
> 
> Programming Note:
> [DevIVB, DevVLV, DevVLV-T, DevHSW] The memory fence operation is not 
> required to guarantee SLM memory access ordering between multiple 
> threads in a thread group for the sequence of a write message, a 
> barrier message,
and
> then a read message. (This optimization is due to implementation 
> details
of the
> organization of threads in a thread group, SLM memory, data port 
> messages and gateway barrier messages.) Beware that the memory fence 
> is still
required
> for non-SLM memory ordering and observability.
> 
> > -----Original Message-----
> > From: Zou, Nanhai [mailto:nanhai.zou at intel.com]
> > Sent: Friday, June 14, 2013 1:04 PM
> > To: Zhigang Gong; Yang, Rong R
> > Cc: beignet at lists.freedesktop.org
> > Subject: RE: [Beignet] [PATCH] Remove global barrier assert.
> >
> > Per my understanding,
> > barrier has included memory fence semantic, no need to insert 
> > additional
> mem
> > fence message.
> > I think Rong has done some experiment on that.
> >
> >
> > -----Original Message-----
> > From: Zhigang Gong [mailto:zhigang.gong at linux.intel.com]
> > Sent: Friday, June 14, 2013 1:00 PM
> > To: Zou, Nanhai; Yang, Rong R
> > Cc: beignet at lists.freedesktop.org
> > Subject: RE: [Beignet] [PATCH] Remove global barrier assert.
> >
> > Nanhai,
> >
> > I'm not worry about the barrier itself. I'm worry about the memory 
> > fence
> part.
> > Please refer the following statement from Open CL spec.
> >
> > The barrier CLK_LOCAL_MEM_FENCE -
> > function will either flush any variables stored in local memory or 
> > queue a memory fence to ensure correct ordering of memory operations 
> > to local memory.
> >
> > CLK_GLOBAL_MEM_FENCE - The barrier
> > function will queue a memory fence to ensure correct ordering of 
> > memory operations to global memory. This can be useful when 
> > work-items, for
> example,
> > write to buffer or image objects and then want to read the updated data.
> >
> > And current implementation does nothing for the global memory
operations.
> >
> > And IMHO, if we can't make sure the implementation comply with 
> > OpenCL spec for the specified memory fence function, we can't just 
> > push the
code
> now.
> > It may cause weird problem in the future which is extremely hard to
debug.
> > Any comments?
> >
> > > -----Original Message-----
> > > From: Zou, Nanhai [mailto:nanhai.zou at intel.com]
> > > Sent: Friday, June 14, 2013 12:13 PM
> > > To: Zhigang Gong; Yang, Rong R
> > > Cc: beignet at lists.freedesktop.org
> > > Subject: RE: [Beignet] [PATCH] Remove global barrier assert.
> > >
> > > From the spec, it seems that the logic is correct, the barrier is 
> > > not
> > limited to
> > > SHM.
> > > Let's push the code till we hit bug.
> > >
> > > Thanks
> > > Zou Nanhai
> > >
> > > -----Original Message-----
> > > From: beignet-bounces+nanhai.zou=intel.com at lists.freedesktop.org
> > > [mailto:beignet-bounces+nanhai.zou=intel.com at lists.freedesktop.org
> > > ]
> > > On Behalf Of Zhigang Gong
> > > Sent: Friday, June 14, 2013 11:50 AM
> > > To: Yang, Rong R
> > > Cc: beignet at lists.freedesktop.org
> > > Subject: Re: [Beignet] [PATCH] Remove global barrier assert.
> > >
> > > With this patch, you treat a global memory fence the same as a 
> > > local
> > memory
> > > fence.
> > > I haven't check the bspec details, but I am doubt it. Need some 
> > > time to investigate the spec.
> > >
> > > On Fri, Jun 14, 2013 at 11:01:09AM +0800, Yang Rong wrote:
> > > > Per openCL spec, Global memory barrier is consistent across 
> > > > work-items in a single work group, which is match the bspec's 
> > > > barrier. So remove global barrier assert.
> > > >
> > > > Signed-off-by: Yang Rong <rong.r.yang at intel.com>
> > > > ---
> > > >  backend/src/backend/gen_insn_selection.cpp |    4 ----
> > > >  1 file changed, 4 deletions(-)
> > > >
> > > > diff --git a/backend/src/backend/gen_insn_selection.cpp
> > > > b/backend/src/backend/gen_insn_selection.cpp
> > > > index 88f9e94..3a139ea 100644
> > > > --- a/backend/src/backend/gen_insn_selection.cpp
> > > > +++ b/backend/src/backend/gen_insn_selection.cpp
> > > > @@ -1607,10 +1607,6 @@ namespace gbe
> > > >      INLINE bool emitOne(Selection::Opaque &sel, const
> > ir::SyncInstruction
> > > &insn) const
> > > >      {
> > > >        using namespace ir;
> > > > -      const uint32_t params = insn.getParameters();
> > > > -      GBE_ASSERTM(params == syncLocalBarrier,
> > > > -                  "Only barrier(CLK_LOCAL_MEM_FENCE) is
> > supported
> > > right now "
> > > > -                  "for the synchronization primitives");
> > > >        const ir::Register reg = sel.reg(FAMILY_DWORD);
> > > >
> > > >        sel.push();
> > > > --
> > > > 1.7.10.4
> > > >
> > > > _______________________________________________
> > > > Beignet mailing list
> > > > Beignet at lists.freedesktop.org
> > > > http://lists.freedesktop.org/mailman/listinfo/beignet
> > > _______________________________________________
> > > Beignet mailing list
> > > Beignet at lists.freedesktop.org
> > > http://lists.freedesktop.org/mailman/listinfo/beignet



More information about the Beignet mailing list