[Beignet] [PATCH] Remove global barrier assert.

Zhigang Gong zhigang.gong at linux.intel.com
Thu Jun 13 23:13:23 PDT 2013


Rong,

Please see my comments:

> -----Original Message-----
> From: Yang, Rong R [mailto:rong.r.yang at intel.com]
> Sent: Friday, June 14, 2013 1:38 PM
> To: Zou, Nanhai; Zhigang Gong
> Cc: beignet at lists.freedesktop.org
> Subject: RE: [Beignet] [PATCH] Remove global barrier assert.
> 
> Be sure that mem fence mentioned in bspec is not the same mem fence
> mentioned in openCL spec.
> 
> Mem fence in bspec:
> A memory fence message issued by a thread causes further messages issued
> by the thread to be blocked until all previous messages issued by the
thread to
> that data port (data cache or render cache) have been globally observed
from
> the point of view of other threads in the system.
[Gong, Zhigang] As on IVB, each memory access is done by sending message to
data port.
So the above statement is exactly the low level mechanism to support the
memory fence
defined in the OpenCL spec. 
> 
> Mem fence in openCL spec:
> Orders loads and stores of a work-item executing a kernel.
[Gong, Zhigang] Right, the load and store on the same kernel are two send
message instruction.
Let's assume the first send message is to write to src, and the second
message is to read from.
With barrier, we can only make sure the second send instruction will be
executed after the first one,
But by default, the send message is executed asynchronous, and the barrier
can't guarantee the second
first send message already complete before the second send message start to
execute. That's the
reason why GEN provide the memory fence.

Nanhai and Rong, I really can't find the statement in BPSEC to support your
point which means the
IVB hardware will guarantee the in order read/write (send message). Could
you give me a pointer to
It?

Thanks.

> 
> Mem fence in openCL is only for a work-item semantic, and is guaranteed by
> hardware.
> 
> -----Original Message-----
> From: Zou, Nanhai
> Sent: Friday, June 14, 2013 1:20 PM
> To: Zhigang Gong; Yang, Rong R
> Cc: beignet at lists.freedesktop.org
> Subject: RE: [Beignet] [PATCH] Remove global barrier assert.
> 
> Rong and I have discuss the mem fence topic result the mem fence and
global
> barrier patch.
> 
> R/W order is ensured in one thread.
> So no need to put this in mem fence built-in funtion,
> 
> Mem Fence semantic with in a work group is included in barrier according
to
> our test, so no need send an additional message to slow the performance.
> 
> 
> Thanks
> Zou Nanhai
> 
> 
> -----Original Message-----
> From: Zhigang Gong [mailto:zhigang.gong at linux.intel.com]
> Sent: Friday, June 14, 2013 1:13 PM
> To: Zou, Nanhai; Yang, Rong R
> Cc: beignet at lists.freedesktop.org
> Subject: RE: [Beignet] [PATCH] Remove global barrier assert.
> 
> I found something in bspec as below. You can check the last sentence. As
Rong
> is doing the related work items, I would ask rong to follow the memory
fence
> for global memory. Rong, is it ok for you?
> 
> Programming Note:
> [DevIVB, DevVLV, DevVLV-T, DevHSW] The memory fence operation is not
> required to guarantee SLM memory access ordering between multiple threads
> in a thread group for the sequence of a write message, a barrier message,
and
> then a read message. (This optimization is due to implementation details
of the
> organization of threads in a thread group, SLM memory, data port messages
> and gateway barrier messages.) Beware that the memory fence is still
required
> for non-SLM memory ordering and observability.
> 
> > -----Original Message-----
> > From: Zou, Nanhai [mailto:nanhai.zou at intel.com]
> > Sent: Friday, June 14, 2013 1:04 PM
> > To: Zhigang Gong; Yang, Rong R
> > Cc: beignet at lists.freedesktop.org
> > Subject: RE: [Beignet] [PATCH] Remove global barrier assert.
> >
> > Per my understanding,
> > barrier has included memory fence semantic, no need to insert
> > additional
> mem
> > fence message.
> > I think Rong has done some experiment on that.
> >
> >
> > -----Original Message-----
> > From: Zhigang Gong [mailto:zhigang.gong at linux.intel.com]
> > Sent: Friday, June 14, 2013 1:00 PM
> > To: Zou, Nanhai; Yang, Rong R
> > Cc: beignet at lists.freedesktop.org
> > Subject: RE: [Beignet] [PATCH] Remove global barrier assert.
> >
> > Nanhai,
> >
> > I'm not worry about the barrier itself. I'm worry about the memory
> > fence
> part.
> > Please refer the following statement from Open CL spec.
> >
> > The barrier CLK_LOCAL_MEM_FENCE -
> > function will either flush any variables stored in local memory or
> > queue a memory fence to ensure correct ordering of memory operations
> > to local memory.
> >
> > CLK_GLOBAL_MEM_FENCE - The barrier
> > function will queue a memory fence to ensure correct ordering of
> > memory operations to global memory. This can be useful when
> > work-items, for
> example,
> > write to buffer or image objects and then want to read the updated data.
> >
> > And current implementation does nothing for the global memory
operations.
> >
> > And IMHO, if we can't make sure the implementation comply with OpenCL
> > spec for the specified memory fence function, we can't just push the
code
> now.
> > It may cause weird problem in the future which is extremely hard to
debug.
> > Any comments?
> >
> > > -----Original Message-----
> > > From: Zou, Nanhai [mailto:nanhai.zou at intel.com]
> > > Sent: Friday, June 14, 2013 12:13 PM
> > > To: Zhigang Gong; Yang, Rong R
> > > Cc: beignet at lists.freedesktop.org
> > > Subject: RE: [Beignet] [PATCH] Remove global barrier assert.
> > >
> > > From the spec, it seems that the logic is correct, the barrier is
> > > not
> > limited to
> > > SHM.
> > > Let's push the code till we hit bug.
> > >
> > > Thanks
> > > Zou Nanhai
> > >
> > > -----Original Message-----
> > > From: beignet-bounces+nanhai.zou=intel.com at lists.freedesktop.org
> > > [mailto:beignet-bounces+nanhai.zou=intel.com at lists.freedesktop.org]
> > > On Behalf Of Zhigang Gong
> > > Sent: Friday, June 14, 2013 11:50 AM
> > > To: Yang, Rong R
> > > Cc: beignet at lists.freedesktop.org
> > > Subject: Re: [Beignet] [PATCH] Remove global barrier assert.
> > >
> > > With this patch, you treat a global memory fence the same as a local
> > memory
> > > fence.
> > > I haven't check the bspec details, but I am doubt it. Need some time
> > > to investigate the spec.
> > >
> > > On Fri, Jun 14, 2013 at 11:01:09AM +0800, Yang Rong wrote:
> > > > Per openCL spec, Global memory barrier is consistent across
> > > > work-items in a single work group, which is match the bspec's
> > > > barrier. So remove global barrier assert.
> > > >
> > > > Signed-off-by: Yang Rong <rong.r.yang at intel.com>
> > > > ---
> > > >  backend/src/backend/gen_insn_selection.cpp |    4 ----
> > > >  1 file changed, 4 deletions(-)
> > > >
> > > > diff --git a/backend/src/backend/gen_insn_selection.cpp
> > > > b/backend/src/backend/gen_insn_selection.cpp
> > > > index 88f9e94..3a139ea 100644
> > > > --- a/backend/src/backend/gen_insn_selection.cpp
> > > > +++ b/backend/src/backend/gen_insn_selection.cpp
> > > > @@ -1607,10 +1607,6 @@ namespace gbe
> > > >      INLINE bool emitOne(Selection::Opaque &sel, const
> > ir::SyncInstruction
> > > &insn) const
> > > >      {
> > > >        using namespace ir;
> > > > -      const uint32_t params = insn.getParameters();
> > > > -      GBE_ASSERTM(params == syncLocalBarrier,
> > > > -                  "Only barrier(CLK_LOCAL_MEM_FENCE) is
> > supported
> > > right now "
> > > > -                  "for the synchronization primitives");
> > > >        const ir::Register reg = sel.reg(FAMILY_DWORD);
> > > >
> > > >        sel.push();
> > > > --
> > > > 1.7.10.4
> > > >
> > > > _______________________________________________
> > > > Beignet mailing list
> > > > Beignet at lists.freedesktop.org
> > > > http://lists.freedesktop.org/mailman/listinfo/beignet
> > > _______________________________________________
> > > Beignet mailing list
> > > Beignet at lists.freedesktop.org
> > > http://lists.freedesktop.org/mailman/listinfo/beignet



More information about the Beignet mailing list