[Beignet] [PATCH] Add memory fence before barrier to support global memory barrier.
Zhigang Gong
zhigang.gong at linux.intel.com
Tue Jun 18 03:19:06 PDT 2013
On Tue, Jun 18, 2013 at 11:54:15AM +0200, Dag Lem wrote:
> Zhigang Gong <zhigang.gong at linux.intel.com> writes:
>
> > On Tue, Jun 18, 2013 at 10:10:08AM +0200, Dag Lem wrote:
> >> Dag Lem <dag at nimrod.no> writes:
> >>
> >> [...]
> >>
> >> > If I understand you correctly, in order to calculate the maximum
> >> > allocatable local memory size, you should rather calculate the maximum
> >> > number of thread groups running in parallel per half-slice, and divide
> >> > the maximum allocatable half-slice SLM size by this number.
> >>
> >> Hmm, or will the GPGPU walker automatically limit the number of thread
> >> groups running in parallel based on the requested local memory size?
> > Yeah, I think so.
>
> I just read the section you pointed me to in "IVB - Volume 2 Part 2:
> Media and General Purpose Pipeline": "1.5.1.10.2 Shared Local Memory
> Allocation".
>
> This is really nice, provided that one can read out of this that a
> thread group is only started when the requested local memory block size
> (e.g. 4KB) is available for the thread group.
>
> However, isn't Beignet rather using the mechanism described in the
> subsequent section "1.5.1.10.3 Software Managed Shared Local Memory"?
> If so, won't this be a recipe for disaster for OpenCL ;-)
Beignet does use automatic SLM allocation. You can check intel_gpgpu.c:
we use GPGPU_WALKER rather than GPGPU_OBJECT. Only the GPGPU_OBJECT
command supports software-managed shared local memory. With the
GPGPU_WALKER method, software only tells the GEN how much SLM each thread
group requires. Software never tries to set a specific SLM offset for
each thread.
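
To make the arithmetic concrete, here is a rough sketch of how the per-group
SLM request would bound the number of thread groups resident on one
half-slice, if the hardware only starts a group once its block is available.
The 64KB half-slice size, the 4KB allocation granularity, and both function
names are assumptions for illustration. None of this is taken from
intel_gpgpu.c.

```c
#include <assert.h>
#include <limits.h>

/* Assumed values for illustration only; check the PRM for real hardware. */
#define SLM_BYTES_PER_HALF_SLICE (64u * 1024u)
#define SLM_ALLOC_GRANULARITY    (4u * 1024u)

/* Round an SLM request up to the assumed allocation granularity. */
static unsigned slm_round_up(unsigned bytes)
{
    return (bytes + SLM_ALLOC_GRANULARITY - 1u)
           / SLM_ALLOC_GRANULARITY * SLM_ALLOC_GRANULARITY;
}

/*
 * Hypothetical helper: upper bound on thread groups that can hold an SLM
 * block at the same time on one half-slice.
 *   - request of 0 bytes: SLM imposes no limit, returns UINT_MAX
 *   - request larger than the half-slice: cannot run, returns 0
 */
static unsigned max_resident_groups(unsigned slm_bytes_per_group)
{
    assert(SLM_BYTES_PER_HALF_SLICE % SLM_ALLOC_GRANULARITY == 0);

    if (slm_bytes_per_group == 0u)
        return UINT_MAX;
    if (slm_bytes_per_group > SLM_BYTES_PER_HALF_SLICE)
        return 0u;

    return SLM_BYTES_PER_HALF_SLICE / slm_round_up(slm_bytes_per_group);
}
```

Under those assumed numbers, a 4KB request allows at most 16 groups per
half-slice, and a request just over 4KB halves that to 8 because it rounds
up to 8KB.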
>
> --
> Dag
> _______________________________________________
> Beignet mailing list
> Beignet at lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/beignet