[Beignet] [PATCH] Add memory fence before barrier to support global memory barrier.

Dag Lem dag at nimrod.no
Tue Jun 18 02:54:15 PDT 2013


Zhigang Gong <zhigang.gong at linux.intel.com> writes:

> On Tue, Jun 18, 2013 at 10:10:08AM +0200, Dag Lem wrote:
>> Dag Lem <dag at nimrod.no> writes:
>> 
>> [...]
>> 
>> > If I understand you correctly, in order to calculate the maximum
>> > allocatable local memory size, you should rather calculate the maximum
>> > number of thread groups running in parallel per half-slice, and divide
>> > the maximum allocatable half-slice SLM size by this number.
>> 
>> Hmm, or will the GPGPU walker automatically limit the number of thread
>> groups running in parallel based on the requested local memory size?
> Yeah, I think so.

I just read the section you pointed me to in "IVB - Volume 2 Part 2:
Media and General Purpose Pipeline": "1.5.1.10.2 Shared Local Memory
Allocation".

This is really nice, provided one can conclude from it that a
thread group is only started once the requested local memory block
(e.g. 4KB) is available for that thread group.

However, isn't Beignet rather using the mechanism described in the
subsequent section "1.5.1.10.3 Software Managed Shared Local Memory"?
If so, won't this be a recipe for disaster for OpenCL? ;-)
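
To make the difference between the two schemes concrete, here is a rough
sketch in C of the arithmetic as I understand it. The function names and
parameters are hypothetical and are not taken from Beignet or from the
hardware documentation; the pool size and group limit are plain inputs,
not values I claim for any particular GPU.

```c
#include <stddef.h>

/* Hypothetical helper, software-managed case: if thread groups are started
 * regardless of SLM availability, each of the groups that may run in
 * parallel must be given a fixed share of the SLM pool up front. */
size_t slm_per_group_sw_managed(size_t slm_pool_bytes,
                                size_t max_parallel_groups)
{
    if (max_parallel_groups == 0)
        return 0;
    return slm_pool_bytes / max_parallel_groups;
}

/* Hypothetical helper, hardware-allocated case: if a thread group is only
 * started once its requested block is available, the number of groups in
 * flight is limited by how many requested blocks fit into the pool. */
size_t groups_in_flight_hw_allocated(size_t slm_pool_bytes,
                                     size_t requested_bytes,
                                     size_t max_parallel_groups)
{
    size_t fit;

    if (requested_bytes == 0)
        return max_parallel_groups;   /* no SLM needed, no SLM limit */
    fit = slm_pool_bytes / requested_bytes;
    return fit < max_parallel_groups ? fit : max_parallel_groups;
}
```

With an assumed 64KB pool and 16 groups in parallel (illustrative numbers
only), the first function yields 4KB per group, while the second lets a
kernel requesting 16KB run with just 4 groups in flight.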

-- 
Dag

