[Beignet] [PATCH] Add memory fence before barrier to support global memory barrier.

Dag Lem dag at nimrod.no
Tue Jun 18 01:06:29 PDT 2013


Zhigang Gong <zhigang.gong at linux.intel.com> writes:

> On Tue, Jun 18, 2013 at 09:09:51AM +0200, Dag Lem wrote:

[...]

>> If more than one thread group (= work group) is dispatched at one time,
>> and all thread groups see the same "local" memory - boom!
>
> No, please check the IVB's manual IHD_OS_Vol2_Part2:
>
> section: 1.5.1.10.2 Shared Local Memory Allocation:
>
> The first thread of a Thread Group is marked as requiring a new shared local memory – if not the old Shared Local Memory offset is sent with the dispatch.
>
> So when it dispatches each thread group's first thread, it will automatically allocate a new SLM buffer for it.
> Thus when it dispatches more than one work group to the same half-slice, each work group gets a different
> SLM region.

OK, great! I apologize for my lack of knowledge here.

However, in that case you still have a problem with (global) allocation
of SLM in Beignet. Right now, there is just a check (which I added) to
ensure that the total local memory associated with kernel parameters
does not exceed the total half-slice SLM size.

If I understand you correctly, in order to calculate the maximum
allocatable local memory size, you should rather calculate the maximum
number of thread groups running in parallel per half-slice, and divide
the maximum allocatable half-slice SLM size by this number.
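
As a rough sketch of that calculation (the function names are made up
for illustration and are not Beignet code; the SLM size and the group
count are plain parameters, and any allocation granularity the hardware
may impose is ignored):

```c
#include <stddef.h>

/* Hypothetical helper, not part of Beignet: upper bound on the local
 * memory a single work group may use when up to `max_groups` work groups
 * can be resident on one half-slice, each with its own SLM region. */
static size_t max_slm_per_group(size_t half_slice_slm_size, size_t max_groups)
{
    if (max_groups == 0)
        return 0; /* no group can be resident, so nothing is allocatable */
    return half_slice_slm_size / max_groups;
}

/* Hypothetical check: returns 1 if a kernel's requested local memory fits
 * within the per-group limit computed above, 0 otherwise. */
static int slm_request_fits(size_t requested, size_t half_slice_slm_size,
                            size_t max_groups)
{
    return requested <= max_slm_per_group(half_slice_slm_size, max_groups);
}
```

With an assumed 64 KB of SLM per half-slice and four groups resident at
once, each group could then use at most 16 KB, rather than the full
64 KB that the current check allows.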

-- 
Dag

