[Beignet] [PATCH] Add memory fence before barrier to support global memory barrier.

Dag Lem dag at nimrod.no
Mon Jun 17 14:41:17 PDT 2013


Zhigang Gong <zhigang.gong at linux.intel.com> writes:

> This patch looks good to me. And it can pass the global memory barrier
> case. Thanks for the patch, I will push it later.
> As to the local memory fence, according to the bspec, we don't need to
> issue a fence to ensure the memory access ordering. I heard from you
> that you modified the test case to use local memory and a local fence,
> and that it also had a problem. Could you submit your modification as a new test
> case for local memory barrier? Then we can all take a look at that case
> and investigate the underlying problem.
>

I suspect that any problems with local memory may have a different
cause.

As far as I understand, SLM (shared local memory) can only be allocated
per half-slice (i.e. 8 EUs for IVB GT2).

In OpenCL, on the other hand, local memory is allocated per work group.

This implies that Beignet can either

a) Always make a work group correspond to a half-slice (inflexible).
b) Never run more than one work group (<= half-slice) at once (slow).
c) Subdivide local memory per work group (<= half-slice) (good).
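
To make option c) concrete, here is a minimal sketch in C of how SLM could
be carved into per-work-group chunks. This is not Beignet code. The SLM
size, the allocation granularity and all function names are assumptions
made up for illustration, and the arithmetic assumes small, sane request
sizes.

```c
#include <stdint.h>

/* Assumption: bytes of SLM available to one half-slice (illustrative value). */
#define SLM_BYTES_PER_HALF_SLICE (64u * 1024u)
/* Assumption: allocation granularity in bytes (illustrative value). */
#define SLM_ALIGN 1024u

/* Round a work group's local memory request up to the granularity. */
static uint32_t slm_round_up(uint32_t bytes)
{
    return (bytes + SLM_ALIGN - 1u) / SLM_ALIGN * SLM_ALIGN;
}

/* Number of work groups that can be resident on one half-slice at once. */
static uint32_t slm_max_resident_groups(uint32_t bytes_per_group)
{
    uint32_t chunk = slm_round_up(bytes_per_group);
    if (chunk == 0u)
        return UINT32_MAX; /* no local memory requested: SLM is not the limit */
    return SLM_BYTES_PER_HALF_SLICE / chunk;
}

/* Base offset of the chunk for a resident slot, or -1 if the slot does not fit. */
static int64_t slm_base_offset(uint32_t slot, uint32_t bytes_per_group)
{
    if (slot >= slm_max_resident_groups(bytes_per_group))
        return -1;
    return (int64_t)slot * slm_round_up(bytes_per_group);
}
```

With these made-up numbers, a kernel that needs 4096 bytes of local memory
per work group could have up to 16 work groups resident on a half-slice,
each addressing its own chunk through a distinct base offset.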

However, I suspect that Beignet does none of these, but rather lets all
work groups share the same local memory. This will lead to different
work groups stomping on each other's supposedly local memory.
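
The suspected failure can be illustrated with a toy model in C. This is a
sequential simulation on the host, not a statement about how the hardware
or Beignet actually behaves; the array sizes and function names are
hypothetical. Two "work groups" each fill their local buffer with their own
id. With a per-group stride of 0 (everyone shares the same base) the first
group's data is overwritten; with a distinct base per group it survives.

```c
#include <stdint.h>
#include <string.h>

#define SLM_WORDS   64u  /* toy SLM size in 32-bit words (assumption) */
#define GROUP_WORDS 16u  /* toy local buffer size per work group (assumption) */

static uint32_t slm[SLM_WORDS];

/* A "work group" fills its local buffer with its own id. */
static void group_write(uint32_t base, uint32_t group_id)
{
    for (uint32_t i = 0; i < GROUP_WORDS; i++)
        slm[base + i] = group_id;
}

/* Count the words in a group's buffer that no longer hold its id. */
static uint32_t group_corrupted(uint32_t base, uint32_t group_id)
{
    uint32_t bad = 0;
    for (uint32_t i = 0; i < GROUP_WORDS; i++)
        if (slm[base + i] != group_id)
            bad++;
    return bad;
}

/* Run groups 1 and 2 one after the other; stride 0 models a shared base. */
static uint32_t run_two_groups(uint32_t stride)
{
    memset(slm, 0, sizeof slm);
    group_write(0u * stride, 1u);
    group_write(1u * stride, 2u);
    return group_corrupted(0u, 1u); /* damage seen by group 1 */
}
```

Here run_two_groups(0) reports all 16 words of group 1 as corrupted, while
run_two_groups(GROUP_WORDS) reports none.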

Apologies if I am talking nonsense here; my understanding
of these issues is quite limited :-)

-- 
Dag

