[Beignet] clEnqueueNDRangeKernel and kernel completion
Zhigang Gong
zhigang.gong at linux.intel.com
Wed Jun 5 19:24:03 PDT 2013
Hi Edward,
From: beignet-bounces+zhigang.gong=linux.intel.com at lists.freedesktop.org
[mailto:beignet-bounces+zhigang.gong=linux.intel.com at lists.freedesktop.org]
On Behalf Of Edward Ching
Sent: Thursday, June 06, 2013 8:03 AM
To: beignet at lists.freedesktop.org
Subject: [Beignet] clEnqueueNDRangeKernel and kernel completion
I hope this is the right forum to post comments/questions on Beignet OpenCL
API behaviour. If not, please ignore and excuse the disruption.
[Gong, Zhigang] You are right, this is the only official place to discuss
anything about the Beignet project. You are welcome to post related topics here.
I'm running Beignet on an Ivy Bridge machine and noticed that clFinish
returns before the GPU has completed processing of previously submitted
commands.
For example:
I submitted an OpenCL kernel via clEnqueueNDRangeKernel, followed by
clFinish, and expected the GPU to have finished all processing when clFinish
returns.
But clFinish returned right away, and when I then called clEnqueueMapBuffer
to access the data, that call blocked. I traced the logic all the way into
the Ivy Bridge GPU device driver (~/drivers/gpu/drm/i915/*). It looks like
every Ivy Bridge GPU batchbuffer used to submit an OpenCL kernel has an
associated sequence number, which the GPU writes to a special location. That
location has to be monitored in order to tell whether the GPU has finished
processing the submitted kernel (details are in the Intel HD Graphics PRM and
the i915/GEM design notes). Neither clFinish nor clEnqueueNDRangeKernel does
this monitoring; clEnqueueMapBuffer happened to do it when it tried to
transfer the buffer objects from the GPU domain to the CPU domain.
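A rough illustration of the sequence-number check described above, written as
a plain C simulation rather than driver code. The names hw_seqno,
fake_gpu_retire and batch_done are hypothetical and exist only for this
sketch; the wraparound-safe comparison follows the usual signed-difference
idiom and assumes the common two's-complement conversion.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the location the GPU writes completed sequence
 * numbers to. A real driver would read this from hardware-visible memory. */
static volatile uint32_t hw_seqno;

/* True if `current` has reached or passed `target`, even across 32-bit
 * wraparound. Relies on the usual two's-complement conversion to int32_t. */
static bool seqno_passed(uint32_t current, uint32_t target)
{
    return (int32_t)(current - target) >= 0;
}

/* Simulates the GPU finishing the batch buffer tagged with `seqno`. */
static void fake_gpu_retire(uint32_t seqno)
{
    /* Sequence numbers only move forward in this model. */
    assert(seqno_passed(seqno, hw_seqno));
    hw_seqno = seqno;
}

/* The monitoring step: has the batch tagged `target` been processed yet? */
static bool batch_done(uint32_t target)
{
    return seqno_passed(hw_seqno, target);
}
```

In this model, a caller that wants to wait for a kernel would remember the
sequence number assigned to its batch buffer and keep checking batch_done()
on it until it returns true.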
Does this make sense? It seems to me that clFinish should instead be
monitoring and blocking if the GPU is still busy executing an OpenCL kernel.
[Gong, Zhigang] Your finding is correct: the current clFinish does nothing,
which does not comply with the spec. It is on our TODO list. Actually, we
have more related TODO items. Currently, clEnqueueNDRangeKernel flushes the
batchbuffer every time, and thus clFlush is also an empty function. We also
need to optimize it to track the states and avoid some unnecessary pipe
controls for each kernel. But our team's current focus is to implement the
missing OpenCL features and to pass the piglit tests. After that, we will
turn to these items. And as usual, if anyone from the community wants to
contribute to these items, we will be more than happy to review and accept
the patches.
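To make the intended behaviour concrete, here is a minimal sketch of a queue
where every enqueue is submitted immediately (so a flush has nothing left to
do) and a finish call blocks until the last submitted batch has completed.
It is a plain C model, not Beignet's code: struct fake_gpu, struct
fake_queue and every function name below are hypothetical, and the GPU is
simulated by a callback that completes one batch per call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simulated GPU: remembers the last sequence number it has completed. */
struct fake_gpu {
    uint32_t completed_seqno;
};

/* Simulated command queue (hypothetical, for illustration only). */
struct fake_queue {
    struct fake_gpu *gpu;
    uint32_t next_seqno;      /* seqno to assign to the next batch */
    uint32_t last_submitted;  /* seqno of the most recent batch    */
};

/* Enqueue a kernel. The batch is submitted right away, mirroring the
 * "flush on every enqueue" behaviour, so there is never pending work
 * left for a separate flush. Returns the batch's sequence number. */
static uint32_t queue_enqueue_kernel(struct fake_queue *q)
{
    q->last_submitted = q->next_seqno++;
    return q->last_submitted;
}

/* True once the GPU has completed everything submitted so far. */
static bool queue_idle(const struct fake_queue *q)
{
    return (int32_t)(q->gpu->completed_seqno - q->last_submitted) >= 0;
}

/* Block until the queue is idle. `wait_step` stands in for whatever lets
 * the GPU make progress (sleeping on an interrupt, polling, and so on).
 * Returns how many wait steps were needed. */
static unsigned queue_finish(struct fake_queue *q,
                             void (*wait_step)(struct fake_gpu *))
{
    unsigned steps = 0;
    assert(wait_step);
    while (!queue_idle(q)) {
        wait_step(q->gpu);
        steps++;
    }
    return steps;
}

/* Simulated progress: the GPU completes one more batch per call. */
static void fake_gpu_step(struct fake_gpu *gpu)
{
    gpu->completed_seqno++;
}
```

With two kernels enqueued and none completed, queue_finish() in this model
waits two steps before returning; calling it again on an idle queue returns
immediately.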
/Ed