[Mesa-dev] EXTERNAL: Re: Clover clEnqueue* function don't implement blocking?

Francisco Jerez currojerez at riseup.net
Tue Apr 15 05:12:25 PDT 2014


"Dorrington, Albert" <albert.dorrington at lmco.com> writes:

>> -----Original Message-----
>> From: Francisco Jerez [mailto:currojerez at riseup.net]
>  > "Dorrington, Albert" <albert.dorrington at lmco.com> writes:
>> >
>> > From reading the OpenCL spec (and perhaps I'm misinterpreting
>> > something again), section 5.10 Flush and Finish says:
>> >
>> > 	Any blocking commands queued in a command-queue such as
>> > 	clEnqueueRead{Image|Buffer} with blocking_read set to CL_TRUE,
>> > 	clEnqueueWrite{Image|Buffer} with blocking_write set to CL_TRUE,
>> > 	clEnqueueMap{Buffer|Image} with blocking_map set to CL_TRUE or
>> > 	clWaitForEvents perform an implicit flush of the command-queue.
>> >
>> > From this statement, I would expect that the command-queue would
>> > be flushed when the blocking flag is set.
>> 
>> clEnqueueRead*, clEnqueueMap* and clWaitForEvents already flush the
>> command queue (the first two are flushing indirectly as we try to map a
>> buffer referenced by the GPU).  clEnqueueWrite* doesn't flush, but it's not
>> clear to me that not doing it can be considered a violation of the spec.  The
>> guarantees given by clFlush() are rather vague (to some extent an empty
>> function could be a valid implementation) and it seems to me that a
>> compliant implementation might, for instance, choose to batch up
>> commands across flushes if that's the most efficient thing to do, as long as
>> the user has no way to tell the difference.
>> 
>> I'd like to see some real-world example where clover's behavior represents a
>> problem before we change it to flush more frequently, because I'm worried
>> that changing this will actually worsen performance rather than improving it.
>
> I have been working with a modified version of Mesa code, which
> accepts kernels compiled with AMD's compiler.  (Our project's goal is
> to host Mesa in an environment which does not currently support
> LLVM/Clang or C++11)
>
> While testing 2D image read capabilities, I have been encountering an
> issue where the command queue's 'queued_events' continues to be
> populated, with none of the events being removed until the clFinish
> call. At that point, I have 23,328 events in the queue and encounter a
> segmentation fault during the command_queue flush.
>
It would be interesting to find out exactly what's causing the segfault;
the more frequent flushes might just be hiding a problem of a different
nature.  Also, is it the expected behavior of your test to queue so many
events before trying to read back the results or doing some other sort
of blocking operation?  Is its source code public?

> After seeing the statement in the OpenCL spec about the implicit flush
> during the clEnqueue calls, I added the previously mentioned
> conditional hev().wait() calls to initiate a flush.  This seems to
> have resolved the issue with the segFaults during the clFinish call;
> although I'll admit it likely isn't the most efficient method.
>
If your problem turns out to be that you're running out of memory, I
think it would be preferable to add conditional calls to
command_queue::flush() from the blocking clEnqueue* commands instead.
Or even better, have command_queue::sequence() flush implicitly once
the number of queued events exceeds a certain threshold.
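
To make the two options concrete, here is a rough, self-contained sketch
of the idea.  It is a toy model and not Clover's actual code: the names
toy_event and toy_command_queue, the enqueue() helper and the threshold
parameter are all made up for illustration, and "submitting" an event
here just runs a callback instead of handing work to the driver.

```cpp
// Toy model only: none of these names are Clover's real classes.
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <utility>

// Hypothetical stand-in for a queued command.
struct toy_event {
    std::function<void()> action;
};

class toy_command_queue {
public:
    explicit toy_command_queue(std::size_t flush_threshold)
        : threshold_(flush_threshold) {}

    // Analogue of command_queue::sequence(): record the event, then
    // flush implicitly once the backlog grows past the threshold.
    void sequence(toy_event ev) {
        queued_.push_back(std::move(ev));
        if (queued_.size() > threshold_)
            flush();
    }

    // Analogue of a clEnqueue* entry point: queue the command and
    // flush explicitly only when the caller asked for blocking.
    void enqueue(toy_event ev, bool blocking) {
        sequence(std::move(ev));
        if (blocking)
            flush();
    }

    // Submit everything queued so far, oldest first.
    void flush() {
        while (!queued_.empty()) {
            toy_event ev = std::move(queued_.front());
            queued_.pop_front();
            if (ev.action)
                ev.action();
            ++submitted_;
        }
        assert(queued_.empty());  // nothing is left pending after a flush
    }

    std::size_t pending() const { return queued_.size(); }
    std::size_t submitted() const { return submitted_; }

private:
    std::size_t threshold_;           // assumed tunable, not a Clover setting
    std::deque<toy_event> queued_;
    std::size_t submitted_ = 0;
};
```

With a threshold of, say, 4, the fifth call to sequence() drains the
backlog on its own, so the list of queued events stays bounded even if
the application never issues a blocking call before clFinish.  What a
sensible threshold would be for a real driver is an open question and
would need benchmarking.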

> While I have not benchmarked the runtimes precisely, the run-time did
> not seem to be significantly impacted. The test ran for ~20 minutes
> before crashing, and now runs for ~20 minutes before completing
> successfully.

We should probably double-check that it doesn't negatively impact the
run-time of other well-behaved applications.