[Mesa-dev] Leaked hardware event if kernel launch fails?
Pierre Moreau
pierre.morrow at free.fr
Sun Dec 25 20:27:58 UTC 2016
Hello,
I noticed that enqueueing a kernel whose clover module has no
`module::section::text_executable` section attached results in a
`std::out_of_range` exception instead of the expected
CL_INVALID_PROGRAM_EXECUTABLE (see [0]; I enqueued the kernel using
`clEnqueueNDRangeKernel()`). I modified the `kernel::exec_context::bind()` method
to catch the out-of-range exceptions and throw the proper clover exception
instead:
```
diff --git a/src/gallium/state_trackers/clover/core/kernel.cpp b/src/gallium/state_trackers/clover/core/kernel.cpp
index 328323b6b0..1bb4f612cb 100644
--- a/src/gallium/state_trackers/clover/core/kernel.cpp
+++ b/src/gallium/state_trackers/clover/core/kernel.cpp
@@ -161,8 +161,18 @@ kernel::exec_context::bind(intrusive_ptr<command_queue> _q,
    // Bind kernel arguments.
    auto &m = kern.program().build(q->device()).binary;
-   auto margs = find(name_equals(kern.name()), m.syms).args;
-   auto msec = find(type_equals(module::section::text_executable), m.secs);
+   std::vector<module::argument> margs;
+   try {
+      margs = find(name_equals(kern.name()), m.syms).args;
+   } catch (const std::out_of_range &e) {
+      throw error(CL_INVALID_KERNEL);
+   }
+   module::section msec;
+   try {
+      msec = find(type_equals(module::section::text_executable), m.secs);
+   } catch (const std::out_of_range &e) {
+      throw error(CL_INVALID_PROGRAM_EXECUTABLE);
+   }
    auto explicit_arg = kern._args.begin();
    for (auto &marg : margs) {
```
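The translation done in the patch above can be illustrated outside of clover. The following is only a minimal, self-contained sketch: `api_error`, `find_or_throw`, `find_or_error`, `section` and `lookup_status` are hypothetical stand-ins, not clover's real types or helpers, and the sketch assumes that the lookup reports a miss by throwing `std::out_of_range`, as the behaviour described above suggests. The two constants mirror the numeric values of the OpenCL error codes.

```cpp
#include <algorithm>
#include <stdexcept>
#include <string>
#include <vector>

// Numeric values of the OpenCL error codes, redefined here so the sketch
// compiles without the CL headers.
constexpr int kInvalidProgramExecutable = -45; // CL_INVALID_PROGRAM_EXECUTABLE
constexpr int kInvalidKernel = -48;            // CL_INVALID_KERNEL

// Hypothetical API-level exception carrying an OpenCL error code.
struct api_error : std::runtime_error {
   explicit api_error(int code) : std::runtime_error("api error"), code(code) {}
   int code;
};

// Hypothetical lookup that throws std::out_of_range when nothing matches.
template<typename Pred, typename T>
const T &find_or_throw(Pred pred, const std::vector<T> &v) {
   auto it = std::find_if(v.begin(), v.end(), pred);
   if (it == v.end())
      throw std::out_of_range("no matching element");
   return *it;
}

// Same lookup, but a miss is reported as the given API error code.
template<typename Pred, typename T>
const T &find_or_error(Pred pred, const std::vector<T> &v, int code) {
   try {
      return find_or_throw(pred, v);
   } catch (const std::out_of_range &) {
      throw api_error(code);
   }
}

// Hypothetical, heavily simplified module section.
struct section {
   std::string type;
};

// Returns 0 if a section of the requested type exists, otherwise the
// error code that the translated exception carries.
int lookup_status(const std::vector<section> &secs, const std::string &type) {
   try {
      find_or_error([&](const section &s) { return s.type == type; },
                    secs, kInvalidProgramExecutable);
      return 0;
   } catch (const api_error &e) {
      return e.code;
   }
}
```

With a module that only contains a `"text_intermediate"` section, `lookup_status(secs, "text_executable")` returns the code for CL_INVALID_PROGRAM_EXECUTABLE rather than letting `std::out_of_range` escape to the caller.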
But now, when my OpenCL program exits after the error, the destruction of the
`cl::CommandQueue` object doesn’t happen in a peaceful manner:
```
Program received signal SIGSEGV, Segmentation fault.
0x0000000000652e40 in ?? ()
(gdb) bt
#0 0x0000000000652e40 in ?? ()
#1 0x00007ffff7b2d29a in clover::command_queue::flush (this=this@entry=0x653310) at ../../../../../mesa_spirv/src/gallium/state_trackers/clover/core/queue.cpp:77
#2 0x00007ffff7b0ebc0 in clReleaseCommandQueue (d_q=0x653318) at ../../../../../mesa_spirv/src/gallium/state_trackers/clover/api/queue.cpp:63
#3 0x0000000000405125 in cl::detail::ReferenceHandler<_cl_command_queue*>::release (queue=0x653318) at /usr/include/CL/cl.hpp:1686
#4 0x0000000000405108 in cl::detail::Wrapper<_cl_command_queue*>::release (this=0x7fffffffda98) at /usr/include/CL/cl.hpp:1863
#5 0x00000000004050c7 in cl::detail::Wrapper<_cl_command_queue*>::~Wrapper (this=0x7fffffffda98) at /usr/include/CL/cl.hpp:1802
#6 0x00000000004047e5 in cl::CommandQueue::~CommandQueue (this=0x7fffffffda98) at /usr/include/CL/cl.hpp:5482
#7 0x00000000004046f8 in main () at instruction-set_OpenCL-std.cpp:58
(gdb) up
#1 0x00007ffff7b2d29a in clover::command_queue::flush (this=this@entry=0x653310) at ../../../../../mesa_spirv/src/gallium/state_trackers/clover/core/queue.cpp:77
77 queued_events.front()().fence(fence);
```
(I am using the OpenCL-C++ binding, and Mesa is quite recent (88b5acfa09) with
custom patches to get some OpenCL support for Nouveau.)
Looking around a bit with the debugger, it seems like the event created by the
`clEnqueueNDRangeKernel()` function still exists within the command queue:
shouldn’t it have been automatically removed from the queue as the enqueue
function failed?
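The suspected failure mode can be sketched in isolation. This is a guess at the pattern, not a description of clover's actual code: `queue`, `naive_event`, `guarded_event` and `pending_after_failed_enqueue` are hypothetical names, and the sketch simply assumes that an event registers itself with the queue before the action that can fail is run.

```cpp
#include <cstddef>
#include <deque>
#include <functional>
#include <stdexcept>

// Hypothetical queue that remembers every event registered with it.
struct queue {
   std::deque<const void *> pending;
};

// Registers itself first, then runs the action. If the action throws,
// the constructor is abandoned but the queue entry is left behind and
// now points at an object that no longer exists.
struct naive_event {
   naive_event(queue &q, const std::function<void()> &action) {
      q.pending.push_back(this);
      action();
   }
};

// Same, but undoes the registration before propagating the exception.
struct guarded_event {
   guarded_event(queue &q, const std::function<void()> &action) {
      q.pending.push_back(this);
      try {
         action();
      } catch (...) {
         q.pending.pop_back();
         throw;
      }
   }
};

// Simulates an enqueue whose launch fails and reports how many entries
// are still sitting in the queue afterwards.
template<typename Event>
std::size_t pending_after_failed_enqueue() {
   queue q;
   try {
      Event ev(q, [] { throw std::runtime_error("launch failed"); });
   } catch (const std::runtime_error &) {
      // The caller would translate this into an OpenCL error code.
   }
   return q.pending.size();
}
```

In this sketch, `pending_after_failed_enqueue<naive_event>()` leaves one stale entry in the queue, while `pending_after_failed_enqueue<guarded_event>()` leaves none. A later flush that dereferences such a stale entry would be consistent with the crash in the backtrace, but whether clover actually behaves like the naive variant is exactly the open question.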
Thank you for your help!
Pierre
[0]: https://www.khronos.org/registry/cl/sdk/1.0/docs/man/xhtml/clEnqueueNDRangeKernel.html