python-dbus with threads fails silently

Brian Beggs bbeggs at enernoc.com
Fri Feb 4 11:21:25 PST 2011


I am working on a program that uses a plugin architecture.  Each plugin sends heartbeats over D-Bus to notify a supervisor thread that it is still running correctly.  Should a plugin stop functioning properly, the supervisor attempts to stop the plugin and restart it.
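
To make the heartbeat arrangement concrete, here is a minimal sketch of the bookkeeping a supervisor might do.  It is not our production code: it uses only the Python 3 standard library, it records heartbeats in an in-process dictionary rather than receiving them as D-Bus signals, and the names HeartbeatMonitor and supervise_once are hypothetical.

```python
import threading
import time


class HeartbeatMonitor(object):
    """Tracks the last heartbeat time per plugin (hypothetical helper)."""

    def __init__(self, timeout, clock=time.monotonic):
        self._timeout = timeout      # seconds of silence before a plugin is stale
        self._clock = clock          # injectable so the logic can be tested
        self._last_seen = {}
        self._lock = threading.Lock()

    def beat(self, plugin_name):
        # Called whenever a heartbeat arrives from a plugin.
        with self._lock:
            self._last_seen[plugin_name] = self._clock()

    def stale_plugins(self):
        # Names of plugins whose last heartbeat is older than the timeout.
        now = self._clock()
        with self._lock:
            return sorted(name for name, seen in self._last_seen.items()
                          if now - seen > self._timeout)

    def forget(self, plugin_name):
        # Drop a plugin so it is not reported again until it beats anew.
        with self._lock:
            self._last_seen.pop(plugin_name, None)


def supervise_once(monitor, restart):
    """Restart every stale plugin once; returns the names restarted."""
    stale = monitor.stale_plugins()
    for name in stale:
        monitor.forget(name)
        restart(name)    # assumption: caller supplies the stop/restart logic
    return stale
```

In the real program, beat() would be driven by the heartbeat signal handler and supervise_once() would be called periodically by the supervisor thread.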

This is where we run into problems.  Typically, once the plugin restarts, everything operates fine for a short while; then the entire process suddenly exits without any indication of why.  I've read back through some older threads on the mailing list indicating that others have seen a race condition in the python-dbus code.  I have also been able to reproduce the crash and catch the SIGSEGV a few times on my development machine.  Below I have included partial crash reports for the SIGSEGVs I caught on my MacBook, as well as a test program that reproduces the issue.  I would appreciate any input on whether there is something we could do differently on our end to work around this, whether this is a defect in the D-Bus libraries, or whether we are using D-Bus in a way it was not intended to be used.
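
One kind of workaround we could try on our end is to stop touching the bus from several threads and instead funnel every call through a single owner thread.  The sketch below shows only that general pattern with the Python 3 standard library; SingleThreadDispatcher is a hypothetical name, and I have not confirmed that this avoids the crash.  In a dbus-python program the owner would presumably be the thread running the GLib main loop (for example by scheduling work with gobject.idle_add), which is an assumption on my part.

```python
import queue
import threading


class SingleThreadDispatcher(object):
    """Runs submitted callables one at a time on a single owner thread."""

    def __init__(self):
        self._tasks = queue.Queue()
        self._thread = threading.Thread(target=self._run, name='bus-owner')
        self._thread.daemon = True
        self._thread.start()

    def _run(self):
        while True:
            task = self._tasks.get()
            if task is None:          # sentinel: stop the owner thread
                break
            func, args, done, result = task
            try:
                result['value'] = func(*args)
            except Exception as exc:  # hand the error back to the caller
                result['error'] = exc
            finally:
                done.set()

    def call(self, func, *args):
        """Block until func(*args) has run on the owner thread."""
        done = threading.Event()
        result = {}
        self._tasks.put((func, args, done, result))
        done.wait()
        if 'error' in result:
            raise result['error']
        return result['value']

    def stop(self):
        self._tasks.put(None)
        self._thread.join()
```

Worker and heartbeat threads would then use dispatcher.call(some_function, argument) instead of invoking the bus directly.  Note that call() must not be invoked from the owner thread itself, since it would wait on its own queue forever.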

Thanks,

Brian

Crash report 1:
Process:         Python [168]
Path:            /System/Library/Frameworks/Python.framework/Versions/2.6/Resources/Python.app/Contents/MacOS/Python
Identifier:      Python
Version:         ??? (???)
Code Type:       X86-64 (Native)
Parent Process:  eclipse [41948]

PlugIn Path:       /opt/local/lib/libdbus-1.3.dylib
PlugIn Identifier: libdbus-1.3.dylib
PlugIn Version:    8.0.0 (compatibility 8.0.0)

Date/Time:       2011-02-03 16:47:20.840 -0500
OS Version:      Mac OS X 10.6.6 (10J567)
Report Version:  6

Interval Since Last Report:          201882 sec
Crashes Since Last Report:           5
Per-App Crashes Since Last Report:   5
Anonymous UUID:                      EDA89B4A-5C64-4BB6-98E8-7AFF2420992D

Exception Type:  EXC_BAD_ACCESS (SIGSEGV)
Exception Codes: 0x000000000000000d, 0x0000000000000000
Crashed Thread:  0  Dispatch queue: com.apple.main-thread

Thread 0 Crashed:  Dispatch queue: com.apple.main-thread
0   libdbus-1.3.dylib             	0x00000001016d133b _dbus_pthread_mutex_lock + 27
1   libdbus-1.3.dylib             	0x00000001016b1bd5 _dbus_connection_lock + 21
2   libdbus-1.3.dylib             	0x00000001016c3965 _dbus_pending_call_get_connection_and_lock + 21
3   libdbus-1.3.dylib             	0x00000001016b61af reply_handler_timeout + 15
4   libdbus-glib-1.2.dylib        	0x00000001016fa38d timeout_handler_dispatch + 13
5   libglib-2.0.0.dylib           	0x000000010202668d g_timeout_dispatch + 29
6   libglib-2.0.0.dylib           	0x0000000102025f19 g_main_context_dispatch + 553
7   libglib-2.0.0.dylib           	0x00000001020294f1 g_main_context_iterate + 961
8   libglib-2.0.0.dylib           	0x0000000102029805 g_main_loop_run + 485
9   _glib.so                      	0x0000000101772752 _wrap_g_main_loop_run + 114
10  org.python.python             	0x0000000100088f3f PyEval_EvalFrameEx + 14733
11  org.python.python             	0x000000010008acce PyEval_EvalCodeEx + 1803
12  org.python.python             	0x000000010008935e PyEval_EvalFrameEx + 15788
13  org.python.python             	0x000000010008acce PyEval_EvalCodeEx + 1803
14  org.python.python             	0x000000010008ad61 PyEval_EvalCode + 54
15  org.python.python             	0x00000001000a265a Py_CompileString + 78
16  org.python.python             	0x00000001000a2723 PyRun_FileExFlags + 150
17  org.python.python             	0x0000000100083196 _PyBuiltin_Init + 12163
18  org.python.python             	0x0000000100089187 PyEval_EvalFrameEx + 15317
19  org.python.python             	0x000000010008acce PyEval_EvalCodeEx + 1803
20  org.python.python             	0x000000010008935e PyEval_EvalFrameEx + 15788
21  org.python.python             	0x000000010008acce PyEval_EvalCodeEx + 1803
22  org.python.python             	0x000000010008ad61 PyEval_EvalCode + 54
23  org.python.python             	0x00000001000a265a Py_CompileString + 78
24  org.python.python             	0x00000001000a2723 PyRun_FileExFlags + 150
25  org.python.python             	0x00000001000a423d PyRun_SimpleFileExFlags + 704
26  org.python.python             	0x00000001000b0286 Py_Main + 2718
27  org.python.python.app         	0x0000000100000e6c start + 52

Crash report 2:
Process:         Python [99496]
Path:            /System/Library/Frameworks/Python.framework/Versions/2.6/Resources/Python.app/Contents/MacOS/Python
Identifier:      Python
Version:         ??? (???)
Code Type:       X86-64 (Native)
Parent Process:  eclipse [41948]

PlugIn Path:       /opt/local/lib/libdbus-1.3.dylib
PlugIn Identifier: libdbus-1.3.dylib
PlugIn Version:    8.0.0 (compatibility 8.0.0)

Date/Time:       2011-02-03 15:52:27.787 -0500
OS Version:      Mac OS X 10.6.6 (10J567)
Report Version:  6

Interval Since Last Report:          198589 sec
Crashes Since Last Report:           4
Per-App Crashes Since Last Report:   4
Anonymous UUID:                      EDA89B4A-5C64-4BB6-98E8-7AFF2420992D

Exception Type:  EXC_BAD_ACCESS (SIGSEGV)
Exception Codes: 0x000000000000000d, 0x0000000000000000
Crashed Thread:  0  Dispatch queue: com.apple.main-thread

Thread 0 Crashed:  Dispatch queue: com.apple.main-thread
0   libdbus-1.3.dylib             	0x00000001015a308e link_before + 14
1   libdbus-1.3.dylib             	0x00000001015a3131 _dbus_list_append_link + 17
2   libdbus-1.3.dylib             	0x000000010158b035 _dbus_connection_queue_synthesized_message_link + 21
3   libdbus-1.3.dylib             	0x000000010159a130 _dbus_pending_call_queue_timeout_error_unlocked + 32
4   libdbus-1.3.dylib             	0x000000010158c1bd reply_handler_timeout + 29
5   libdbus-glib-1.2.dylib        	0x000000010170138d timeout_handler_dispatch + 13
6   libglib-2.0.0.dylib           	0x000000010202668d g_timeout_dispatch + 29
7   libglib-2.0.0.dylib           	0x0000000102025f19 g_main_context_dispatch + 553
8   libglib-2.0.0.dylib           	0x00000001020294f1 g_main_context_iterate + 961
9   libglib-2.0.0.dylib           	0x0000000102029805 g_main_loop_run + 485
10  _glib.so                      	0x0000000101779752 _wrap_g_main_loop_run + 114
11  org.python.python             	0x0000000100088f3f PyEval_EvalFrameEx + 14733
12  org.python.python             	0x000000010008acce PyEval_EvalCodeEx + 1803
13  org.python.python             	0x000000010008935e PyEval_EvalFrameEx + 15788
14  org.python.python             	0x000000010008acce PyEval_EvalCodeEx + 1803
15  org.python.python             	0x000000010008ad61 PyEval_EvalCode + 54
16  org.python.python             	0x00000001000a265a Py_CompileString + 78
17  org.python.python             	0x00000001000a2723 PyRun_FileExFlags + 150
18  org.python.python             	0x0000000100083196 _PyBuiltin_Init + 12163
19  org.python.python             	0x0000000100089187 PyEval_EvalFrameEx + 15317
20  org.python.python             	0x000000010008acce PyEval_EvalCodeEx + 1803
21  org.python.python             	0x000000010008935e PyEval_EvalFrameEx + 15788
22  org.python.python             	0x000000010008acce PyEval_EvalCodeEx + 1803
23  org.python.python             	0x000000010008ad61 PyEval_EvalCode + 54
24  org.python.python             	0x00000001000a265a Py_CompileString + 78
25  org.python.python             	0x00000001000a2723 PyRun_FileExFlags + 150
26  org.python.python             	0x00000001000a423d PyRun_SimpleFileExFlags + 704
27  org.python.python             	0x00000001000b0286 Py_Main + 2718
28  org.python.python.app         	0x0000000100000e6c start + 52


Test Program:

import gobject
import dbus
import dbus.service
import dbus.mainloop.glib
import signal
import time
import threading
import sys, traceback

class WorkerA(dbus.service.Object):
    def __init__(self, bus):
        print 'WorkerA starting...'
        self._bus = bus
        self.service_path='/test/worker/a'
        self.service_name='test.worker.a'
        self.name = dbus.service.BusName(self.service_name, self._bus)
        dbus.service.Object.__init__(self, self._bus, self.service_path)
        self.echo('started %s' %self.service_name)
        proxy = self._bus.get_object( 'test.worker.b', '/test/worker/b',
                    follow_name_owner_changes=True )
        self._workerB = dbus.Interface( proxy, 
                dbus_interface = 'test.worker.b' )
        self._exit = threading.Event()
        self._exit.clear()
        self.tr = threading.Thread(target=self.run, name='caller thread')
        self.tr.daemon = True
        self.tr.start()
        
        self.heartbeat_tr = threading.Thread(target=self.heartbeat_thread, name='%s_heartbeat_thread' %self.__class__.__name__)
        self.heartbeat_tr.daemon = True
        self.heartbeat_tr.start()
        
    def run(self):
        time.sleep(5)
        while not self._exit.isSet():
            try:
                response = self._workerB.remoteMethod('input string', 1)
                print response
            except:
                traceback.print_exc()
            self._exit.wait(2)
            
    def heartbeat_thread(self):
        while not self._exit.isSet():
            self.echo('heartbeat from %s' %self.__class__.__name__)
            self._exit.wait(5)
    
    def exit(self):
        self._exit.set()
    
    @dbus.service.signal(dbus_interface='test.worker.a', signature='s')
    def echo(self, name):
        pass
    
    @dbus.service.method(dbus_interface='test.worker.a', in_signature='si', out_signature='s')
    def remoteMethod(self, someString, waitTime):
        print 'Got remote method call with string: %s and wait time: %i' %(someString, waitTime)
        #time.sleep(waitTime)
        return self.service_name + ' remote method return string'
  
class WorkerB(dbus.service.Object):
    def __init__(self, bus):
        print 'WorkerB starting...'
        self._bus = bus
        self.service_path='/test/worker/b'
        self.service_name='test.worker.b'
        self.name = dbus.service.BusName(self.service_name, self._bus)
        dbus.service.Object.__init__(self, self._bus, self.service_path)
        self.echo('started %s' %self.service_name)
        self._exit = threading.Event()
        self._exit.clear()
        
        self.heartbeat_tr = threading.Thread(target=self.heartbeat_thread, name='%s_heartbeat_thread' %self.__class__.__name__)
        self.heartbeat_tr.daemon = True
        self.heartbeat_tr.start()
    
    def heartbeat_thread(self):
        while not self._exit.isSet():
            self.echo('heartbeat from %s' %self.__class__.__name__)
            self._exit.wait(5)
    
    def exit(self):
        self._exit.set()
        
    @dbus.service.signal(dbus_interface='test.worker.b', signature='s')
    def echo(self, name):
        pass
    
    @dbus.service.method(dbus_interface='test.worker.b', in_signature='si', out_signature='s')
    def remoteMethod(self, someString, waitTime):
        print 'Got remote method call with string: %s and wait time: %i' %(someString, waitTime)
        #time.sleep(waitTime)
        return self.service_name + ' remote method return string\nInput: %s, wait: %i' %(someString, waitTime)
        
def echoMessage(msg):
    print msg
    

class Supervisor(threading.Thread):
    
    def __init__(self):
        threading.Thread.__init__(self)
        self._exit = threading.Event()
        self._exit.clear()
    
    def run(self):
        self.b = WorkerB(dbus.SessionBus(private=True))
        self.a = WorkerA(dbus.SessionBus(private=True))
        self._exit.wait(30)
        while not self._exit.isSet():
            self.b.exit()
            self.b.remove_from_connection()
            self.a.exit()
            self.a.remove_from_connection()
            print 'workers stopped'
            time.sleep(2)
            print 'starting workers'
            self.b = WorkerB(dbus.SessionBus(private=True))
            self.a = WorkerA(dbus.SessionBus(private=True))
            print 'workers started'
            self._exit.wait(30)
            
    def exit(self):
        self._exit.set()
        self.b.exit()
        self.a.exit()

if __name__ == '__main__':
    print 'starting main...'
    gobject.threads_init()  #required for the service to execute its own thread!
    dbus.mainloop.glib.threads_init()
    dbus.mainloop.glib.DBusGMainLoop(set_as_default=True)
    
    bus = dbus.SessionBus(private=True)
    
    # Create the loop up front so the signal handler below can refer to it;
    # the original handler called main_loop.quit() on an undefined name.
    main_loop = gobject.MainLoop()
    
    def killSigHandler(signum, frame):
        main_loop.quit()
        visor.exit()

    signal.signal(signal.SIGINT, killSigHandler)
    signal.signal(signal.SIGHUP, killSigHandler)
    signal.signal(signal.SIGTERM, killSigHandler)
    
    bus.add_signal_receiver(echoMessage, signal_name='echo')
    
    try:
        visor = Supervisor()
        visor.start()
        main_loop.run()
    except  (SystemExit, KeyboardInterrupt):
        killSigHandler(-1, None)
        
    print 'exiting...'


