2 race conditions in dbus-glib in a multi-threaded environment.

Wu, Jackie jackie.wu at intel.com
Sun Jun 28 01:23:55 PDT 2009


Hi, Everyone:

I recently experimented with dbus-glib in a multi-threaded environment and found the following 2 race conditions in dbus-glib. I didn't find them in the dbus mailing list archives, so I'm posting them here for your review and hope to get some suggestions on how to solve them.
I'm new to dbus, glib and dbus-glib programming, so please correct me if I've misunderstood anything. Thanks!

1. See the code in dbus-gproxy.c:dbus_g_proxy_end_call_internal:
  pending = g_hash_table_lookup (priv->pending_calls, GUINT_TO_POINTER (call_id));
  dbus_pending_call_block (pending);
The pending returned here can be NULL, in which case dbus_pending_call_block will trigger an assertion and abort the program.

The race condition arises in the following scenario (a rough reproducer sketch follows the list):
a. Thread A creates a proxy through dbus_g_proxy_new_for_name. Internally, dbus-glib creates a pending "GetNameOwner" call for it.
b. Thread A then completes a call through dbus_g_proxy_call. When it unrefs the proxy, dbus_g_proxy_cancel_call is invoked to remove the "GetNameOwner" pending call.
c. But before Thread A actually cancels the call, Thread B, which runs a mainloop, receives the reply to the "GetNameOwner" call and invokes got_name_owner_cb. got_name_owner_cb calls dbus_g_proxy_end_call_internal, the pending call cannot be found, and the code above hits the assert (pending != NULL) inside dbus_pending_call_block.
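For illustration, here is a rough reproducer sketch. It is untested and the race is timing-dependent, so it may need many iterations (or some luck) to trigger; it simply churns proxies from one thread while the main thread dispatches replies:

  /* Untested sketch: churn proxies from a second thread while the
   * main thread runs the mainloop that dispatches the GetNameOwner
   * replies. Build against dbus-glib with GThread support. */
  #include <dbus/dbus-glib.h>
  #include <glib.h>

  static gpointer
  churn_proxies (gpointer data)
  {
    DBusGConnection *bus = data;
    int i;

    for (i = 0; i < 100000; i++)
      {
        DBusGProxy *proxy;

        proxy = dbus_g_proxy_new_for_name (bus,
                                           "org.freedesktop.DBus",
                                           "/org/freedesktop/DBus",
                                           "org.freedesktop.DBus");
        /* Unref immediately, so the internal "GetNameOwner" pending
         * call is cancelled while its reply may be in flight. */
        g_object_unref (proxy);
      }
    return NULL;
  }

  int
  main (void)
  {
    DBusGConnection *bus;
    GMainLoop *loop;
    GError *error = NULL;

    g_thread_init (NULL);
    dbus_g_thread_init ();
    g_type_init ();

    bus = dbus_g_bus_get (DBUS_BUS_SESSION, &error);
    if (bus == NULL)
      g_error ("couldn't get the session bus: %s", error->message);

    /* Thread A: create and destroy proxies in a tight loop. */
    g_thread_create (churn_proxies, bus, FALSE, NULL);

    /* Thread B (this thread): run the mainloop that dispatches the
     * replies, including the one handled by got_name_owner_cb. */
    loop = g_main_loop_new (NULL, FALSE);
    g_main_loop_run (loop);
    return 0;
  }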

As a hack, I made the function return FALSE when the pending returned is NULL, which bypasses the crash, but I don't know what negative impacts that might have.
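Concretely, the hack looks roughly like this (a minimal sketch; proper error reporting through the GError is omitted):

  /* In dbus_g_proxy_end_call_internal: bail out instead of letting
   * dbus_pending_call_block() hit its assertion when another thread
   * has already cancelled the call. */
  pending = g_hash_table_lookup (priv->pending_calls, GUINT_TO_POINTER (call_id));
  if (pending == NULL)
    return FALSE;
  dbus_pending_call_block (pending);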

2. See the code in dbus-gproxy.c:dbus_g_proxy_manager_get:
  manager = dbus_connection_get_data (connection, g_proxy_manager_slot);
  if (manager != NULL)
    {
      dbus_connection_free_data_slot (&g_proxy_manager_slot);
      dbus_g_proxy_manager_ref (manager);
      g_static_mutex_unlock (&connection_g_proxy_lock);
      return manager;
    }
The manager returned from dbus_connection_get_data does not have its manager->refcount increased until dbus_g_proxy_manager_ref is called. Between these two calls, another thread can decrease manager->refcount to 0 and free the manager, leaving the returned manager pointing to freed memory.

The race condition arises in the following scenario:
a. Thread A wants to create a proxy, so it calls dbus_g_proxy_manager_register to register the proxy with the proxy manager. My understanding is that there is only one proxy manager instance per connection, shared by all threads in the same process. Thread A calls dbus_g_proxy_manager_ref (from dbus_g_proxy_manager_get) to increase the reference count on the manager.
b. dbus_g_proxy_manager_ref waits for manager->lock. But manager->lock is held by Thread B, and manager->refcount is currently 1, so the reference count will not be increased until Thread B releases the lock.
c. Thread B has just finished all of its calls and calls dbus_g_proxy_manager_dispose, which calls dbus_g_proxy_manager_unref. manager->refcount drops to 0 and dbus_g_proxy_manager_unref frees the manager.
d. After Thread B releases manager->lock, Thread A is scheduled again, but the manager pointer it holds is now dangling.

I have not found a good way to solve this problem; personally, I think it probably requires some fairly big changes. One possible direction is sketched below.
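Just to make the discussion concrete, one direction that occurred to me (an untested sketch, grossly simplified: the real unref also has to deal with manager->lock, match rules and hash-table teardown) is to take the same connection_g_proxy_lock around the final unref, so a manager whose refcount has reached zero can never be handed out by dbus_g_proxy_manager_get:

  /* Untested sketch: serialize the final unref against
   * dbus_g_proxy_manager_get() using connection_g_proxy_lock.
   * Assumes every refcount change is funneled through this lock. */
  static void
  dbus_g_proxy_manager_unref (DBusGProxyManager *manager)
  {
    g_static_mutex_lock (&connection_g_proxy_lock);
    manager->refcount -= 1;
    if (manager->refcount > 0)
      {
        g_static_mutex_unlock (&connection_g_proxy_lock);
        return;
      }
    /* Still under the lock: detach the manager from the connection
     * so dbus_g_proxy_manager_get() can no longer look it up. */
    dbus_connection_set_data (manager->connection,
                              g_proxy_manager_slot,
                              NULL, NULL);
    g_static_mutex_unlock (&connection_g_proxy_lock);
    /* ... existing teardown and freeing of the manager ... */
  }

Whether this interacts sanely with the existing manager->lock ordering is exactly the kind of thing I'd like feedback on.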

Thanks
Jackie (Weihua) Wu
Intel Opensource Technology Center
Intel China Research Center
(inet)8-758-1963
(o) 86-10-82171963

