Threading problems in (Win)DBus
Peter Kümmel
syntheticpp at gmx.net
Sun Jul 8 14:10:18 PDT 2007
Havoc Pennington wrote:
> Hi,
>
> From the traces, here two different threads take the lock at the same
> time (apparently?):
>
> 10357: LOCK: _dbus_connection_acquire_dispatch [3082640272]
> 10357: 10357: lock socket_do_iteration post poll
> 10357: LOCK: _dbus_connection_lock [3082643152]
>
> And then here also:
>
> 11703: LOCK: dbus_connection_send_with_reply [3082446544]
> 11703: Allocated slot 0 on allocator 0xb7eb583c total 1 slots allocated
> 1 used
> 11703: LOCK: dbus_connection_get_is_connected [3082443664]
>
> So that should not be possible, right? Unless the implementation of
> locking is messed up somehow (which it may well be) or the locks are
> no-ops since threads are for some reason not enabled.
>
> Perhaps a few unit tests in dbus-sysdeps-pthread.c to be sure the
> mutexes work properly would not be a bad idea to add, they'd confirm or
> unconfirm this theory and prevent similar bugs in the future.
>
> Some background info below on how the locks are supposed to work, to aid
> in debugging -
>
> Christian Grigis wrote:
>> Another problem that I noticed: for the pthread implementation of
>> threads in dbus-sysdeps-pthread.c, the _dbus_pthread_mutex_lock()
>> function appears to simulate a reentrant lock by using the pmutex->count
>> counter. Wouldn't the CONNECTION_LOCK macro then fail in case a lock is
>> acquired twice by the same thread (in TOOK_LOCK_CHECK)?
>>
>> Are the two problems related?
>>
>
> My memory is that while all the locks support recursion, only the
> dispatch_mutex is in fact supposed to be used recursively. That's why we
> left have_connection_lock as a boolean rather than a counter.
>
> The code that uses connection->mutex is instead supposed to avoid
> holding that lock when it calls back out to user code. The reasons for
> this are:
> - historical, the lock didn't used to be recursive
> - back compat, nonrecursive locks can still be used with
> the "set thread functions" api and this should only
> mean that recursive dispatch is disallowed (you can't
> connection_dispatch() from inside a message handler)
> - sanity, recursion in a single thread can cause problems
> as well
>
> Anyway, so if we saw the same thread triggering this assertion the
> correct fix is probably to avoid recursing on connection->mutex which
> could involved rearranging code a good bit.
>
> But if I understood your verbose logs correctly, we have two different
> threads both owning the lock in this case.
>
> If it's two different threads getting the same lock, the solution
> probably is not very complicated; it's just a matter of figuring out why
> locks aren't enabled or why they don't work. Just stopping in a debugger
> on the assertion and examining the fields in the lock struct for example
> might reveal what's wrong.
>
> Havoc
Seems no mutexes are used which is possible:
void
_dbus_mutex_lock (DBusMutex *mutex)
{
if (mutex)
{
if (thread_functions.recursive_mutex_lock)
(* thread_functions.recursive_mutex_lock) (mutex);
else if (thread_functions.mutex_lock)
(* thread_functions.mutex_lock) (mutex);
}
}
Does the test code also work with attached patch which checks
if the mutex is NULL.
After applying the patch also echo-server needs to initialize
the threading, DBus::_init_threading();
--
Peter Kümmel
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: connection_mutex.patch
Url: http://lists.freedesktop.org/archives/dbus/attachments/20070708/77615b2e/attachment.txt
More information about the dbus
mailing list