Crash in _dbus_connection_block_pending_call

Tue Jun 13 19:43:45 PDT 2006

John (J5) Palmieri wrote:
>     http://nouse.net/misc/nm-trace-verbose-O0-ggdb3-{a,b}
> 
> Ok, so I think I see whats happening now.  The pending call is most
> likely finished in another thread by the time it reaches
> check_for_reply_and_update_dispatch_unlocked.  The connection itself is
> locked but not the pending call.  I think checking for connection ==
> NULL and returning if it is will fix the issue.  Then again it might
> cause messages to be dropped.  I'm not totally sure.
> 

Checking connection == NULL doesn't fix the race, just makes it shorter.

Looking at the code, in the TODO where it says "API and thread safety 
issues around pending call" it should probably say "pending call majorly 
unthreadsafe" I guess.

I think the original idea was that you could only use each pending call 
from a single thread, but that falls apart with connection itself using 
pending calls internally.

It looks to me like each pending call needs its own lock.

Maybe with some cleverness the connection lock could cover all 
outstanding PendingCall for that connection, avoiding a per-pending-call 
lock, but this only works if pending->connection is never set to NULL. 
Which would presumably mean the pending call holds a refcount on the 
connection object.

If I were adding a per-pending-call lock I'd probably move the 
DBusPendingCall struct into dbus-pending-call.c and add accessors so 
that dbus-connection.c doesn't access the struct directly, this would 
make it easier to verify the locking's correctness.

The per-call lock probably would show up in valgrind-type profiles as a 
few percentage hit, though it's probably lost in the noise on a sampling 
profile or real-world-app profile.

Havoc