deadlock protection
Havoc Pennington
hp at redhat.com
Mon Oct 18 13:28:03 PDT 2004
Hi,
Just doing some initial half-baked thinking on deadlock protection.
For reference:
http://lists.kde.org/?t=109701199100001&r=1&w=2
Let me naively map this to D-BUS, and then describe some possible
problems with the naive approach. I could use some help figuring out
details of how to do this.
The D-BUS changes could be:
1 for each outgoing method call, have a call stack ID that will
identify the call stack that method call initiates or is part of
2 have dbus_message_set_call_stack (message, id) for setting the ID
3 some sort of API to encourage or automate setting it; this could
be that any outgoing messages queued while the current incoming
message is being processed would have the ID of the incoming
message, and outgoing messages with no incoming call on the stack
would have a new ID.
4 this automated mechanism would not work if people do async stuff
based on either the main loop or threads, but they could still
manually propagate the ID
5 bindings could force setting the ID by having a single-threaded
never-any-async API
6 if an incoming method call has the call stack ID of a reply-pending
outgoing call, jump the incoming call to the front of the message
queue and dispatch the main loop until the incoming call is popped
off the queue
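To make items 1, 3 and 6 a bit more concrete, here is a rough sketch in Python rather than libdbus C. Everything in it is hypothetical: Call, Connection, send_call and receive are names I made up for illustration, not existing or proposed libdbus API. It only models ID inheritance and the queue jump; reply handling and the "dispatch the main loop" part are left out.

```python
import itertools
from collections import deque
from dataclasses import dataclass

# Hypothetical source of fresh call stack IDs (not part of libdbus).
_fresh_ids = itertools.count(1)


@dataclass
class Call:
    name: str
    call_stack_id: int


class Connection:
    """Toy model of one app's connection; all names are illustrative."""

    def __init__(self):
        self.incoming = deque()    # incoming method calls not yet dispatched
        self.pending_ids = set()   # stack IDs of outgoing calls awaiting a reply
        self.current = None        # incoming call being processed, if any

    def send_call(self, name):
        # Item 3: inherit the ID of the incoming call on the stack,
        # otherwise start a new call stack.
        if self.current is not None:
            stack_id = self.current.call_stack_id
        else:
            stack_id = next(_fresh_ids)
        self.pending_ids.add(stack_id)
        return Call(name, stack_id)

    def receive(self, call):
        # Item 6: a call that belongs to a stack we are waiting on
        # jumps to the front of the queue; everything else queues normally.
        if call.call_stack_id in self.pending_ids:
            self.incoming.appendleft(call)
        else:
            self.incoming.append(call)


conn = Connection()
conn.receive(Call("Unrelated", 99))
out = conn.send_call("AskPeer")  # no incoming call on the stack: new ID
conn.receive(Call("CallbackFromPeer", out.call_stack_id))
print([c.name for c in conn.incoming])
```

Note that the queue jump here keys off "reply pending", not "currently blocking", which is exactly the reordering question raised under Issues.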
Issues:
- reordering of messages. In the above proposal, I suggested for
consistency that the reordering happen anytime we have a pending
reply, not only when we are *blocking* for said reply. I'm having
trouble thinking of a good example of when reordering will cause
problems, but I'm not confident I've fully thought through
when it will happen and what the consequences will be.
- as mentioned in Waldo's post, we are not fixing all deadlocks;
only a certain common type of deadlock. Anytime an app blocks,
a deadlock is possible. Really the only way to avoid this is
to write everything async.
- because dbus allows multithreaded and main-loop-based async
calls, we can't fully/reliably automate tracking the call stack ID;
some app developers may screw it up
- in the above, item 6 has "dispatch the main loop" - but libdbus
currently has no way to do that; the main loop is only a concept
higher up in the bindings
- in item 3, "while the current incoming message is being processed"
isn't a concept we have right now ... messages are just popped
off the queue and never put back, there's no "end" of the
processing, other than perhaps the message getting unref'd.
That is in libdbus; the bindings may have a "current message"
concept.
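One way a binding might supply the missing "current message" scope is an explicit begin/end around handler dispatch, with a manual override for the async cases. This is only a sketch under that assumption, in Python; dispatching and outgoing_stack_id are invented names, not anything in libdbus or an existing binding.

```python
import contextvars
from contextlib import contextmanager

# Hypothetical per-context record of the call stack ID being processed.
_current_stack_id = contextvars.ContextVar("current_stack_id", default=None)


@contextmanager
def dispatching(stack_id):
    """Mark the start and end of processing one incoming message."""
    token = _current_stack_id.set(stack_id)
    try:
        yield
    finally:
        # The "end" of processing: restore whatever was current before.
        _current_stack_id.reset(token)


def outgoing_stack_id(fresh_id, explicit=None):
    """Pick the ID for an outgoing call.

    explicit  - ID propagated by hand (the async / threaded case)
    fresh_id  - ID to use when no incoming call is on the stack
    """
    if explicit is not None:
        return explicit
    inherited = _current_stack_id.get()
    return inherited if inherited is not None else fresh_id


with dispatching(5):
    inside = outgoing_stack_id(fresh_id=9)   # inherits the incoming ID
outside = outgoing_stack_id(fresh_id=9)      # nothing on the stack
deferred = outgoing_stack_id(fresh_id=9, explicit=5)  # propagated by hand
print(inside, outside, deferred)
```

A handler that defers work to a main-loop callback or another thread would have to save the ID and pass it as explicit later, which is the part app developers can get wrong.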
A couple thoughts on alternate approaches, not sure these are going to
be useful, but noting them:
- we could punt most of this to the bindings; i.e. introduce
call stack ID to the protocol and libdbus, but require bindings
to figure out how to conveniently track and propagate it.
Disadvantage of course is that the app you're talking to may
lose track of the call stack if its binding doesn't support it.
- rather than jumping the would-deadlock incoming method call to the
front of the queue, we could return an error, an "EWOULDDEADLOCK" sort
of thing. The advantage is not having to worry about the semantics
of message reordering, or how to invoke the main loop.
Deadlocks would still need debugging (same as if they had in fact
deadlocked), but they would not lock up the apps, which
would be nice for users.
This potentially solves more deadlock cases as well,
in that we could have an app mark "will block for reply"
on outgoing calls, and then the bus can know when a
client is blocking and which app it is blocking on.
So in the "apps send a call to each other simultaneously"
case we could detect the deadlock and return an error.
Maybe worth doing anyway.
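For the "will block for reply" idea, the bus-side check could look roughly like the sketch below (Python, purely illustrative). It assumes each blocked client waits on exactly one peer, so the bus can follow a chain of waiters; Bus, blocking_call and reply_delivered are hypothetical names, and "EWOULDDEADLOCK" is just the placeholder error from above, not a defined D-BUS error.

```python
class Bus:
    """Toy model of the bus tracking which client blocks on which."""

    def __init__(self):
        self.blocked_on = {}  # caller -> callee it is blocked waiting for

    def blocking_call(self, caller, callee):
        # Follow the chain of blocked clients starting at the callee.
        # If it leads back to the caller, delivering this call would
        # complete a cycle of apps all waiting on each other.
        node = callee
        seen = set()
        while node is not None and node not in seen:
            if node == caller:
                return "EWOULDDEADLOCK"
            seen.add(node)
            node = self.blocked_on.get(node)
        self.blocked_on[caller] = callee
        return "OK"

    def reply_delivered(self, caller):
        # The caller got its reply and is no longer blocked.
        self.blocked_on.pop(caller, None)


bus = Bus()
first = bus.blocking_call("app-a", "app-b")
second = bus.blocking_call("app-b", "app-a")  # the simultaneous-call case
print(first, second)
```

This only sees calls that are marked as blocking, so an app that blocks without saying so would still deadlock as before.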
Anyhow, lots of details here.
Havoc