deadlock protection
Waldo Bastian
bastian at kde.org
Wed Oct 20 03:47:06 PDT 2004
On Monday 18 October 2004 22:28, Havoc Pennington wrote:
> - in item 3, "while the current incoming message is being processed"
> isn't a concept we have right now ... messages are just popped
> off the queue and never put back, there's no "end" of the
> processing, other than perhaps the message getting unref'd.
> In libdbus that is, the bindings may have a "current message"
> concept.
But you do know when a message needs a reply, I assume? So when that reply gets
sent, that's the end of the processing. If the message doesn't need a reply,
the call stack ID is not needed either, because in that case you can never
deadlock.
You still haven't captured the concept of "current message" with that, only
"all messages that still need processing", so it doesn't help you too much.
So I guess it's indeed mainly a task for the bindings. If there is a way to
say "this thread is currently processing this message", then bindings that map
to actual (synchronous) function calls can call that automatically; otherwise
the application will need to take care of it. But see below.
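A minimal sketch of how a binding might record "this thread is currently
processing this message", assuming messages are plain dicts that carry a
call_stack_id field. The names processing, current_message and
outgoing_call_stack_id are hypothetical; nothing here is libdbus API.

```python
import threading
from contextlib import contextmanager

# Hypothetical per-thread state kept by the binding.
_state = threading.local()


@contextmanager
def processing(message):
    """Mark `message` as the one this thread is handling until the block exits."""
    previous = getattr(_state, "current", None)
    _state.current = message
    try:
        yield message
    finally:
        # Restore the outer message so nested dispatch unwinds correctly.
        _state.current = previous


def current_message():
    """Return the message this thread is processing, or None."""
    return getattr(_state, "current", None)


def outgoing_call_stack_id():
    """Call stack ID an outgoing call would inherit from the current message."""
    msg = current_message()
    return msg["call_stack_id"] if msg is not None else None
```

A binding that maps messages to synchronous function calls could wrap each
handler invocation in processing(msg); an application that handles messages
itself would have to do the equivalent by hand.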
> A couple thoughts on alternate approaches, not sure these are going to
> be useful, but noting them:
>
> - we could punt most of this to the bindings; i.e. introduce
> call stack ID to the protocol and libdbus, but require bindings
> to figure out how to conveniently track and propagate it.
> Disadvantage of course is that the app you're talking to may
> lose track of the call stack if its binding doesn't support it.
>
> - rather than jumping the would-deadlock incoming method call to the
> front of the queue, we could return an error "EWOULDDEADLOCK" sort
> of thing. the advantage is not having to worry about semantics
> of message reordering, or how to invoke the main loop.
> Deadlocks would still need debugging (same as if they in fact
> deadlocked), but they would not lock up the apps which
> would be nice for users.
>
> This potentially solves more deadlock cases,
> however,
> in that we could have an app mark "will block for reply"
> on outgoing calls, and then the bus can know when a
> client is blocking and which app it is blocking on.
> So in the "apps send a call to each other simultaneously"
> case we could detect the deadlock and return an error.
> Maybe worth doing anyway.
I don't see how it could solve _more_ deadlock cases. In particular, I don't
see how it would detect "apps send a call to each other simultaneously".
I tried to add something like that to DCOP over the weekend, but then realized
that
a) It will never be able to catch complicated cases. If the call sequence
A->B->C is started in parallel with C->D->A, it's impossible to detect the
deadlock: the call that arrives at C could theoretically contain information
about A and B, but C doesn't know that D has called A as a result of C
calling D.
b) In the simple case (A->B started in parallel with B->A) it is possible to
detect that e.g. A is waiting for a response from B when A gets the call from
B, but deciding on that information alone creates the risk of false
positives. After all, maybe B still handles incoming calls either in a
separate thread or in the same thread while waiting for the answer of A, so
there may not be deadlock at all. It's possible of course to indicate in the
message sent by A that A is still able to process incoming messages, but that
starts to become a bit hairy.
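To make (a) and (b) concrete, here is a small sketch. It assumes each incoming
call carries the list of callers above it, and that each process blocks on at
most one peer; both function names are hypothetical. The local check is all a
single process can do on its own, while the global check needs a wait-for
graph that no single process has.

```python
def locally_suspicious(waiting_on, incoming_chain):
    """Local view: is the peer I am blocked on among the callers of this call?"""
    return waiting_on in incoming_chain


def has_cycle(waits_for):
    """Global view: follow 'X blocks on Y' edges and report whether they loop."""
    for start in waits_for:
        seen = set()
        node = start
        while node in waits_for and node not in seen:
            seen.add(node)
            node = waits_for[node]
        if node in seen:
            return True
    return False


# Case (b): A is blocked on B and receives a call whose chain is [B].
simple = locally_suspicious("B", ["B"])

# Case (a): C is blocked on D and receives a call whose chain is [A, B].
hard_at_c = locally_suspicious("D", ["A", "B"])

# The same situation seen globally: A->B->C in parallel with C->D->A.
hard_global = has_cycle({"A": "B", "B": "C", "C": "D", "D": "A"})
```

Here simple is True, hard_at_c is False and hard_global is True. Note that
simple being True only says a deadlock is possible: if B still services
incoming calls while it waits, it is a false positive.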
Overall I'm becoming less and less impressed with the automatic deadlock
detection in DCOP, in particular because it detects and handles the simple
cases, but those are exactly the cases that you would notice right away
anyway because they would otherwise deadlock each and every time.
The hard cases (A calls B in parallel with B calling A, KDE BR69346) aren't
detected, and those are exactly the cases that would be most valuable to
catch, because they are the timing-dependent ones: they are most likely to be
missed while developing and, thanks to Murphy, only show up after release in
builds without debug information ;-)
(In my mail from 2004-10-03, "Reentrancy (Was: RFC: DBUS & KDE 4)", I wasn't
aware that DCOP failed to handle exactly those cases that would have been the
most valuable to handle.)
Then there are also the risks associated with unwanted/unexpected recursion. In
the trivial cases the automatic deadlock detection replaces a deadlock with
recursion, but the developer will hardly know because it happens behind
his/her back.
So at this point I'm starting to think that the right solution is to give
application developers more control over the recursion behavior (which is a
responsibility of the bindings). So far I have thought about that in terms of
some sort of flag, set each time an outgoing call is made, that indicates
whether incoming calls should be processed while that call is outstanding.
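A toy sketch of such a per-call flag, assuming a binding-level connection
object with a queue of incoming calls. ToyConnection, call_blocking and
process_incoming are hypothetical names, and the "wait for the reply" step is
only simulated.

```python
from collections import deque


class ToyConnection:
    """Hypothetical binding-level connection; not a real DBUS or DCOP API."""

    def __init__(self):
        self.incoming = deque()  # calls that have arrived but are not handled yet
        self.handled = []        # calls that have been dispatched, in order

    def dispatch_pending(self):
        """Handle everything that is queued right now."""
        while self.incoming:
            self.handled.append(self.incoming.popleft())

    def call_blocking(self, method, process_incoming=False):
        """Pretend to send `method` and block until its reply arrives.

        With process_incoming=True, queued incoming calls are dispatched
        while waiting (recursion is allowed); otherwise they stay queued
        until the caller returns to its main loop.
        """
        if process_incoming:
            self.dispatch_pending()
        return "reply to " + method
```

The point of the design is that the recursion becomes an explicit choice made
by the developer at the call site, not something that happens behind his/her
back.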
KDE BR69346 made me realize that another option is to flag methods in the call
interface as "recursion safe", e.g. calls that only query some attribute
value can most likely be processed while an outgoing call is in progress,
since they don't have side effects anyway. That would basically be the
equivalent of C++'s const on methods. (This would all be up to the bindings;
I don't think it needs any particular information from libdbus.)
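A sketch of what a "recursion safe" marker could look like in a binding,
assuming exported methods are ordinary Python methods. The recursion_safe
decorator, the Service class and may_dispatch_now are all hypothetical.

```python
def recursion_safe(func):
    """Mark a method as safe to run while an outgoing call is in progress."""
    func.recursion_safe = True
    return func


class Service:
    def __init__(self):
        self.title = "untitled"

    @recursion_safe
    def get_title(self):
        # Pure query, no side effects: comparable to a const method in C++.
        return self.title

    def set_title(self, value):
        # Mutates state, so it is not marked and must wait.
        self.title = value


def may_dispatch_now(obj, method_name, outgoing_call_in_progress):
    """Decide whether the binding may run an incoming call right away."""
    if not outgoing_call_in_progress:
        return True
    method = getattr(obj, method_name)
    return getattr(method, "recursion_safe", False)
```

With this, an incoming get_title can be answered while the service is blocked
on an outgoing call, whereas set_title is held back until that call returns.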
As for DCOP compatibility, given the IMHO limited practical use of automatic
deadlock protection and the more flexible design of DBUS wrt async message
handling and thread support, I'm inclined to think that it would be enough if
DBUS provides for a way to let the binding that provides DCOP compatibility
map DCOP's automatic deadlock protection onto DBUS but that other bindings
wouldn't need to bother with this (otoh, if a binding wants to, it could
support it and use it for limited deadlock detection). The goal is then to
provide backwards compatibility for existing DCOP applications, but new
(DBUS) applications should then rely on the recursion control features of
their bindings to prevent deadlock and not on automatic deadlock detection.
Cheers,
Waldo
--
bastian at kde.org | SUSE LINUX 9.2: Order now! | bastian at suse.com
http://www.suse.de/us/private/products/suse_linux/preview/index.html