DBus reconnect support (initial attempt)
hp at redhat.com
Tue Mar 15 07:21:36 PST 2005
On Tue, 2005-03-15 at 08:01 -0500, Bill Rugolsky Jr. wrote:
> If it is some huge bloated behemoth that may die due to bugs, or gets
> killed by the OOM-killer, etc., then apps must reliably reconnect. If
> it _can_ crash, it _will_ crash, quite often for certain workloads.
Well, there's no way to avoid "it can crash" - but it has a lot of unit
tests and does handle OOM. Of course, my understanding of the OOM killer
is that it can kill anything it feels like, so we could get nuked by it.
In a rational world we'd modify the kernel to avoid this, but the world
according to kernel is not always rational ;-)
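For illustration only, the client-side "apps must reliably reconnect" behaviour quoted above could be sketched roughly as below. This is a minimal retry loop with exponential backoff, not anything from the dbus code base: `connect` is a hypothetical callable standing in for whatever opens the bus connection, and the attempt count and delays are made-up defaults.

```python
import time


def reconnect_with_backoff(connect, max_attempts=5, base_delay=0.1,
                           max_delay=5.0, sleep=time.sleep):
    """Call connect() until it succeeds, doubling the wait after each failure.

    `connect` is a hypothetical zero-argument callable that returns a
    connection object or raises OSError (e.g. ConnectionRefusedError)
    while the daemon is down.
    """
    delay = base_delay
    for attempt in range(1, max_attempts + 1):
        try:
            return connect()
        except OSError:
            if attempt == max_attempts:
                raise  # give up and let the caller decide what to do
            sleep(delay)
            delay = min(delay * 2, max_delay)  # cap the backoff
```

Note that getting the socket back is the easy part; any state the app had registered with the daemon would still have to be re-established after the call returns.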
> If it is compact, simple, and never crashes, then a less complex
> fault model may be appropriate. But, at a minimum, I'd expect the
> system dbus to be able to checkpoint its state and re-exec itself, with
> file-descriptors held open across exec() if necessary. It certainly
> wouldn't be the first daemon with such requirements.
My experience implementing this with gconfd is that it adds a ton of
complexity in and of itself...
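To give a feel for the moving parts, here is a rough sketch of the checkpoint-and-re-exec idea from the quoted text. It is not how dbus or gconfd do it; it assumes a Unix-like system, passes the saved state through a hypothetical environment variable (`DAEMON_CHECKPOINT`), and treats the state as a small JSON-serializable dict. All function names are invented for the example.

```python
import json
import os
import sys

ENV_KEY = "DAEMON_CHECKPOINT"  # hypothetical variable name


def checkpoint(state, fds):
    """Serialize state and mark the listed descriptors to survive exec()."""
    for fd in fds:
        os.set_inheritable(fd, True)  # clears close-on-exec on the descriptor
    return json.dumps({"state": state, "fds": list(fds)})


def restore(environ):
    """Return (state, fds) left by a previous image, or (None, []) on a cold start."""
    blob = environ.get(ENV_KEY)
    if blob is None:
        return None, []
    data = json.loads(blob)
    return data["state"], data["fds"]


def reexec(state, fds):
    """Replace the running image with a fresh copy of itself; does not return."""
    env = dict(os.environ)
    env[ENV_KEY] = checkpoint(state, fds)
    os.execve(sys.executable, [sys.executable] + sys.argv, env)
```

On startup the new image would call `restore(os.environ)` and pick up the inherited descriptors by number. Even this toy version ignores partially written messages, buffered data, and a checkpoint format that changes between versions, which is where the complexity starts to pile up.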