D-Bus upgrade problems in Debian

Havoc Pennington hp at pobox.com
Sat Aug 30 08:01:43 PDT 2008


Hi,

On Sat, Aug 30, 2008 at 7:22 AM, Matthew Johnson <dbus at matthew.ath.cx> wrote:
> There is a lack of official D-Bus
> policy on this and whether we ever plan to support it. I suspect that
> for the release after this there will be a big push from Debian to make
> all D-Bus-using applications survive a bus restart

I don't know if we have an official dbus cabal, but speaking for
myself, libdbus already allows apps to survive restart if they choose
to do so (and write code to do so). However, it is Hard(tm) to
implement correctly in each app, and this will always be a fairly
untested codepath, so in practice apps do not reliably handle dbus restart.
That was expected, and it's why the recommendation has always been to
not restart, because restarting is not practical, no matter how much
people may feel it is "right" to restart. Sometimes the right thing
must bow to reality.

If you have a situation with a large number N of rarely-tested and
complex codepaths maintained in multiple packages by different people,
then reality is that at any given moment in the lifetime of a Linux
distribution one of those codepaths will not work. Restarting dbus,
however, _requires_ that _all_ of them work _always_. It requires
feeling confident that this particular codepath is rarely if ever
broken in any package. Unless you are regularly and in an automated
way testing all packages for this behavior, I think you would be
insane to count on this.

If I wanted restart to work, here is what I would do:

1) I would strongly recommend that you first create an API that makes
the restart case use exactly the same codepath in the app as the first
startup case. This basically means apps getting an "on connected"
callback, then an "on bus name appeared" or "on bus name obtained"
callback, and so forth. This will solve the problem that a complex and
difficult restart codepath will never get testing, by avoiding a
special rarely-used codepath for restart.

Behind the scenes, on losing the system bus connection, this API would
start trying to reconnect on some kind of timeout.

2) set up some kind of automated tests or tinderbox; install all
packages that connect to system bus, start them up, restart system
bus, then verify the functionality of all the stuff that was
previously connected. Keep running these tests forever, as long as you
support restart.

3) port apps to the API created in 1), which should slowly make the
tests created in 2) start to pass.

This 1/2/3 is the _easiest_ way to approach restart that I can come up
with. But it is obviously a ton of work.

I would question whether it's worth investing so much effort, or
whether anyone has time to do so. But I've been wrong before.
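
To make 1) a bit more concrete, here is a minimal sketch in Python. It
is not libdbus code and not a proposed API: FakeBus, ReconnectingClient,
poll() and the name org.example.Service are all made up for
illustration, and the "bus" is just an in-process object that can be
stopped and started. The only point it tries to show is that the app
supplies callbacks once, and first startup and restart then run through
exactly the same code.

```python
class FakeBus:
    """Stand-in for the system bus (hypothetical, not libdbus)."""

    def __init__(self):
        self.running = True
        self.generation = 1   # bumped on every start, so clients can spot a restart
        self.owners = {}      # bus name -> client that currently owns it

    def stop(self):
        # Stopping the bus drops every connection and forgets name ownership.
        self.running = False
        self.owners.clear()

    def start(self):
        self.running = True
        self.generation += 1


class ReconnectingClient:
    """Hypothetical wrapper: the app only provides callbacks, no restart path."""

    def __init__(self, bus, name, on_connected, on_name_obtained):
        self.bus = bus
        self.name = name
        self.on_connected = on_connected
        self.on_name_obtained = on_name_obtained
        self.generation = None   # None means "not connected"

    def poll(self):
        """Would be driven by a main-loop timeout; (re)connects when needed."""
        lost = self.generation is not None and (
            not self.bus.running or self.bus.generation != self.generation
        )
        if lost:
            self.generation = None
        if self.generation is None and self.bus.running:
            # Same steps for first startup and for reconnect after a restart.
            self.generation = self.bus.generation
            self.on_connected()
            self.bus.owners[self.name] = self
            self.on_name_obtained(self.name)


events = []
bus = FakeBus()
client = ReconnectingClient(
    bus,
    "org.example.Service",
    on_connected=lambda: events.append("connected"),
    on_name_obtained=lambda name: events.append("obtained " + name),
)

client.poll()   # first startup
bus.stop()
client.poll()   # bus is down: nothing happens, the timeout would retry later
bus.start()
client.poll()   # after restart: the same callbacks fire again
print(events)
```

The stop/poll/start/poll sequence at the end is a toy version of what
the tests in 2) would have to do against real packages and a real bus.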

> or to make D-Bus able
> to seamlessly restart without dropping connections.

If you can figure out any sane way to do this, I will be impressed ;-)
You have to exec a new dbus-daemon replacing the address space while
transferring *all* state to the new process, including if the new
process and old process are arbitrarily different versions with
extensive changes to the in-memory data structures. This enormously
complex codepath will again be essentially never tested. So the first
criterion for this would be to come up with some way to have extremely
thorough unit test coverage that proves it works. But, to be honest, I
think this approach is more or less impossible - yes it could be done
if you had a huge amount of time and resources to put into it
indefinitely, but, nobody does. Surely even if you got it sort of
working for a couple sample versions of dbus-daemon, it would be an
unmaintainable hack that would break in future versions or for certain
version combinations.

Anyway, of the options I think my 1/2/3 suggestion above is the most
feasible. It's not something you're going to be able to do in
Debian-specific patches. It's something where you're going to need to
create a really solid API and get it in upstream and then laboriously
port all apps and daemons to it. It will take a couple of years most
likely.

> Secondly, we have a technical problem implementing the obvious solution
> (don't restart it on upgrades) in that the currently installed package
> in stable has a prerm script which stops D-Bus (using the normal
> initscript template for packages gives you this) and this is always run
> first during any package upgrade, so it is not possible for the new
> package to prevent this from happening. Do you have any suggestions as
> to how we can (given this exists) provide a relatively smooth upgrade
> from the current release to the next one which we are preparing at the
> moment?

All I can think of is to stop and restart everything that could be
connected to dbus, in addition to dbus itself. But obviously it's not
a good situation to be in.

Havoc

