glib dbus bindings notes

Tue Feb 24 16:27:24 PST 2009

Hi,

I sent an earlier draft of these notes to davidz, but I think they
would be useful for other binding authors and for the glib team when
looking at eggdbus, so I cleaned them up a bit for this mail. A lot of
this is not specific to the glib binding and would apply to any binding.

Some biases I have in mind:

* dbus should be used as much if not more from other languages as it
is from C. Async IPC is _so_ much easier when you have 1) garbage
collection and 2) closures. I think using dbus is reason enough by
itself to have part of an app coded in gjs or seed or something,
personally. Anyway, I think a goal of dbus bindings in glib should be
to support mappings from dbus direct to other languages.

* I'm personally something of a generated-code hater. I know davidz is
a static-typing lover, and I can get into static typing, but I kind of
don't find it worth a ton of generated code. To me one lesson of the
CORBA/Bonobo excursion was that generated code is kinda bloated, kinda
inconvenient to mess with, etc. Anyway, I'm not against people using
statically-generated proxies, but even in C, I tend to like a generic
proxy with varargs convenience functions, sort of like
dbus_g_proxy_call(). Saves makefile hassle in the app, saves code
size, saves a bunch of headaches installing introspection data and idl
files, just simpler all around.

Those biases admitted to, here are my detailed notes. You can
s/glib/other-main-loop-providing-framework/ and it should largely
still apply.

I'd break a binding down into 6 pieces. The first 5 are generic to any
language binding using glib - pygtk, gjs, seed, etc. etc. The last one
(object mapping) is specific to C programs. However, notably, I think
the last one is responsible for 80% or more of the *code* in a glib
binding. I think it's also much more a "matter of taste" where
different programmers like to do the object mapping in different ways.
I would give serious consideration to keeping the object mapping
separate, or at least giving it more time to bake in the real world,
while focusing on getting the first 5 pieces right to start with.

First Piece: Main Loop Integration
===

The main loop integration between DBusConnection and GMainContext is
tricky, tedious, and annoying to get right. This should be in the GLib
binding for D-Bus. The code in dbus-glib for this is pretty much fine,
afaik.

Second Piece: Asynchronous DBusConnection
===

This is mostly just points for style right now, because
dbus_connection_open() does some blocking anyhow. But it's needed if
the API theoretically supports reconnecting to the bus when the bus
restarts, which might be nice. Also, a good way to structure
single-instance apps is to put essentially all of main() in a
callback.

by "put all of main in a callback" I mean something like this:
http://mail.gnome.org/archives/gtk-devel-list/2007-August/msg00039.html

Asynchronous DBusConnection should probably look like a
ConnectionWatcher object that has "connected"/"disconnected" signals,
since that's better for language bindings. However, in C a wrapper
like this is more convenient:

typedef void (* BusConnectedCallback) (DBusConnection*, void *data);
typedef void (* BusDisconnectedCallback) (DBusConnection*, void *data);
void add_connection_listeners(BusType bus, BusConnectedCallback,
BusDisconnectedCallback, void *data);
void add_connection_listeners_with_address(const char *address,
BusConnectedCallback, BusDisconnectedCallback, void *data);
// also need a way to remove listeners of course

The lame implementation of this for now just queues a main loop idle,
which calls dbus_connection_open(). In future it could retry
connecting to the bus when the bus restarts. There could also be a
real async way to open a connection sometime.

This API is little-used because usually you simply ask to own or ask
to be told about bus names, implicitly also asking to be notified of
new bus connections. See below.

Third Piece: Asynchronous Name Ownership
===

This builds on Asynchronous DBusConnection, and allows an application
to asynchronously be notified when it owns or loses a bus name.
Generally, an app would enable or disable a particular area of
functionality according to which bus names it owns. If an app owns
only one bus name, it might exit when it loses that bus name.

Again this should probably look like an OwnershipWatcher object with
"owned" and "lost" signals, for ease of language binding. However the
convenient API in C might be something like:

typedef void (* BusNameOwnedCallback) (DBusConnection*, const char
*well_known_name, void *data);
typedef void (* BusNameLostCallback) (DBusConnection*, const char
*well_known_name, void *data);
typedef enum { SINGLE_INSTANCE, OPTIONAL } OwnershipType;
typedef enum { NONE = 0, REPLACE = 1 << 0 } OwnershipFlags; // bit flags
void own_bus_name(BusType bus, OwnershipType, OwnershipFlags,
BusNameOwnedCallback, BusNameLostCallback, void *data);
void own_bus_name_with_connection(DBusConnection *connection,
OwnershipType, OwnershipFlags, BusNameOwnedCallback,
BusNameLostCallback, void *data);
// and a way to remove the callbacks

Note that this API is intended to chain off the asynchronous
DBusConnection API (second piece). The version taking a BusType is a
convenience function that sets up async connection listeners in the
background and calls own_bus_name_with_connection() inside them.
Anytime there's a new bus, the process tries to own the name on that
bus.

This API also enables you to implement a --replace command line
option, which is handy for debugging. Just have your app gracefully
handle (possibly including exit) loss of the bus name.
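
A minimal sketch of how the REPLACE flag might translate into the
flags sent with the bus's RequestName call. The numeric flag values
are the ones from the D-Bus specification; request_name_flags() and
the prefixed enum names are hypothetical, and the choices to always
allow replacement and to never queue for the name are assumptions
made for illustration.

```c
#include <assert.h>

/* Flag values for org.freedesktop.DBus.RequestName, as listed in the
 * D-Bus specification (libdbus exposes them as DBUS_NAME_FLAG_*). */
enum {
    NAME_FLAG_ALLOW_REPLACEMENT = 0x1,
    NAME_FLAG_REPLACE_EXISTING  = 0x2,
    NAME_FLAG_DO_NOT_QUEUE      = 0x4
};

/* The NONE and REPLACE flags from the API above, with a prefix. */
typedef enum {
    OWNERSHIP_FLAGS_NONE    = 0,
    OWNERSHIP_FLAGS_REPLACE = 1 << 0
} OwnershipFlags;

/* Hypothetical helper: compute the flags to pass to RequestName.
 * Assumption: every instance allows replacement, so that a later
 * instance started with --replace can take the name over. */
static unsigned int
request_name_flags(OwnershipFlags flags)
{
    unsigned int result;

    assert((flags & ~OWNERSHIP_FLAGS_REPLACE) == 0);   /* no unknown flags */

    result = NAME_FLAG_ALLOW_REPLACEMENT | NAME_FLAG_DO_NOT_QUEUE;
    if (flags & OWNERSHIP_FLAGS_REPLACE)
        result |= NAME_FLAG_REPLACE_EXISTING;
    return result;
}
```

With this mapping, the instance being replaced sees its name go away
and gets its "lost" callback, where it can exit gracefully.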

Fourth Piece: Asynchronous Name Watching
===

Moving to the "client side" (even though many apps are both server and
client), a client often wants to watch for a bus name and take action
whenever it appears or disappears.

This works similarly to owning a name, and is also async of course.
And like the others, it should really be a NameWatcher object with
signals; the custom callback API below is just a C convenience to
show the idea.

typedef void (* BusNameAppearedCallback) (DBusConnection*, const char
*name_that_appeared, const char *name_owner, void *data);
typedef void (* BusNameVanishedCallback) (DBusConnection*, const char
*name_that_vanished, const char *name_owner, void *data);
void watch_bus_name(BusType bus, const char *name,
BusNameAppearedCallback, BusNameVanishedCallback, void* data);
void watch_bus_name_with_connection(DBusConnection*, const char *name,
BusNameAppearedCallback, BusNameVanishedCallback, void* data);
// and a way to remove the callbacks

Of course this also chains off the asynchronous DBusConnection API,
meaning it is not usually necessary to use the DBusConnection API.

Note: because this always gives you the name owner (which is in the
NameOwnerChanged signal used to implement the name watching), it is
never necessary to do a blocking round trip to obtain the owner. It's
always known asynchronously.

This is extremely important because in the BusNameAppearedCallback you
normally create your proxy for objects inside the app owning the bus
name. These proxies can be created with the unique name owner, not the
well-known name; as a result, race conditions are avoided when the
remote app restarts. If you create proxies with the well-known name,
those proxies can refer to one process at moment 1, and another
process at moment 2. By using the unique name, the proxy always refers
to the same process. If the process exits, then
BusNameVanishedCallback should be invoked (destroy proxy here), and
BusNameAppearedCallback is then invoked on restart, where a new proxy
can be created.
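
The appear/vanish cycle described above can be sketched without any
bus at all. This is only an illustration: NameWatch, ExampleApp and
the function names are hypothetical, and the "proxy" is reduced to
the unique name it would be bound to. The one real protocol detail
relied on is that NameOwnerChanged carries (name, old_owner,
new_owner), with an empty string meaning "no owner".

```c
#include <assert.h>
#include <string.h>

#define MAX_NAME 256

typedef void (*NameFunc)(const char *name, const char *owner, void *data);

typedef struct {
    char name[MAX_NAME];    /* well-known name being watched */
    char owner[MAX_NAME];   /* unique name of current owner, "" if none */
    NameFunc appeared;
    NameFunc vanished;
    void *data;
} NameWatch;

static void
name_watch_init(NameWatch *w, const char *name,
                NameFunc appeared, NameFunc vanished, void *data)
{
    assert(strlen(name) < MAX_NAME);
    strcpy(w->name, name);
    w->owner[0] = '\0';
    w->appeared = appeared;
    w->vanished = vanished;
    w->data = data;
}

/* Feed in one NameOwnerChanged(name, old_owner, new_owner) signal. */
static void
name_watch_owner_changed(NameWatch *w, const char *name,
                         const char *old_owner, const char *new_owner)
{
    (void) old_owner;   /* this sketch relies on its own record instead */

    if (strcmp(name, w->name) != 0)
        return;         /* some other name, not the one we watch */
    assert(strlen(new_owner) < MAX_NAME);

    if (w->owner[0] != '\0') {
        w->vanished(w->name, w->owner, w->data);   /* destroy proxy here */
        w->owner[0] = '\0';
    }
    if (new_owner[0] != '\0') {
        strcpy(w->owner, new_owner);
        w->appeared(w->name, w->owner, w->data);   /* create proxy here */
    }
}

/* Example application state: a "proxy" bound to a unique name. */
typedef struct {
    int have_proxy;
    char proxy_target[MAX_NAME];
} ExampleApp;

static void
on_appeared(const char *name, const char *owner, void *data)
{
    ExampleApp *app = data;
    (void) name;
    strcpy(app->proxy_target, owner);   /* unique name, not well-known */
    app->have_proxy = 1;
}

static void
on_vanished(const char *name, const char *owner, void *data)
{
    ExampleApp *app = data;
    (void) name;
    (void) owner;
    app->proxy_target[0] = '\0';
    app->have_proxy = 0;
}
```

Because the proxy target is always a unique name, a restart of the
remote app shows up as a vanish followed by an appear, never as a
proxy that silently starts talking to a different process.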

The implementation of watch_bus_name would normally make an async call
to GetNameOwner when the watch is first added (if the name owner is
not already known). It calls GetNameOwner only after setting up a
match for NameOwnerChanged, otherwise there would be a race.
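
As a rough illustration of the first of those two steps, here is one
way the match rule for a single watched name could be built. The rule
syntax (including the arg0 key) is from the D-Bus specification;
name_owner_changed_rule() is a hypothetical helper, not part of any
existing binding.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: format the match rule that selects
 * NameOwnerChanged signals for one well-known name. Bus names cannot
 * contain quote characters, so no escaping is attempted.
 * Returns 0 on success, -1 if the buffer is too small. */
static int
name_owner_changed_rule(char *buf, size_t buflen, const char *name)
{
    int n;

    assert(buf != NULL && name != NULL);

    n = snprintf(buf, buflen,
                 "type='signal',"
                 "sender='org.freedesktop.DBus',"
                 "interface='org.freedesktop.DBus',"
                 "member='NameOwnerChanged',"
                 "arg0='%s'", name);
    return (n < 0 || (size_t) n >= buflen) ? -1 : 0;
}

/* Intended ordering when a watch is added:
 *   1. add the rule above with dbus_bus_add_match()
 *   2. only then send the async GetNameOwner call
 * Doing it the other way around leaves a window in which an owner
 * change could happen without the watcher ever hearing about it. */
```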

As with all of these callbacks, they should never be called
synchronously from the add() function; if the callback info such as
name owner is already known, an idle should be queued to notify the
app. Otherwise the app has to deal with two codepaths (one sync, one
deferred to main loop).
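
A small sketch of that rule, with a toy idle queue standing in for the
real main loop (with GLib this would be an idle source added via
g_idle_add()). Everything here (queue_idle, run_idles, Watch,
watch_add) is hypothetical and only meant to show that the add
function returns before any callback runs.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*IdleFunc)(void *data);

typedef struct {
    IdleFunc func;
    void *data;
} Idle;

#define MAX_IDLES 16
static Idle idle_queue[MAX_IDLES];
static size_t idle_count = 0;

/* Stand-in for queuing an idle on the main loop. */
static void
queue_idle(IdleFunc func, void *data)
{
    assert(idle_count < MAX_IDLES);
    idle_queue[idle_count].func = func;
    idle_queue[idle_count].data = data;
    idle_count++;
}

/* Stand-in for one main loop iteration: run the idles that were
 * queued before this call; idles queued meanwhile wait for the next. */
static void
run_idles(void)
{
    Idle pending[MAX_IDLES];
    size_t n = idle_count;
    size_t i;

    for (i = 0; i < n; i++)
        pending[i] = idle_queue[i];
    idle_count = 0;
    for (i = 0; i < n; i++)
        pending[i].func(pending[i].data);
}

typedef struct {
    const char *known_owner;   /* NULL if the owner is not known yet */
    int notified;
} Watch;

static void
notify_appeared(void *data)
{
    Watch *w = data;
    w->notified = 1;   /* a real binding would call the app's callback */
}

static void
watch_add(Watch *w, const char *known_owner)
{
    w->known_owner = known_owner;
    w->notified = 0;
    if (known_owner != NULL)
        queue_idle(notify_appeared, w);   /* never call back from here */
}
```

The app then sees exactly one codepath: the callback always arrives
from the main loop, whether or not the owner was already cached.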

Fifth Piece: Asynchronous Signal Watching
===

This machinery is used to implement signals on proxies in the object
mapping. It is kept separate from any particular object mapping
because all language bindings need this functionality, and it is
fairly tricky to implement correctly.

This part of the binding has to call dbus_bus_add_match() with the
right arguments, and uses the tracked bus names from the fourth piece
to keep track of whether a signal connection is still relevant,
removing the match if it isn't.

Sixth Piece: The Object Mapping
===

All of the first five pieces can be shared among all object mappings -
they are GLib-specific because they depend on the main loop and are
probably best implemented with GObject, but they are not
binding-specific.

The final piece is the object mapping, which should be
language-specific. The object mapping's primary role: marshaling
method and signal arguments to and from some type system. Getting a
GValue-using object mapping in the way of other language object
mappings creates quite a bit of complexity and bloat. Instead, there
should be code to go directly from D-Bus types to Java, JavaScript,
Python, etc. rather than trying to convert from D-Bus then to GValue
then to another language's type system, while round-tripping exact
type signatures through these possibly not-quite-the-same systems.

An object mapping has two sides which should look different. The
"client" side has thin proxies for remote objects. The ideal proxy is
a very thin object using minimal memory; it just stores the bus name,
interface name, and object path to send method calls to. The "server"
side has object instances, which have to be registered with particular
object paths.

Proxies should almost always be created (or at least "rebound" to a
new unique bus name) in a BusNameAppearedCallback and destroyed in a
BusNameVanishedCallback. A little painful, yes, but flatly necessary
if you're going to handle the remote process restarting.
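
To give a feel for how thin such a proxy can be, here is a minimal
sketch. The Proxy struct and the proxy_* functions are hypothetical
names; the only D-Bus fact assumed is that unique bus names start
with a colon.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* A thin proxy: just enough to address a method call. */
typedef struct {
    char *bus_name;      /* unique name such as ":1.42"; NULL while unbound */
    char *object_path;
    char *interface;
} Proxy;

static char *
copy_string(const char *s)
{
    size_t len = strlen(s) + 1;
    char *copy = malloc(len);

    if (copy == NULL)
        abort();
    memcpy(copy, s, len);
    return copy;
}

static Proxy *
proxy_new(const char *object_path, const char *interface)
{
    Proxy *p = malloc(sizeof *p);

    if (p == NULL)
        abort();
    p->bus_name = NULL;
    p->object_path = copy_string(object_path);
    p->interface = copy_string(interface);
    return p;
}

/* Call with the new unique name from the "appeared" callback, and
 * with NULL from the "vanished" callback. */
static void
proxy_rebind(Proxy *p, const char *unique_name)
{
    assert(unique_name == NULL || unique_name[0] == ':');
    free(p->bus_name);
    p->bus_name = unique_name ? copy_string(unique_name) : NULL;
}

static int
proxy_is_bound(const Proxy *p)
{
    return p->bus_name != NULL;
}

static void
proxy_free(Proxy *p)
{
    free(p->bus_name);
    free(p->object_path);
    free(p->interface);
    free(p);
}
```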

Exported instances from the server only have to exist once
BusNameOwnedCallback is called, but it's harmless if they exist
before.

An object mapping should omit entirely, or strongly discourage,
synchronous methods on proxies. In gjs D-Bus bindings, the proxy
methods are all called FooRemote(), BarRemote() and take an optional
callback as the last argument which gets the error or return value.
Proxies have no synchronous methods in gjs.

Object instances exported by servers need two modes of operation:
"synchronous" returns a value immediately, sending a dbus reply
immediately; "asynchronous" allows the server instance to defer the
reply while it does something that takes time.
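
One possible shape for those two modes, sketched with a toy reply
handle instead of real D-Bus messages. PendingReply, send_reply() and
the handler names are all hypothetical; a real object mapping would
marshal the value and send it on the connection.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical handle for a method call that still needs a reply. */
typedef struct {
    int replied;   /* nonzero once a reply has been sent */
    int value;     /* the return value that was sent */
} PendingReply;

/* Stand-in for marshaling a return value and sending it on the bus. */
static void
send_reply(PendingReply *reply, int value)
{
    assert(!reply->replied);   /* a method call gets exactly one reply */
    reply->value = value;
    reply->replied = 1;
}

/* "Synchronous" mode: compute the result and reply before returning. */
static void
handle_double(int arg, PendingReply *reply)
{
    send_reply(reply, arg * 2);
}

/* "Asynchronous" mode: remember the call and reply later. */
static PendingReply *deferred_reply = NULL;
static int deferred_arg = 0;

static void
handle_slow_double(int arg, PendingReply *reply)
{
    deferred_reply = reply;
    deferred_arg = arg;
}

/* Called from the main loop once the slow work has finished. */
static void
slow_work_finished(void)
{
    assert(deferred_reply != NULL);
    send_reply(deferred_reply, deferred_arg * 2);
    deferred_reply = NULL;
}
```

The caller cannot tell the two modes apart; the only difference is
when the server instance chooses to send the reply.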

The object mapping should support running with no bus daemon (bus name
of NULL). This is for peer-to-peer mode.

A secondary role of the object mapping: hiding libdbus. I don't think
there should be a stack like libdbus -> libdbus-wrapper-thing -> all
the object mappings. The reason is that *most* of the libdbus API is
only useful for bindings - for writing an object mapping. So by
wrapping libdbus and *then* doing the object mapping, your libdbus
wrapper has to be 10x larger than it would be if the object mapping
itself was the thing hiding libdbus, while the object mapping used
libdbus directly.

Anyway, that's my take. You may notice that the above (except object
mapping) more or less describes hippo-dbus-helper.c, and also
describes a hippo-dbus-helper-like file I have in the gjs dbus
bindings. While hippo-dbus-helper is not a good API (the structs
should mostly be GObjects, callbacks should mostly be signals, among
other issues), I think it is meaningful that one 2500-line (<1000
semicolon-terminated lines) file can do pieces 1-5. The gjs bindings
are completely full-featured using only a small file like that, plus a
JavaScript object mapping.

So maybe my big picture thought on eggdbus is that by also doing
object mapping and wrapping enough of libdbus to support another
binding, it is ~12x larger (in terms of source code size) than it
would be just doing 1-5. Granted, a *nice* version of 1-5 using
gobject would be larger, but not that much larger. Does it truly make
sense to have over 90% of the code in a glib dbus binding be a C-only
convenience API? Should this be split up somehow?

I'm not saying the C-only convenience API is bad, or even that it
should not be in GLib, just that there's a very logical split between
the object mapping (90% of the code) and the rest that's more generic
(10%).

I'll see if I can come up with a followup mail or two showing what the
gjs bindings are like and discussing some specifics about eggdbus.

Havoc