"DBus Embedded" - a clean break
hp at pobox.com
Wed Jan 26 08:48:17 PST 2011
I do think the fastest possible dbus lib would look more like Xlib,
where you don't have a message object. The API could allow passing in
something iovec-like, to avoid having to copy data blocks just to make
them contiguous or to put a length in front of them. Such an API
would almost certainly be a blocking API, though it wouldn't have to
block for a round trip, just on the write. I think you'd want to avoid
the DBusString abstraction for marshaling (to marshal an int, you call
a function to ensure the malloc'd buffer has room for 4 more bytes, and
then you marshal). Instead, again more xcb/Xlib-like, you'd want stuff
to be on the stack and try to just assign to struct fields. Basically,
pull out the abstractions and make it more like just taking the data in
a raw format, most of it on the stack, and making writev() calls to
push it onto the socket.
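
To make the writev() idea concrete, here is a minimal sketch in C. It is
only an illustration, not libdbus code: struct raw_header, raw_send() and
host_is_little_endian() are made-up names, and the header is just the
first 12 fixed bytes of a D-Bus style header (the header-field array with
path, member, signature and so on is left out). The point is that the
header lives on the stack and the body is never copied.

```c
#include <stdint.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Hypothetical fixed-size header, loosely modelled on the start of a
 * D-Bus message header. NOT a complete, valid D-Bus header: the array
 * of header fields is omitted to keep the sketch short. */
struct raw_header {
    uint8_t  endian;    /* 'l' = little endian, 'B' = big endian */
    uint8_t  type;      /* 1 = method call */
    uint8_t  flags;
    uint8_t  version;   /* protocol version, 1 */
    uint32_t body_len;  /* length of the body that follows */
    uint32_t serial;    /* sender-chosen message serial */
};

static int host_is_little_endian(void)
{
    const uint16_t one = 1;
    return *(const uint8_t *)&one == 1;
}

/* Send header + body with a single writev(). The header is filled in by
 * plain struct assignment on the stack; the caller's body buffer is
 * referenced by the iovec, not copied. Returns what writev() returns;
 * a real implementation would have to loop on partial writes. */
static ssize_t raw_send(int fd, uint32_t serial,
                        const void *body, uint32_t body_len)
{
    struct raw_header hdr;
    struct iovec iov[2];

    hdr.endian   = host_is_little_endian() ? 'l' : 'B';
    hdr.type     = 1;
    hdr.flags    = 0;
    hdr.version  = 1;
    hdr.body_len = body_len;
    hdr.serial   = serial;

    iov[0].iov_base = &hdr;
    iov[0].iov_len  = sizeof hdr;
    iov[1].iov_base = (void *)body;   /* writev() does not modify it */
    iov[1].iov_len  = body_len;

    return writev(fd, iov, 2);
}
```

On a blocking fd, raw_send(fd, 1, "hello", 5) would normally push 17
bytes (12 of header, 5 of body) in one system call, with no malloc and
no intermediate buffer.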

Of course, an API like that would be a good bit more annoying to use
in some ways, so people might write wrappers around it, which could
defeat the speed win.

It could be pretty easy to experiment here by hacking up a local
libdbus to have a way to "steal" the socket from a DBusConnection, so
you can let existing libdbus code do the connection setup, then
experiment with and profile different ways to marshal and unmarshal
messages. Heck, for a benchmark it probably works well enough to just
open/auth the DBusConnection, then get the fd from it, use it
directly, and never call into libdbus again. Obviously, don't set up
or use main loop handlers in that case.

This is all more difficult since the dbus type system is fully
recursive. libdbus was faster before that, and perf dropped pretty
significantly when I rewrote it to be fully recursive, in part because
optimization work was just lost, I guess. But people really, really,
really lobbied for the fully generic type system, and that was probably
more important than max performance.

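
To illustrate why the recursive type system gets in the way of a flat,
struct-assignment style of marshaling, here is a small C sketch that
finds the end of one complete type in a signature string. It is not
taken from libdbus; sig_one_type_len() is a made-up name, and it skips
several rules a real validator enforces (nesting depth limits, dict
entries only inside arrays, basic-typed dict keys). Even this much
already needs recursion, because arrays and structs can contain
arbitrary types.

```c
#include <string.h>

/* Returns the number of characters that make up the first single
 * complete type in sig, or -1 if the signature is malformed.
 * Simplified sketch: not a full validator. */
static int sig_one_type_len(const char *sig)
{
    int n, used, members;
    char close;

    if (*sig == '\0')
        return -1;

    /* Basic types and variant are a single type code. */
    if (strchr("ybnqiuxtdsogvh", *sig) != NULL)
        return 1;

    /* Array: 'a' followed by exactly one complete type. */
    if (*sig == 'a') {
        n = sig_one_type_len(sig + 1);
        return n < 0 ? -1 : 1 + n;
    }

    /* Struct "(...)" or dict entry "{kv}": a sequence of complete
     * types up to the matching close character. */
    if (*sig == '(' || *sig == '{') {
        close = (*sig == '(') ? ')' : '}';
        used = 1;
        members = 0;
        while (sig[used] != close) {
            n = sig_one_type_len(sig + used);
            if (n < 0)
                return -1;   /* also catches a missing close */
            used += n;
            members++;
        }
        if (members == 0)
            return -1;       /* empty struct is not allowed */
        if (close == '}' && members != 2)
            return -1;       /* dict entry is exactly key + value */
        return used + 1;
    }

    return -1;               /* unknown type code */
}
```

For a signature like "a(ia{sv})" the whole string is one complete type,
so the marshaler cannot know the layout of the data without walking the
signature at run time.
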
"Less abstraction and convenience" can really be helpful for speed.
There's no doubt at least some plain-old-stupid to be optimized in
libdbus and gdbus, but some of the cost is also the price of
simplifying app logic by making dbus do more.

Assuming the benchmarks show a big win, having a "raw" iovec-like dbus
API in libdbus itself could make sense, because bindings that generate
static stubs could transparently switch to it. It could even coexist
with DBusMessage if DBusConnection had an API to flush and lock the
socket, after which you could do raw access.