"DBus Embedded" - a clean break

Wed Jan 26 06:20:38 PST 2011

On Thu, 2011-01-20 at 09:53 +0200, Ville M. Vainio wrote:

Hi Ville M. Vainio,

> Just some subversive thoughts (not a practical development project for now):
> 
> - Dbus is very slow. It's okay on desktop, but on mobile platforms
> it's suboptimal
> 
> - It could be much (10x?) faster - switch to shared memory, posix
> message queues for data transmission (references to relevant shared
> memory blocks), do not do any verification

This implies changing libraries like GDBus, which doesn't use libdbus,
too. I'm also not sure how portable shm and POSIX message queues are.

I also think that, since D-Bus messages are short-lived, shm isn't
ideal. In my opinion a shared memory buffer is more useful when the data
in it is long-lived.

> - Switch to peer-to-peer communication as soon as possible while doing
> the handshake through daemon (minimize context switches, transmission
> counts).

Tracker has historically been a heavy user of D-Bus.

By heavy I mean that we pass data over D-Bus. For data acquisition the
applications need to pass data to us; for data servicing we need to pass
data to applications. Nowadays we have a so-called direct-access mode
that allows processes to connect to our backing database file directly,
but that's beside the point here.

A year or so ago a then-student came to me asking for an internship
task. I tasked Adrien Bustany with investigating how we can improve IPC
in Tracker [1].

Adrien and I both concluded that the performance penalty lies not in
the fact that a socket is used, but rather in the serialization at both
ends. Adrien's report [1] shows that a socket, and an fd, can be fast.

What many D-Bus libraries do is let the programmer build a 'message',
like a DBusMessage or a GDBusMessage. Unfortunately these libraries
don't immediately serialize that message to a format that can be sent
over the 'wire' as-is. The message usually first needs to be converted
into a new allocation, a new buffer.

This is precisely what Adrien avoided in his solution for Tracker.
When Alexander suggested this in a comment [2] on the report, we ported
that lightweight serialization solution to D-Bus's FD passing. The
reason why the solution is fast(er) is not FD passing vs. a socket or
any other 'wire' (although it makes a small difference), but the fact
that serialization is almost unnecessary: offsets are pre-calculated
and passed, strlen and malloc are avoided, etc.
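
As a rough illustration of what "offsets are pre-calculated, strlen and
malloc are avoided" can look like, here is a minimal sketch. The layout
and the names row_pack and row_column are hypothetical and are not
Tracker's actual format; the sketch assumes the sender already knows
each column's length (for example because the database reported it) and
that both ends share the same byte order.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical row layout (an assumption, not Tracker's real format):
 *   uint32_t n_columns
 *   uint32_t end_offset[n_columns]  offset just past each column's NUL,
 *                                   relative to the start of data[]
 *   char     data[]                 NUL-terminated strings, back to back
 */

/* Packs n columns into buf. Lengths are supplied by the caller, so no
 * strlen is needed, and buf is caller-provided, so no malloc either.
 * Returns the number of bytes used, or 0 if buf is too small. */
size_t row_pack(uint8_t *buf, size_t cap,
                const char *const *cols, const uint32_t *lens, uint32_t n)
{
    size_t header = sizeof(uint32_t) * (1 + (size_t)n);
    size_t total = header;
    uint8_t *data = buf + header;
    uint32_t end = 0;

    for (uint32_t i = 0; i < n; i++)
        total += (size_t)lens[i] + 1;
    if (total > cap)
        return 0;

    memcpy(buf, &n, sizeof n);
    for (uint32_t i = 0; i < n; i++) {
        memcpy(data + end, cols[i], lens[i]);
        data[end + lens[i]] = '\0';
        end += lens[i] + 1;
        memcpy(buf + sizeof(uint32_t) * (1 + (size_t)i), &end, sizeof end);
    }
    return total;
}

/* Returns a pointer into buf for column i and stores its length in *len.
 * The receiver neither copies the string nor scans it for its length. */
const char *row_column(const uint8_t *buf, uint32_t i, uint32_t *len)
{
    uint32_t n, start = 0, end;

    memcpy(&n, buf, sizeof n);
    if (i >= n)
        return NULL;
    if (i > 0)
        memcpy(&start, buf + sizeof(uint32_t) * i, sizeof start);
    memcpy(&end, buf + sizeof(uint32_t) * (1 + (size_t)i), sizeof end);
    *len = end - start - 1;
    return (const char *)(buf + sizeof(uint32_t) * (1 + (size_t)n) + start);
}
```

The packed buffer can be written to a socket or pipe as-is; the reader
only needs row_column to find a value.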

When you add layers like QtDBus or dbus-glib on top of a layer that
already makes a copy of the message and does malloc and strlen, then
yes, things are slow. Unsurprisingly. I'm sure that whoever investigates
the slowness of D-Bus will come to similar conclusions.

Unfortunately, a shiny new D-Bus library is making a similar mistake
with its g_dbus_message_to_blob call in the
write_message_continue_writing and maybe_write_next_message code in its
gdbusprivate.c.

I briefly discussed this with Ryan, who worked on GVariant. He told me
that one of the reasons it is like this is that D-Bus's wire protocol
wants the total message size upfront.

With GDBusMessage's GVariant you can only get this total size by
iterating over the entire variant. Afterward you'd have to iterate again
to write the variant's data to the 'wire'. Two iterations would probably
not be much faster than an allocation and a copy of the variant to a
blob before sending it.

It stuns me a little bit that, while the programmer builds a
GDBusMessage (using a GVariant builder or whatever), the library can't
internally keep a correct running message length.
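
A minimal sketch of that idea, using a hypothetical msg_builder type
that has nothing to do with the real GDBusMessage or GVariant API: every
append updates a running total, so the size is known without a separate
pass over the message. It deliberately ignores the alignment padding
that real D-Bus marshalling inserts, which is part of what makes the
real problem harder.

```c
#include <stddef.h>

#define MSG_MAX_PARTS 16

/* One piece of message payload; the builder only references it. */
struct msg_part {
    const void *data;
    size_t len;
};

/* Hypothetical builder (an assumption for illustration only). */
struct msg_builder {
    struct msg_part parts[MSG_MAX_PARTS];
    size_t n_parts;
    size_t total_len;   /* kept correct by msg_append() */
};

void msg_init(struct msg_builder *b)
{
    b->n_parts = 0;
    b->total_len = 0;
}

/* Appends a part without copying it and updates the running length.
 * Returns 0 on success, -1 if the builder is full. */
int msg_append(struct msg_builder *b, const void *data, size_t len)
{
    if (b->n_parts == MSG_MAX_PARTS)
        return -1;
    b->parts[b->n_parts].data = data;
    b->parts[b->n_parts].len = len;
    b->n_parts++;
    b->total_len += len;
    return 0;
}
```

With something like this, the size that goes upfront on the 'wire' is
just b->total_len at the moment of sending.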

I read the code of GDBusMessage and GVariant and understand why it's
impossible at the moment (unless GVariant becomes intelligent about
D-Bus's wire format, or acquires an 'i-changed' signal that the parent
GDBusMessage listens for to update its own message length).

But still ..

I have not read libdbus's DBusMessage handling but I guess it uses a
similar technique (copy to a blob, then send). Measurements in Adrien's
tests indicate that it does (look at libdbus's memory usage in the
report [1] -- at the point of sending the message there's a small spike
in the client).

It might also be useful if the 'wire' protocol didn't require putting
the total size upfront, of course.

Ideally the library would already write chunks of data to the 'wire'
inside the iteration loop over such a 'message', so that no conversion
or copy to a blob is needed (writing to the socket would happen in the
code that is now the body of g_dbus_message_to_blob).
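
One way to send chunks without first copying them into a contiguous
blob is scatter-gather I/O. The sketch below is only an illustration
with a made-up framing (a 4-byte length in host byte order followed by
the payload), not the D-Bus wire format, and send_chunks is a
hypothetical helper. It assumes the chunk lengths are already known and
does not handle short writes, which real code would have to loop over.

```c
#include <stdint.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

#define MAX_CHUNKS 15

/* Writes a 4-byte total length followed by all chunks with a single
 * writev() call. The chunks are handed to the kernel where they live;
 * nothing is copied into an intermediate buffer in user space.
 * Returns the number of bytes written, or -1 on error. */
ssize_t send_chunks(int fd, const struct iovec *chunks, int n_chunks)
{
    struct iovec iov[1 + MAX_CHUNKS];
    uint32_t total = 0;

    if (n_chunks < 0 || n_chunks > MAX_CHUNKS)
        return -1;

    /* Summing known lengths is cheap; it does not touch the data. */
    for (int i = 0; i < n_chunks; i++)
        total += (uint32_t)chunks[i].iov_len;

    iov[0].iov_base = &total;
    iov[0].iov_len = sizeof total;
    memcpy(&iov[1], chunks, sizeof(struct iovec) * (size_t)n_chunks);

    return writev(fd, iov, 1 + n_chunks);
}
```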

...

Today, in practice, all applications that pass significant amounts of
data around should use D-Bus's FD passing. Tracker does it, GVfs does
it. It's proven technology that works great.
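
For readers unfamiliar with the mechanism: on Unix, D-Bus's FD passing
builds on SCM_RIGHTS ancillary messages over a Unix domain socket. The
sketch below shows only that underlying kernel mechanism, not the
libdbus or GDBus API; send_fd and recv_fd are hypothetical helper names.
Once the receiver holds the descriptor (for example one end of a pipe),
bulk data can flow over it directly in whatever lightweight format both
sides agree on.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Sends file descriptor fd over the Unix domain socket sock.
 * Returns 0 on success, -1 on error. */
int send_fd(int sock, int fd)
{
    char dummy = 'F';   /* at least one byte of real data must be sent */
    struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
    union {
        struct cmsghdr align;   /* forces correct alignment of buf */
        char buf[CMSG_SPACE(sizeof(int))];
    } u;
    struct msghdr msg;
    struct cmsghdr *c;

    memset(&msg, 0, sizeof msg);
    memset(&u, 0, sizeof u);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = u.buf;
    msg.msg_controllen = sizeof u.buf;

    c = CMSG_FIRSTHDR(&msg);
    c->cmsg_level = SOL_SOCKET;
    c->cmsg_type = SCM_RIGHTS;
    c->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(c), &fd, sizeof fd);

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receives a file descriptor from sock.
 * Returns the new descriptor, or -1 on error. */
int recv_fd(int sock)
{
    char dummy;
    struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
    union {
        struct cmsghdr align;
        char buf[CMSG_SPACE(sizeof(int))];
    } u;
    struct msghdr msg;
    struct cmsghdr *c;
    int fd = -1;

    memset(&msg, 0, sizeof msg);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = u.buf;
    msg.msg_controllen = sizeof u.buf;

    if (recvmsg(sock, &msg, 0) != 1)
        return -1;
    c = CMSG_FIRSTHDR(&msg);
    if (c == NULL || c->cmsg_level != SOL_SOCKET || c->cmsg_type != SCM_RIGHTS)
        return -1;
    memcpy(&fd, CMSG_DATA(c), sizeof fd);
    return fd;
}
```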

--- 
[1] http://pvanhoof.be/blog/index.php/2010/05/13/ipc-performance-the-report
    http://blogs.gnome.org/abustany/2010/05/20/ipc-performance-the-return-of-the-report/
[2] http://pvanhoof.be/blog/index.php/2010/05/13/ipc-performance-the-report#comment-2198

> - It should be possible to make apps using libdbus to still work with
> this new structure, to retain application compatibility. Obviously
> wire protocol disappears completely.

> - Retain the current serialization functionality. However, provide a
> way to skip it (since people say it's one reason why dbus is slow) and
> transmit raw binary data very quickly.
> 
> Implementing this might be easier than fixing dbus, and it could get
> rid of thread synchronization problems. It would definitely fix speed.
> Thoughts?

Cheers,

Philip

-- 

Philip Van Hoof
freelance software developer
Codeminded BVBA - http://codeminded.be