"DBus Embedded" - a clean break

Thu Jan 20 10:05:08 PST 2011

Hi,

You're just assuming a bunch of stuff here.

On Thu, Jan 20, 2011 at 2:53 AM, Ville M. Vainio <vivainio at gmail.com> wrote:
> - It could be much (10x?) faster - switch to shared memory, posix
> message queues for data transmission (references to relevant shared
> memory blocks), do not do any verification

Skipping verification is clearly faster, but that is already possible
with a 1-line change to the current code (grep for
DBUS_VALIDATION_MODE_DATA_IS_UNTRUSTED).

Shared memory and message queues are, at best, only hypothetically
faster than unix sockets. Once you add the locking and other crap
needed to use them, you are likely to end up without a huge win.
Unix sockets are *fast*.

Which implies that if you profile you might find unix sockets are not
the problem.
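
A minimal sketch of how one might measure that before blaming the
transport, assuming Python on a Unix-like system. It times blocking
round trips over a plain socketpair with no D-Bus involved; the function
names, message size, and iteration count are illustrative choices, not
anything from libdbus.

```python
import socket
import threading
import time


def recv_exact(sock, size):
    """Read exactly `size` bytes from a stream socket."""
    chunks = []
    remaining = size
    while remaining:
        chunk = sock.recv(remaining)
        if not chunk:
            raise ConnectionError("peer closed the socket")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def echo(sock, count, size):
    """Echo `count` fixed-size messages back to the sender."""
    for _ in range(count):
        sock.sendall(recv_exact(sock, size))


def mean_round_trip_seconds(count=10000, size=128):
    """Average time for one blocking send/echo over a socketpair."""
    client, server = socket.socketpair()  # AF_UNIX where the platform has it
    worker = threading.Thread(target=echo, args=(server, count, size))
    worker.start()
    payload = b"x" * size
    start = time.perf_counter()
    for _ in range(count):
        client.sendall(payload)
        recv_exact(client, size)
    elapsed = time.perf_counter() - start
    worker.join()
    client.close()
    server.close()
    return elapsed / count


if __name__ == "__main__":
    print(f"{mean_round_trip_seconds() * 1e6:.1f} us per round trip")
```

Comparing this number against the per-call cost seen in the real app
gives a rough idea of how much of the time is the socket and how much
is everything layered on top of it.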

If they are the problem, though, adding a different transport to the
current code is straightforward.

> - Switch to peer-to-peer communication as soon as possible while doing
> the handshake through daemon (minimize context switches, transmission
> counts).

Just make a peer-to-peer connection with existing dbus; both libdbus
and other implementations can do it. Then you can profile this.

Why not profile other implementations, btw, such as gdbus?

It doesn't make sense to say "just switch to peer to peer" because p2p
doesn't have the same functionality. You can already choose daemon or
direct, but if you need the daemon's functionality, then you need it.

In a desktop context, peer to peer also has the major downside that
you now need (in theory) on the order of N^2 sockets, one per pair of
apps, instead of N sockets, where N = number of apps. So while it
might be faster, you might also run out of file descriptors. Which
actually used to happen with ORBit, btw.
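
As a rough back-of-envelope sketch of that growth, assuming the worst
case where every app holds a direct connection to every other app (one
connection per pair), compared with one connection per app to a central
daemon. The function names are made up for illustration.

```python
def daemon_connections(apps):
    """One connection from each app to the bus daemon."""
    return apps


def full_mesh_connections(apps):
    """One direct connection for every distinct pair of apps."""
    return apps * (apps - 1) // 2


if __name__ == "__main__":
    for n in (10, 50, 100):
        print(n, daemon_connections(n), full_mesh_connections(n))
```

Each connection also costs a file descriptor at both ends, so an app
that talks to everyone else holds N-1 descriptors instead of one.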

> - Retain the current serialization functionality. However, provide a
> way to skip it (since people say it's one reason why dbus is slow) and
> transmit raw binary data very quickly.

What is your "raw binary" that is so different from the current wire
protocol? The current protocol is a binary protocol. The
representation of a double is just the same as the machine's in-memory
representation of a double, for example.
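
A small sketch of that point, assuming Python on a platform where a C
double is an 8-byte IEEE 754 value (true of the usual targets). It
compares the raw in-memory bytes of a C double with the same value
packed as an 8-byte double in native byte order, which is the form a
sender writes when it marshals in its own endianness. The helper name
is hypothetical.

```python
import ctypes
import struct


def in_memory_and_wire(value):
    """Return (raw bytes of a C double, 8-byte native-order packed double)."""
    in_memory = bytes(ctypes.c_double(value))  # what sits in process memory
    wire = struct.pack("=d", value)  # IEEE 754 double, native byte order
    return in_memory, wire


if __name__ == "__main__":
    memory_bytes, wire_bytes = in_memory_and_wire(1.5)
    print(memory_bytes.hex(), wire_bytes.hex(), memory_bytes == wire_bytes)
```

There is no text encoding or conversion step to skip here; what is left
is alignment padding and, on the receiving side, validation.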

> Implementing this might be easier than fixing dbus, and it could get
> rid of thread synchronization problems. It would definitely fix speed.
> Thoughts?

1. I agree that you could write a much faster library. libdbus is far
too flexible (configurable main loop, oom handling, strict validation,
etc.) to be as fast as possible.

2. I don't think you have a clue why libdbus is slow or why a new lib
would be faster, so you'd better figure that out first. *Removing*
flexibility will be key. For example, eliminating the main loop and
allowing only blocking usage is an option, because it lets you avoid
copies into an intermediate buffer/queue.

3. Some existing reimplementations such as gdbus may already be faster
or have the potential to be.

4. I do think if the speed was an issue on desktop, someone would have
tried to address some of the easy wins. I suspect some of the speed
issues on embedded are from people trying to use the wrong tool for
the job. e.g. if you think removing the daemon is the solution, why
were you using the daemon anyhow?

5. There *are* some really good things about libdbus efficiency-wise:
it works pretty hard to avoid extra memcpy() of messages, both while
they are modified and as they move through the daemon, and it works
very hard to let you avoid blocking round trips.

Btw, if your benchmark or app code is synchronous (blocks for each
reply before sending the next message) then that is the first thing to
fix. Don't even begin to look at dbus until you get rid of that. If
your API requires a lot of blocking (say you need to get a foo, then
pass it in to get a bar, then pass the bar in to get a baz, etc.) then
FIX YOUR API. That kind of round-trip mess is inherently slow as hell
no matter what you do to your IPC layer.
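
To make the difference concrete, here is a minimal sketch, assuming
Python on a Unix-like system and a plain echo peer over a socketpair
rather than a real D-Bus connection. It sends the same requests twice:
once blocking for each reply, once sending everything while a separate
thread collects replies. All names and sizes are illustrative.

```python
import socket
import threading
import time

MSG_SIZE = 64  # fixed-size messages keep the framing trivial


def request(i):
    """Build a fixed-size request carrying its sequence number."""
    return str(i).encode().ljust(MSG_SIZE, b".")


def recv_exact(sock, size):
    """Read exactly `size` bytes from a stream socket."""
    data = b""
    while len(data) < size:
        chunk = sock.recv(size - len(data))
        if not chunk:
            raise ConnectionError("peer closed the socket")
        data += chunk
    return data


def echo_server(sock, count):
    """Reply to each request with the same bytes."""
    for _ in range(count):
        sock.sendall(recv_exact(sock, MSG_SIZE))


def blocking_calls(sock, count):
    """Wait for every reply before sending the next request."""
    replies = []
    for i in range(count):
        sock.sendall(request(i))
        replies.append(recv_exact(sock, MSG_SIZE))
    return replies


def pipelined_calls(sock, count):
    """Send all requests without waiting; a reader thread collects replies."""
    replies = []

    def read_replies():
        for _ in range(count):
            replies.append(recv_exact(sock, MSG_SIZE))

    reader = threading.Thread(target=read_replies)
    reader.start()
    for i in range(count):
        sock.sendall(request(i))
    reader.join()
    return replies


def timed(call, count=2000):
    """Run `call` against a fresh echo peer; return (seconds, replies)."""
    client, server = socket.socketpair()
    worker = threading.Thread(target=echo_server, args=(server, count))
    worker.start()
    start = time.perf_counter()
    replies = call(client, count)
    elapsed = time.perf_counter() - start
    worker.join()
    client.close()
    server.close()
    return elapsed, replies


if __name__ == "__main__":
    for call in (blocking_calls, pipelined_calls):
        seconds, _ = timed(call)
        print(f"{call.__name__}: {seconds * 1e3:.1f} ms")
```

The pipelined form only works when requests do not depend on earlier
replies, which is exactly why a foo-then-bar-then-baz API forces the
slow path.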

The bottom line is that you need to profile the actual app that's a
problem, and start with that.

Havoc