[PATCH 0/5] RFC: Multicast and filtering features on AF_UNIX

Fri Sep 24 18:09:36 PDT 2010

Hi,

Dropping most of the CC list since my comments are really about dbus,
not the kernel patches.

On Fri, Sep 24, 2010 at 1:22 PM, Alban Crequy
<alban.crequy at collabora.co.uk> wrote:
> Another possibility is to add the needed features directly in AF_UNIX
> sockets (and avoid to create a new address family for D-Bus):
>
> - multicast
> - some kind of BSD socket filters?
> - untables?
>

It seems like the routing is the handwavy part (I mean, that is what
the userspace dbus-daemon is doing). Any client connected to the
daemon can end up getting any message depending on what that client
has requested. Perhaps it would be possible to optimize only the very
simplest, but also most common, case, which is that a message has a
specific client as destination, and no other clients have asked to
eavesdrop on that message. I don't know what percentage of dbus
traffic this would cover but it might be a lot (unless running some
kind of monitor app). The bus daemon could maintain a table of
destinations that are safe to route in this way.
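
To make the "table of safe destinations" idea concrete, here is a rough
sketch. Everything in it is an assumption for illustration: the
FastPathTable class, its method names, and the simple eavesdropper
counter are hypothetical and are not how dbus-daemon is actually
structured.

```python
class FastPathTable:
    """Hypothetical table of destinations that can be routed directly."""

    def __init__(self):
        self.owners = {}        # bus name -> connection handle
        self.eavesdroppers = 0  # clients that asked to see others' messages

    def add_owner(self, name, conn):
        self.owners[name] = conn

    def remove_owner(self, name):
        self.owners.pop(name, None)

    def add_eavesdropper(self):
        self.eavesdroppers += 1

    def remove_eavesdropper(self):
        self.eavesdroppers = max(0, self.eavesdroppers - 1)

    def fast_path_target(self, destination):
        """Return the single recipient, or None if full routing is needed."""
        if destination is None:
            return None  # no destination (e.g. a broadcast): slow path
        if self.eavesdroppers > 0:
            return None  # someone else may be entitled to a copy
        return self.owners.get(destination)  # None if the name is unowned
```

The point of the design is that the slow path stays the default: any
doubt (no destination, unknown name, a monitor attached) falls back to
the normal routing code, so the fast path only has to be right for the
simple case.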

One question I have is how the performance issues you are seeing break
down among:
* just plain overhead (parsing protocol, routing); I think if you
compare to rawest-possible no-protocol sockets that there is quite a
bit of this overhead in dbus
* context switches
* copying data

Those seem like three things with distinct possible solutions.

For copying data, some way for dbus to go from 1 incoming socket to N
outgoing sockets without having to copy into a buffer first would seem
very useful.

For all I know (vm)splice() or tee() or something can already be made
to do this. I'm not sure I understand these calls but it sounds
roughly like you might have to put a pipe in front of all the sockets
to use as the buffer, and then you could tee() from the incoming
buffer to N-1 outgoing buffers and finally splice() to the last
outgoing buffer, or something like that. Or if you need to look at
data (for header yes, for body maybe not) it sounds like you could
tee() the header into a pipe, then read() to parse it, then use the
tee'd copy to copy to the outgoing. No idea if that's actually a win.
Maybe only if there are lots of recipients.

You could imagine a protocol change designed to make messages routable
with a fixed-length short header, rather than the extensible header
format. The fixed-length short header could contain the destination
name with padding out to the fixed length. One way to accomplish this
would be to add a header flag (in the existing third byte) which would
mean that the header fields were in some particular order and possibly
that they had padding to fixed length. We could then read the header
assuming this flag was set; if it is, then the flag's semantics would
be such that the number of bytes read would be exactly the amount of
data needed to route the message, and such that the routing data could
be found at a fixed offset without parsing. The daemon would read
hoping for the flag; if the flag was unset it would parse normally, and
if it was set it would just look at the fixed offsets. Clients could
use the flag for "simple" messages and use extensible header format
when needed.
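
As a rough illustration of what "routing data at a fixed offset" could
look like, here is a sketch. The flag value 0x80, the 64-byte padded
destination field, and both function names are hypothetical; nothing
like this exists in the D-Bus specification. Only the first 12 bytes
(endianness, type, flags, version, body length, serial) follow the
existing header layout, and the sketch handles little-endian messages
only.

```python
import struct

FLAG_FIXED_ROUTING = 0x80  # hypothetical flag bit, not in the D-Bus spec
DEST_FIELD_LEN = 64        # hypothetical padded width for the destination

# endianness byte, message type, flags, protocol version,
# body length (uint32), serial (uint32): 12 bytes, little-endian
FIXED_PREFIX = struct.Struct("<cBBBII")
ROUTING_HEADER_LEN = FIXED_PREFIX.size + DEST_FIELD_LEN


def pack_simple_header(msg_type, body_len, serial, destination):
    """Build the hypothetical fixed-length routing header."""
    dest = destination.encode("ascii")
    if len(dest) > DEST_FIELD_LEN:
        # Long names would have to use the extensible header format.
        raise ValueError("destination too long for fixed-length header")
    prefix = FIXED_PREFIX.pack(b"l", msg_type, FLAG_FIXED_ROUTING, 1,
                               body_len, serial)
    return prefix + dest.ljust(DEST_FIELD_LEN, b"\0")


def route_destination(header):
    """Return the destination without parsing, or None to parse normally."""
    if len(header) < ROUTING_HEADER_LEN:
        return None
    if not header[2] & FLAG_FIXED_ROUTING:  # flags live in the third byte
        return None
    raw = header[FIXED_PREFIX.size:ROUTING_HEADER_LEN]
    return raw.rstrip(b"\0").decode("ascii")
```

With this layout the daemon could always read ROUTING_HEADER_LEN bytes
first, call route_destination(), and only fall back to the full parser
when it returns None.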

I really think there's quite a bit of win possible in the "just plain
overhead" category. Though it could end up needing some major
rewriting. One question is whether gdbus is already faster and maybe
easier to optimize, though that won't help with the daemon.

A consideration to keep in mind is that there's always been some
debate over whether clients of the daemon need to validate protocol.
Right now, with libdbus, both clients and the daemon validate. An
alternative is that only the daemon validates and all clients trust
the daemon. There's really no reason not to do that. However, if you
make the daemon not validate then all clients would have to, and so
this would probably be a false performance gain. For the session but
not the system bus, you could disable all validation for both daemon
and clients; however, that would make it possible for one buggy
client to trigger assertion failures and crash the entire session.
With good _dbus_return_if_fail/g_return_if_fail coverage this risk
might be low.

Havoc