Compatibility between D-Bus and kdbus

Wed Oct 1 06:33:01 PDT 2014

(Cc'd to the systemd mailing list because sd-bus is the reference
implementation of the user-space side of kdbus, but please join the dbus
list and follow up there if you are interested in D-Bus.)

I've recently been looking at kdbus as a transport for D-Bus messages,
and how compatible or otherwise it is with traditional stream-based
D-Bus (and in particular, dbus-daemon, the reference implementation of
stream-based D-Bus). My intention here is to replace dbus-daemon on
Linux with something that does not have dbus-daemon's limitations,
but avoid making existing software non-functional or insecure in the
process. See <https://bugs.freedesktop.org/show_bug.cgi?id=84188> for my
attempts to document the current state of kdbus in the D-Bus Specification.

I have not reviewed quality-of-implementation except where I happened to
notice things as I went past, and I have not yet looked at
Samsung/Tizen's patches adding a kdbus transport to libdbus and GDBus;
for now I'm using sd-bus as my only reference for the user-space parts.

Here are the likely compatibility issues that I can see in my first pass
through:

System bus access-control policy
================================

I think this is the biggest point of incompatibility. In dbus-daemon
there is a needlessly elaborate access-control language; in kdbus, there
is a much simpler and more realistic access-control language, which
specifically does not look into message payloads.

This is far easier to reason about than what dbus-daemon does, but it's
a problem for any system service that assumes that its existing
per-interface or per-method access-control will be applied, such as
Avahi denying access to SetHostName. It is not acceptable for such
services to get instant security flaws as a result of their D-Bus
implementation being upgraded from a version that does not support kdbus
to a version that does. Unfortunately, last time we discussed this, we
didn't have a particularly good solution.

If I remember correctly, the least bad solution anyone could think of
was to introduce a new pseudo-bus-type alongside DBUS_BUS_SYSTEM (and
its equivalent in other libraries like GDBus), perhaps called
"DBUS_BUS_SYSTEM_UNTRUSTED" or something (better names welcome), with
its own shared connection: connections to that bus type are not assumed
to filter messages by their payload, and method-call recipients are
expected to use Polkit or similar, or do their own simplistic
access-controls like "must be uid 0" by calling GetConnectionUnixUser or
GetConnectionCredentials on the sender's unique name.
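
As a rough illustration of the "must be uid 0" style of check, here is a
minimal sketch in Python. The get_connection_unix_user callable is only
a stand-in for asking the bus which uid owns a unique name (what
GetConnectionUnixUser does); require_uid, AccessDenied, fake_bus and
check are hypothetical names invented for this example and are not part
of any D-Bus library.

```python
from typing import Callable


class AccessDenied(Exception):
    """Raised when the caller is not allowed to invoke a method."""


def require_uid(sender: str,
                get_connection_unix_user: Callable[[str], int],
                allowed_uids: frozenset = frozenset({0})) -> int:
    """Return the sender's uid, or raise AccessDenied.

    `sender` is the unique name taken from the message (e.g. ":1.42").
    `get_connection_unix_user` is an assumed stand-in for a real
    GetConnectionUnixUser call on the bus driver.
    """
    uid = get_connection_unix_user(sender)
    if uid not in allowed_uids:
        raise AccessDenied(f"uid {uid} may not call this method")
    return uid


# Hypothetical stand-in for the bus: maps unique names to uids.
fake_bus = {":1.7": 0, ":1.42": 1000}


def check(sender: str) -> bool:
    """Convenience wrapper: True if the sender passes the uid check."""
    try:
        require_uid(sender, fake_bus.__getitem__)
        return True
    except AccessDenied:
        return False
```

The point of the sketch is that the decision is made by the recipient,
from the sender's identity, rather than by the bus looking at the
message payload.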

On non-kdbus systems, it would just be the same thing as
DBUS_BUS_SYSTEM; on kdbus systems, DBUS_BUS_SYSTEM would refuse to use
the kdbus transport, and fall back to stream-based D-Bus compatibility
mechanisms like systemd-bus-proxyd. That would enable individual
libraries and applications to opt-in to using this new shared connection
when they have been audited for safety, with the goal being that
everything eventually moves to it, and nothing connects to
DBUS_BUS_SYSTEM any more.
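
The transport choice described above could be sketched like this. It is
only a model of the proposal: BusType, choose_transport and the
"stream"/"kdbus" labels are hypothetical names for illustration, and
SYSTEM_UNTRUSTED mirrors the placeholder name suggested earlier, not an
existing constant.

```python
from enum import Enum


class BusType(Enum):
    SYSTEM = "system"                      # models DBUS_BUS_SYSTEM
    SYSTEM_UNTRUSTED = "system-untrusted"  # the proposed pseudo-bus-type


def choose_transport(bus_type: BusType, kdbus_available: bool) -> str:
    """Pick a transport for a system-bus connection under the proposal."""
    if not kdbus_available:
        # Non-kdbus systems: both bus types are the same thing.
        return "stream"
    if bus_type is BusType.SYSTEM:
        # Unaudited users keep payload-based policy by refusing kdbus
        # and going through a compatibility mechanism instead.
        return "stream"
    # Audited, opted-in users may talk kdbus directly.
    return "kdbus"
```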

I'm also very tempted to propose a syntax for an opt-in kdbus-like
security model (which would take precedence over system.conf/system.d)
via adding lines to .service files, so that individual services can have
a sane security model on non-kdbus or non-Linux systems, and systemd's
systemd-dbus1-generator could use those lines as input. If we get far
enough with moving system services to that, maybe the transition can be
easier.

Session bus access-control policy
=================================

In principle, people could configure the session bus to do the same
elaborate access-control as the system bus. In practice, this is not a
particularly useful thing to do, because there are many ways for
processes running under the same uid to subvert each other, particularly
if an LSM like SELinux or AppArmor is not used.

kdbus does not appear to make any attempt to protect a uid from itself:
the uid that created a bus is considered to be privileged on that bus. I
assume this means that the intention is that app sandboxing will use a
separate Unix uid, like it does on Android?

Unless there's an outcry from people who like LSMs, I'm inclined to say
that protecting same-uid session processes from each other is doomed to
failure, and hence that it's OK for DBUS_BUS_SESSION to connect to kdbus
without special precautions.

Resource limits
===============

Some resource limits are lower in kdbus than in dbus-daemon.

In kdbus, the number of unread messages per recipient is limited to 256,
with up to 16 per uid; subsequent broadcasts are silently dropped, and
subsequent unicast messages cause the sender to block.

The message header (fixed-length header and header fields) is limited to
2 MiB in kdbus, whereas on dbus-daemon it may be up to a configurable
limit, by default 32 MiB for the system bus or 128 MiB for the session
bus. Broadcast messages (header + body) have the same 2 MiB limit, but
unicast message bodies may be any size: kdbus itself does not impose any
limit. I don't know whether anything sends broadcasts as large as 2 MiB
(Tracker perhaps?): if you do, please share.
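
To make the size rules concrete, here is a small sketch of a check an
application could run on its own messages. The function and constant
names are made up for this example, and treating a message of exactly
2 MiB as acceptable is an assumption; the real boundary behaviour would
need checking against kdbus itself.

```python
MIB = 1024 * 1024

# Limits as described in the text above.
KDBUS_HEADER_LIMIT = 2 * MIB      # fixed-length header + header fields
KDBUS_BROADCAST_LIMIT = 2 * MIB   # header + body, broadcasts only


def fits_kdbus(header_len: int, body_len: int, broadcast: bool) -> bool:
    """Return True if a message of this shape is within the kdbus limits."""
    if header_len > KDBUS_HEADER_LIMIT:
        return False
    if broadcast and header_len + body_len > KDBUS_BROADCAST_LIMIT:
        return False
    # Unicast bodies are not limited by kdbus itself.
    return True
```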

In kdbus, each uid may create up to 16 buses. In dbus-daemon there is no
limit. I do not anticipate this being a problem: the reference
implementation of kdbus' user-space, in systemd, seems to be using a
per-uid user bus instead of a session bus. Also, even if we continued to
use a session bus per login, 16 sounds like a reasonable number.

In kdbus, each uid may connect to each bus up to 256 times. I think this
is actually somewhat likely to be a practical problem: I currently have
46 connections to my session bus, so I'm less than a factor of six away
from the session breaking.
<https://code.google.com/p/d-bus/issues/detail?id=9>

In kdbus, each connection may own up to 64 well-known names; the system
dbus-daemon defaults to 512, and the session to 50 000. 64 is *probably*
enough, but I could potentially see this becoming an issue for services
that allocate one well-known name per $app_specific_thing, like
Telepathy (one name per Connection).

In kdbus, each connection may have 256 bloom filter entries, which AIUI
are slightly less expressive than match rules (one match rule maps to
one or more bloom filter entries). The system dbus-daemon defaults to
allowing
512 match rules, and the session to 50 000. Again, I could potentially
see this being an issue for existing code: Alban's benchmarking[1] for
GNOME + Telepathy back in 2011 revealed a peak of 81 match rules,
although admittedly some of those were due to a dbus-glib bug[2]. QtDBus
adds more / finer-grained match rules than GDBus or dbus-glib, so it
might fare worse. I've opened a bug with a possible mitigation:
<https://code.google.com/p/d-bus/issues/detail?id=10>

[1] http://people.collabora.com/~alban/d/2011/02/gnome-telepathy.csv
[2] https://bugs.freedesktop.org/show_bug.cgi?id=33646
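
For readers unfamiliar with bloom filters, the sketch below shows the
general technique and why it is less expressive than an exact match
rule: membership tests can give false positives, so a "match" only
means "possibly matches". This is a generic textbook bloom filter, not
kdbus' actual hashing scheme or parameters; the class name, the sizes
and the "interface:..." key format are all assumptions for
illustration.

```python
import hashlib


class BloomFilter:
    """Generic bloom filter over strings (illustrative parameters)."""

    def __init__(self, size_bits: int = 512, num_hashes: int = 3):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = 0  # bit set stored in a Python integer

    def _positions(self, item: str):
        # Derive num_hashes bit positions from salted SHA-256 digests.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def might_contain(self, item: str) -> bool:
        # True means "possibly present"; False means "definitely absent".
        return all((self.bits >> pos) & 1 for pos in self._positions(item))
```

A filter built from a message's properties (for example
"interface:org.example.Foo") never misses an item that was added, but
may occasionally report an item that was not, so the recipient still
has to apply the exact rule afterwards.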

kdbus has a hard maximum on the reply timeout, whereas stream-based
D-Bus has DBUS_TIMEOUT_INFINITE. However, the hard maximum is nearly 585
years, so I don't see this being a practical problem :-)
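
The figure is what you get if the timeout is held as an unsigned 64-bit
count of nanoseconds. That representation is an assumption made here to
explain the number, not something stated above, but the arithmetic is
easy to check:

```python
# Assumption: the timeout is an unsigned 64-bit count of nanoseconds.
MAX_TIMEOUT_NS = 2**64 - 1

SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60  # Julian year

max_timeout_years = MAX_TIMEOUT_NS / 1e9 / SECONDS_PER_YEAR
print(f"{max_timeout_years:.1f} years")
```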

kdbus allows 128 pending replies at a time (i.e. parallel method calls)
per sender connection; this matches the system dbus-daemon, but is less
than the session dbus-daemon. I don't think this is going to be a
practical problem.

kdbus' handling of resource limits is at least considerably more
graceful than in stream-based D-Bus: it can return an error immediately
and continue processing, rather than having to disconnect the offending
sender.

fd-passing
==========

In stream-based D-Bus, any file descriptor may be attached to a message
whose transport is a Unix domain socket, including another Unix domain
socket. In kdbus, kdbus file descriptors and Unix domain sockets are
currently specifically disallowed, to avoid recursion. I am not aware of
any applications that actually do this: the developers of Tracker
considered it, but ended up using a pipe() instead.

In stream-based D-Bus, it is valid to attach a file descriptor to a
broadcast message. In kdbus, it is not. I am not aware of any
applications that actually do this.

Bi-endian systems
=================

The reference implementation of kdbus' userland part, in systemd,
requires, for simplicity, that message payloads be in native
endianness.

It is not clear to me what this would mean for CPU architectures that
have runtime-switchable endianness, namely arm, powerpc and possibly
mips. On these architectures, does Linux effectively impose the
additional restriction that every process must run in the same
endianness as the kernel, or what?

Similarly, a big-endian per-process emulator like qemu-m68k on a
little-endian system like x86-64 would have to either not implement the
kdbus ioctls (resulting in emulated processes falling back to
stream-based D-Bus), or byteswap the message. It presumably already has
to know how to byteswap struct contents.
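
For context, a D-Bus message declares its own byte order in the first
byte of its fixed header ('l' for little-endian, 'B' for big-endian),
and the body length is the UINT32 at offset 4. The sketch below shows
how a receiver or emulator could detect a non-native message; the
function names are hypothetical, and a real byteswap would of course
have to walk the whole message according to its signature, which this
does not attempt.

```python
import struct
import sys

# Endianness flag values from the D-Bus wire format.
ENDIAN_FLAGS = {ord("l"): "little", ord("B"): "big"}


def message_endianness(message: bytes) -> str:
    """Return "little" or "big" from the first byte of the header."""
    try:
        return ENDIAN_FLAGS[message[0]]
    except (IndexError, KeyError):
        raise ValueError("bad or missing endianness flag") from None


def is_native(message: bytes) -> bool:
    """True if the message is in this process's native byte order."""
    return message_endianness(message) == sys.byteorder


def read_body_length(message: bytes) -> int:
    """Decode the body-length field, honouring the declared byte order."""
    fmt = "<I" if message_endianness(message) == "little" else ">I"
    return struct.unpack_from(fmt, message, 4)[0]
```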

Message ordering
================

The D-Bus Specification still doesn't define message ordering, which is
a significant omission. However, dbus-daemon has always imposed a "total
order" on messages: if two connections A and B both observe messages M1
and M2, and A observes M1 before M2, then B cannot observe M2 before M1,
unless it uses a library API that "jumps the queue" by using
pseudo-blocking[3].

[3] http://smcv.pseudorandom.co.uk/2008/11/nonblocking/
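
The property can be stated as a simple check over two observers' logs:
restricted to the messages both of them saw, the two sequences must be
identical. The helper below is a sketch for illustration only (the name
is made up, and it assumes each message appears at most once per log).

```python
def same_relative_order(seen_by_a: list, seen_by_b: list) -> bool:
    """True if A and B agree on the order of every message both saw."""
    common = set(seen_by_a) & set(seen_by_b)
    order_a = [m for m in seen_by_a if m in common]
    order_b = [m for m in seen_by_b if m in common]
    return order_a == order_b
```

For example, logs of [M1, M2, M3] and [M1, M3] are consistent with a
total order, whereas [M1, M2] and [M2, M1] are not.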

kdbus has these departures from total ordering:

* If A is the addressed recipient of M1 and M2, and B is an
  eavesdropper, B might see M2 first. This could be addressed,
  at some performance cost, by making sure to hold a lock while
  delivering messages if there is at least one eavesdropper.

* There are ioctl APIs that cause messages to "jump the queue",
  either based on priority or by making a synchronous method call.

* Some operations that are method calls in stream-based D-Bus are
  synchronous ioctls in kdbus. This can result in apparently
  paradoxical situations like seeing a name in the equivalent of
  ListNames before receiving the notification that it has an owner,
  because the notification is processed asynchronously.
  (Mitigation: it is fairly common to use "pseudo-blocking"
  for these calls anyway.)

Comments, corrections? (To the dbus list, please.)

Regards,
    S