Some comments on the D-Bus specification

Wed Apr 22 10:16:45 PDT 2015

On 22/04/15 15:15, Havoc Pennington wrote:
> On Tue, Apr 21, 2015 at 9:50 PM, George Spelvin <linux at horizon.com> wrote:
>> As part of the lkml debate on kdbus, I'm reading and trying to understand
>> the D-bus specification.

I'd be happy to review/merge spec patches clarifying these, but I'm not
able to write them right now, and I suspect I'd be more valuable as a
reviewer here anyway (since reviewer/committer time is often the thing
that's in short supply).

Some answers, though:

>> 1. Is there a "void" type?  A dictionary from key to void
>>    seems like a natural way of representing a set of keys.

No, there isn't. Representing sets as arrays, and documenting that
members are not repeated, is conventional. IIRC that's also a more
efficient wire-protocol encoding for sets of small objects, because
dict-entries are 8-byte-aligned, whereas array members have their
natural alignment (often 4 bytes).
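
As a rough illustration of the size difference, the sketch below counts
marshalled bytes for a set of uint32 values sent as "au" versus a
stand-in dict "a{uy}" with a dummy byte value. The dummy value is an
assumption made for the example (there is no void type), the array is
assumed to start at offset 0, and the helper names are invented here.

```python
def pad_to(offset: int, alignment: int) -> int:
    """Round offset up to the next multiple of alignment."""
    return offset + (-offset % alignment)


def size_as_array_of_uint32(n: int) -> int:
    """Bytes used by an "au" array with n members, starting at offset 0."""
    offset = 4                      # uint32 array length
    offset = pad_to(offset, 4)      # pad to the element alignment
    for _ in range(n):
        offset = pad_to(offset, 4) + 4
    return offset


def size_as_dict_with_dummy_byte(n: int) -> int:
    """Bytes used by an "a{uy}" array with n entries, starting at offset 0."""
    offset = 4                      # uint32 array length
    offset = pad_to(offset, 8)      # dict-entries are 8-byte-aligned
    for _ in range(n):
        offset = pad_to(offset, 8)      # start of this dict-entry
        offset = pad_to(offset, 4) + 4  # uint32 key
        offset += 1                     # dummy byte value
    return offset
```

With three members this gives 16 bytes for the array form and 29 bytes
for the dict form.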

Note that kdbus is not D-Bus 1.0, and the user-space protocol layered
on top of kdbus does not use the D-Bus 1.0 wire protocol
<https://bugs.freedesktop.org/show_bug.cgi?id=84188>. (The kernel parts
deliberately don't care what you send.)

kdbus could in principle allow the GVariant type extensions - maybe/m,
empty struct/(), and isolated dict-entries. I would advise against it
though - they don't necessarily add enough additional value for it to be
worthwhile to think about the feature-negotiation - and AIUI systemd's
implementation of the user-space counterpart for kdbus does not allow them.

>> 2. Is alignment padding only before objects, or also after?

The canonical reference would be the implementation, but I believe the
answer is "only before".
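
A minimal sketch of "only before", assuming little-endian encoding and
covering just two basic types (the function names are made up for the
example): each value is preceded by whatever padding its own alignment
needs, and nothing is appended after it.

```python
import struct


def pad_before(buf: bytearray, alignment: int) -> None:
    """Append zero bytes until len(buf) is a multiple of alignment."""
    buf.extend(b"\x00" * (-len(buf) % alignment))


def put_byte(buf: bytearray, value: int) -> None:
    # A byte has 1-byte alignment, so no padding is ever needed.
    buf.append(value)


def put_uint32(buf: bytearray, value: int) -> None:
    # Padding goes in front of the value, never behind it.
    pad_before(buf, 4)
    buf.extend(struct.pack("<I", value))


def demo() -> bytes:
    buf = bytearray()
    put_uint32(buf, 1)   # offsets 0..3
    put_byte(buf, 2)     # offset 4, nothing added after it
    put_uint32(buf, 3)   # 3 padding bytes first, then offsets 8..11
    return bytes(buf)
```

If the message ended after the byte, it would be 5 bytes long, not 8.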

Note that the user-space parts of kdbus do not use the D-Bus 1.0
wire-protocol encoding: instead, they use GVariant, which has different
rules. <https://bugs.freedesktop.org/show_bug.cgi?id=84188>
The kernel part doesn't care what you send; the payload is an opaque
binary blob from the kernel's point of view.

>> 3. Why is the request serial number passed in the reply's REPLY_SERIAL
>>    header extension field, and not simply in the reply header serial
>>    number?

"Because that's how Havoc wrote it". I believe the conceptual model is
that literally every message has a serial number, whether it would make
sense to reply to it or not (it doesn't make sense to reply to
broadcasts, but they still get serial numbers), rather than having a
more complicated rule about "messages only have a serial number if they
need it" and having to define which ones need it.

>>    Can METHOD_RETURN or ERROR messages themselves generate errors?

Yes in theory, but in practice I think only dbus-daemon does this, and
everyone ignores them because you can't know whether to expect one (so
they're only practically useful for logging/monitoring, not for APIs).

>> 4. May interface names end with a period?

No. (Interface, member) pairs are sometimes written like
"org.freedesktop.DBus.Introspectable.Introspect" in pseudocode or in
mini-languages such as IDLs and command-line interfaces, but the last
dot there is just syntax: the interface is
"org.freedesktop.DBus.Introspectable" without a trailing dot, and the
member is "Introspect".

>>    It seems that a simple way to specify an interface name is
>>    with the regexp:
>>
>>    [a-zA-Z0-9_]+(\.[a-zA-Z0-9_]+)+\.?

A trailing dot is not allowed. Otherwise, I think that's accurate,
assuming Perl5/Python/PCRE/PHP/JS-compatible syntax (which is
unfortunately not the only regexp syntax).
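
For what it's worth, here is that regexp in Python with the trailing
dot removed. This is only a sketch: it takes the regexp from the thread
as-is and does not try to encode any other rules (for instance the
length limit on names), so libdbus remains the reference. The function
name is invented for the example.

```python
import re

# The regexp quoted above, minus the optional trailing dot.
INTERFACE_RE = re.compile(r"[a-zA-Z0-9_]+(\.[a-zA-Z0-9_]+)+")


def looks_like_interface_name(name: str) -> bool:
    """True if the whole string matches the regexp from the thread."""
    # fullmatch avoids the usual "$ also matches before a final newline" trap.
    return INTERFACE_RE.fullmatch(name) is not None
```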

>> 5. May bus names contain a colon?  The specification simultaneously
>>    requires a leading colon and forbids any colons in bus names.

I think the rule that is actually enforced is that unique connection
names must start with a colon and contain no further colons, so:

>>    The specification appears to say that a valid well-known bus name
>>    matches the regexp
>>
>>    [-a-zA-Z_][-a-zA-Z0-9_]*(\.[-a-zA-Z_][-a-zA-Z0-9_]*)+\.?

A trailing dot is not allowed, but otherwise that looks accurate.

>>    and a unique connection name matches the regexp
>>
>>    [-a-zA-Z0-9_]+(\.[-a-zA-Z0-9_]+)+\.?

I think the correct form is ":[-a-zA-Z0-9_]+(\.[-a-zA-Z0-9_]+)+" (where
the double quotes are delimiters, not part of the regex).

If in doubt, the validation in libdbus is correct.

>>    Which is, I note, a strict superset of the well-known bus names.

The required leading ":" disambiguates. If you skip the ":", yes the
rest is a strict superset.
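
A small sketch of how the leading colon disambiguates, using the two
regexps from this thread without the trailing dot. The classifier
function is hypothetical, and libdbus's validation is still the thing
to trust if they disagree.

```python
import re

# Both regexps are taken from the discussion above, not from libdbus.
WELL_KNOWN_RE = re.compile(
    r"[-a-zA-Z_][-a-zA-Z0-9_]*(\.[-a-zA-Z_][-a-zA-Z0-9_]*)+")
UNIQUE_RE = re.compile(r":[-a-zA-Z0-9_]+(\.[-a-zA-Z0-9_]+)+")


def classify_bus_name(name: str) -> str:
    """Return "unique", "well-known" or "invalid" for a bus name."""
    if UNIQUE_RE.fullmatch(name):
        return "unique"
    if WELL_KNOWN_RE.fullmatch(name):
        return "well-known"
    return "invalid"
```

Here ":1.42" is classified as unique and "org.freedesktop.DBus" as
well-known, while "1.42" (no colon, leading digit) is rejected.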

>> 6. It might also be worth describing such names as "ASCII strings"

I would hope that that's well-understood :-) but if someone wants to
quote chapter and verse from ANSI X3.4-1968, I'd review a patch.

I don't think anything in D-Bus actually relies on the definition of
"ASCII string": everything is either machine-readable and in a defined
strict subset of ASCII, or the "string" type (which is valid Unicode
minus U+0000, encoded in canonical (i.e. not over-long) UTF-8).
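
A minimal sketch of that "string" rule as described here, assuming
Python's strict UTF-8 decoder (which rejects over-long encodings) is an
acceptable stand-in for what a D-Bus implementation checks. The
function name is made up.

```python
def is_valid_dbus_string(raw: bytes) -> bool:
    """Check the rule described above: valid UTF-8 with no U+0000."""
    if b"\x00" in raw:
        return False
    try:
        # Strict decoding rejects over-long forms such as b"\xc0\x80".
        raw.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return False
    return True
```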

>> 7. What if ownership of a DNS name changes?  Does that also transfer the
>>    right to define the interface?  Or only the right to revise it?

Like any social convention, this is up to your conscience. There is no
technical constraint preventing you from using just.what.are.namespaces
(or org.freedesktop.Telepathy for that matter) as your well-known bus
name, only the social constraint that you shouldn't. The use of reversed
domain names is a suggested way to get non-colliding names for APIs,
nothing more.

The only reversed domain name that is special to the protocol is
"org.freedesktop.DBus", which is syntactically a well-known name but
behaves like some sort of well-known/unique hybrid. kdbus does not have
this name, because there's no bus daemon there.

Similarly, there is no technical constraint preventing you from replying
to org.freedesktop.DBus.Introspectable.Introspect with a byte-array
containing a picture of your cat, just like there is no technical
constraint preventing you from redefining malloc() to always return
0x42, or using version number 1.2.4 to identify two different software
releases, or storing your holiday photos in /lib64. But you still
*shouldn't* do those things, and if you do, you get to keep both pieces :-)

If I were forced to define a rule, I'd say "the answer is the same as it
is for Java package naming, whatever that is" (since I believe Java is
where Havoc got this naming convention from).

(I do consider it to be always-a-bug if D-Bus peers can be crashed by
unexpected message contents, whether caused by a malicious sender or a
careless API break. They can't necessarily be expected to *work*, but
they should send back an error response if appropriate, log a warning if
appropriate, and generally fail gracefully, not crash with an assertion
failure.)

>> 8. In the DBUS_COOKIE_SHA1 mechanism, are the various strings
>>    hex-decoded before computing the SHA-1?
(also applies to point 9)

The SASLing is ill-specified, and I'd be delighted to review a patch
that describes what the interoperable implementations (libdbus, GDBus,
etc.) actually do.

Note that kdbus doesn't use this part of the specification: it always
uses the equivalent of D-Bus' EXTERNAL implementation, which uses
SO_PEERCRED or local equivalent.
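
For readers who haven't met SO_PEERCRED, here is a Linux-only sketch of
reading the peer's credentials from a connected AF_UNIX socket. It is
not how libdbus or kdbus is written; the helper names are invented, and
the (pid, uid, gid) layout is the Linux struct ucred.

```python
import os
import socket
import struct

# Linux struct ucred: pid_t pid; uid_t uid; gid_t gid.
UCRED_FORMAT = "iII"


def peer_credentials(sock: socket.socket) -> tuple[int, int, int]:
    """Return (pid, uid, gid) of the peer of a connected AF_UNIX socket."""
    raw = sock.getsockopt(socket.SOL_SOCKET, socket.SO_PEERCRED,
                          struct.calcsize(UCRED_FORMAT))
    pid, uid, gid = struct.unpack(UCRED_FORMAT, raw)
    return pid, uid, gid


def demo() -> tuple[int, int, int]:
    # A socketpair stands in for a real client/server connection.
    a, b = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        return peer_credentials(a)
    finally:
        a.close()
        b.close()
```

The point is that the kernel fills these in, so the peer cannot lie
about them the way it could in a protocol-level handshake.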

Note also that "dbus-daemon --system" only accepts SASL EXTERNAL by
default (and hence its default configuration will just not work on OSs
where we don't know how credentials-passing works - IMO this is a
feature not a bug). "dbus-daemon --session" will accept DBUS_COOKIE_SHA1
if requested, but practical clients all try EXTERNAL first (and some,
like sd-bus, do not support anything else).

I believe DBUS_COOKIE_SHA1 is intended for tenuous situations involving
"dbus-daemon --session" over TCP with an NFS-shared home directory,
hence it assumes a secure LAN and is rather dubious in general. If I was
any less concerned about compatibility, I'd delete it from dbus entirely.

>> 10. Is it worth documenting that the creation timestamp will exceed
>>    32 bits in the year 2100?

Anyone still not using an OS with the Unix credentials-passing necessary
for EXTERNAL in 85 years' time (or for that matter in 2038) gets what
they deserve :-P

>> 11. In the cookie file format, why is any non-zero time in the future
>>     permissible?  Creation times should be strictly in the past, no?

I don't know. The tenuous "I have an NFS-shared home directory and
blindly trust my local LAN" use-case might come with clock-skew issues?
As I said, I'd delete this entire mechanism if I thought I could get
away with it.

Happily, kdbus does not support non-local D-Bus, reducing its scope to
the situations that actually work.

>> 12. The syntax of systemd transport addresses is completely unspecified.
>>     Presumably they begin with "systemd:", but even that is unclear.

The address is exactly "systemd:" and no parameters have been defined or
are needed; all the other information required is inherited from the
parent process (in environment variables and non-CLOEXEC fds), using
systemd's LISTEN_FDS protocol.
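
A rough sketch of the receiving side of that protocol, assuming the
usual convention that inherited fds start at 3 and that LISTEN_PID must
match the current process. The function and its testing hooks (the
environ and pid parameters) are hypothetical; what dbus-daemon then
does with the fds is not shown.

```python
import os

SD_LISTEN_FDS_START = 3  # first inherited fd in systemd's protocol


def inherited_listen_fds(environ=None, pid=None) -> list[int]:
    """Return the fds passed via LISTEN_FDS, or [] if none are for us."""
    environ = os.environ if environ is None else environ
    pid = os.getpid() if pid is None else pid
    try:
        # The variables only apply to the process named in LISTEN_PID.
        if int(environ.get("LISTEN_PID", "")) != pid:
            return []
        count = int(environ.get("LISTEN_FDS", ""))
    except ValueError:
        # Missing or malformed variables: nothing was passed.
        return []
    return list(range(SD_LISTEN_FDS_START, SD_LISTEN_FDS_START + count))
```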

>> 13. The section "message bus names" is confusing on the subject of
>>      unique names.
>>
>>      In particular, when a connection is closed with a next connection
>>      in the queue, is the unique name transferred to the next connection?

Unique names are, well, unique - they are not recycled, either
aggressively like fds or gradually like pids. A particular unique
connection name will be given to at most one connection during the
lifetime of the dbus-daemon. Clients may (and do) assume that if they
have seen ":1.42" disappear, it will never return.

(This means that the dbus-daemon as currently implemented will stop
working after between 2**63 and 2**64 connections, because it currently
hands out unique names of the form :n.m where n, m are 32-bit and n > 0.
The spec does not guarantee this, so we could make unique names longer
if someone outlines a realistic situation in which 64 bits are not enough.)
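
The arithmetic behind that estimate, as a sketch. It assumes n and m
are unsigned 32-bit with n starting at 1, and the allocation order
shown (bump m, then carry into n) is an illustration, not necessarily
what dbus-daemon does internally.

```python
MAX_UINT32 = 2**32 - 1


def unique_name_capacity() -> int:
    """Number of distinct :n.m names with 1 <= n and 0 <= m, both 32-bit."""
    return MAX_UINT32 * 2**32


def next_unique_name(n: int, m: int) -> tuple[int, int, str]:
    """Hypothetical allocator: advance (n, m) and never reuse a name."""
    if m < MAX_UINT32:
        m += 1
    elif n < MAX_UINT32:
        n, m = n + 1, 0
    else:
        raise RuntimeError("unique names exhausted")
    return n, m, f":{n}.{m}"
```

Under those assumptions the capacity is (2**32 - 1) * 2**32, which does
fall between 2**63 and 2**64.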

kdbus uses "flat" 64-bit integers. The same general considerations apply
but the syntax is different.

>>      May a unique name be passed to org.freedesktop.DBus.ReleaseName?

No. Until I added BecomeMonitor, there was no way to lose your unique
name (and it's only acceptable for BecomeMonitor because BecomeMonitor
also removes your ability to send messages; see the commit log or the
bug for design justification).

Note that kdbus does not have ReleaseName or BecomeMonitor.

>> 14. Allegedly, there are strong guarantees on what a client can receive
>>     over a dbus connection.  I can't find that stated anywhere.
...
>>     - Are there *any* possible messages I may receive before calling
>>       org.freedesktop.DBus.Hello()?

No. If you do, that is considered invalid and you should disconnect.

Note that kdbus does not have Hello.

>>     - Are there any possible messages I may receive between the METHOD_CALL
>>       and METHOD_RETURN for Hello?  In particular, is the
>>       org.freedesktop.DBus.NameAcquired SIGNAL for the unique name
>>       delivered before or after the METHOD_RETURN?

Currently an implementation detail. I'd be happy to review a spec patch
that described what dbus-daemon actually does, and/or a regression test
that insists that it continues to do what it currently does (actually,
my tests for BecomeMonitor might do this as a side-effect).

Note that kdbus does not have Hello or NameAcquired; it only has an
equivalent of NameOwnerChanged.

>>     - Likewise, for org.freedesktop.DBus.Monitoring.BecomeMonitor, do the
>>       various NameLost signals arrive before or after the METHOD_RETURN?

Likewise.

Becoming a monitor is a sysadmin/developer tool rather than something
that should be done in production. kdbus does it a bit differently: on
kdbus, a monitoring connection *connects* specially, and never gets any
names (not even a unique name) to begin with.

>>     - May another client send me a spurious or duplicate METHOD_RETURN?
>>     - How about both an ERROR and METHOD_RETURN?

On the system bus, which is a privilege boundary, the default policy is
that if a connection successfully sends a message with serial number 42
that expects a reply (is a method call without the NO_REPLY flag), the
dbus-daemon will remember to allow that message's recipient to send back
exactly one matching reply with "in reply to 42". Unsolicited "replies"
after that, or from the wrong connection, are not allowed, and will be
discarded by the dbus-daemon with a syslog message.

On the session bus, which is not a privilege boundary, the default
policy is that anyone can send you an unsolicited "reply" (either
success or error) with an arbitrary "in reply to" serial number. If you
are expecting a reply to message #42, in practice, the first reply
claiming to be in reply to message #42 will be processed. For subsequent
"replies", any practical client library will search its internal data
structures for a pending-reply data structure with serial number 42,
fail to find it, and ignore the "reply" instead.
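
The client-library behaviour described above can be sketched like this.
The class and method names are invented; real libraries differ in
detail, but the "look it up, fail to find it, ignore it" logic is the
part that matters.

```python
class PendingReplies:
    """Tracks method calls that are still waiting for a reply."""

    def __init__(self):
        self._pending = {}  # serial -> callback

    def expect(self, serial, callback):
        """Remember that a call with this serial wants a reply."""
        self._pending[serial] = callback

    def dispatch_reply(self, reply_serial, body) -> bool:
        """Deliver a reply; return False if nobody was waiting for it."""
        callback = self._pending.pop(reply_serial, None)
        if callback is None:
            # Spurious or duplicate "reply": ignore it.
            return False
        callback(body)
        return True
```

Only the first message claiming to be "in reply to 42" reaches the
callback; later ones find no pending entry and are dropped.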

It is possible to reconfigure the session bus to behave like the system
bus in this respect; its default configuration is maximally permissive.

kdbus does this differently; I don't remember the precise details, but I
suspect it might behave like the system bus in all cases.

>>     - How does a server terminate cleanly, that is with an appropriate
>>       METHOD_RETURN for every METHOD_CALL it has received?

It can't, without special steps from the service (one suggestion
involved owning two distinct well-known names), because closing the
connection can race with receiving new method calls that you would like
to process.

This is a long-standing design bug
<https://bugs.freedesktop.org/show_bug.cgi?id=11454> which, AIUI, is
fixed by kdbus: because the communication between the client and the
kernel is synchronous (unlike the communication between a client and
dbus-daemon), there can be an ioctl for "if there are no messages queued
for me, close the connection" which works atomically.

This is in the general category of "as long as the transport is
basically AF_UNIX, we can't actually fix this".

>>    - What if a server exits without sending a METHOD_RETURN?

If it has sent neither a METHOD_RETURN nor an ERROR, the dbus-daemon
synthesizes an ERROR that has the service as its claimed sender, and
sends it on the service's behalf.
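
A toy model of the bookkeeping this implies on the daemon side: each
call that expects a reply is remembered, exactly one matching reply is
let through, and anything still outstanding when the callee disconnects
turns into a synthesized ERROR. All names here are hypothetical, and
the error name used is an assumption about what dbus-daemon sends.

```python
from dataclasses import dataclass

# Assumed error name; check dbus-daemon before relying on it.
NO_REPLY = "org.freedesktop.DBus.Error.NoReply"


@dataclass(frozen=True)
class SyntheticError:
    sender: str        # claimed sender: the service that went away
    destination: str   # the caller still waiting for a reply
    reply_serial: int
    error_name: str


class ReplyTracker:
    def __init__(self):
        self._expected = set()  # (caller, callee, serial)

    def call_sent(self, caller: str, callee: str, serial: int) -> None:
        self._expected.add((caller, callee, serial))

    def reply_sent(self, callee: str, caller: str, reply_serial: int) -> bool:
        """Return True if this reply is expected; allow it only once."""
        key = (caller, callee, reply_serial)
        if key not in self._expected:
            return False
        self._expected.remove(key)
        return True

    def disconnected(self, callee: str) -> list[SyntheticError]:
        """Synthesize an ERROR for every call the callee never answered."""
        orphaned = sorted(k for k in self._expected if k[1] == callee)
        self._expected.difference_update(orphaned)
        return [SyntheticError(callee, caller, serial, NO_REPLY)
                for caller, _, serial in orphaned]
```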

>>    - What if I use a message serial of 0

If I remember correctly, that is considered invalid and the dbus-daemon
will disconnect you.

>>      or a message serial of
>>      1 for all my METHOD_CALLs?

That is syntactically valid, but silly. You will become unable to
determine how replies and calls match up, and you have only yourself to
blame :-)

(Strictly speaking, I think you only need a unique serial number per
(sender, destination) pair if you are using a unique name as the
destination, and you can re-use a serial number if you aren't expecting
a reply or have already had the expected reply. There are probably other
tricks too, but in practice everyone just uses sequential numbers, which
is far easier.)
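
The "sequential numbers" approach is about as simple as it sounds. A
sketch, assuming serials are unsigned 32-bit and that 0 must be skipped
on wrap-around (the class name is made up):

```python
class SerialAllocator:
    """Hands out sequential, non-zero 32-bit message serials."""

    def __init__(self, last: int = 0):
        self._last = last

    def next(self) -> int:
        self._last = (self._last + 1) % 2**32
        if self._last == 0:
            # 0 is not a valid serial, so skip it when wrapping around.
            self._last = 1
        return self._last
```

After a wrap-around a serial could in principle collide with a call
that is still pending; this sketch does not try to detect that.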

>>    - Is there any limit on the number of outstanding METHOD_CALLs?

Conceptually: no.

Practically: yes, there is an arbitrary limit, because the dbus-daemon
uses memory to track each expected reply. The system bus, which is a
privilege boundary, gives each connection a limited number of "slots"
for parallel method calls, and has some error behaviour if it is
exceeded (I think it sends back an ERROR). The session bus, which is not
a privilege boundary, sets all of its arbitrary limits to some
ridiculously high value by default, so it will run out of memory on your
behalf if asked to do so; it can be configured to be more like the
system bus if required.

-- 
Simon McVittie
Collabora Ltd. <http://www.collabora.com/>