[systemd-devel] Compatibility between D-Bus and kdbus
Thiago Macieira
thiago at kde.org
Mon Nov 24 18:40:10 PST 2014
Sorry for the delay in replying. I didn't have time until after KLF...
On Wednesday 01 October 2014 14:33:01 Simon McVittie wrote:
> System bus access-control policy
> ================================
[snip]
> If I remember correctly, the least bad solution anyone could think of
> was to introduce a new pseudo-bus-type alongside DBUS_BUS_SYSTEM (and
> its equivalent in other libraries like GDBus), perhaps called
> "DBUS_BUS_SYSTEM_UNTRUSTED" or something (better names welcome), with
> its own shared connection: connections to that bus type are not assumed
> to filter messages by their payload, and method-call recipients are
> expected to use Polkit or similar, or do their own simplistic
> access-controls like "must be uid 0" by calling GetConnectionUnixUser or
> GetConnectionCredentials on the sender's unique name.
I'm wondering if the same solution should be applied to the session bus. That
would have the unfortunate effect that applications that aren't ported to know
about kdbus will always fall back to proxy functionality. It would be
unfortunate because the number of applications that need policy decisions on
the session bus must be asymptotically close to zero.
An alternative solution would be for the "trusted" connection to check if
there are any files in /etc/dbus-1/session.d. If there aren't, it can assume
that trusted == untrusted.
PS: we should also find a name that conveys that you *want* the new bus type.
> I'm also very tempted to propose a syntax for an opt-in kdbus-like
> security model (which would take precedence over system.conf/system.d)
> via adding lines to .service files, so that individual services can have
> a sane security model on non-kdbus or non-Linux systems, and systemd's
> systemd-dbus1-generator could use those lines as input. If we get far
> enough with moving system services to that, maybe the transition can be
> easier.
I like this idea, with the proviso that Lennart pointed out: we need to first
include the metadata in the dbus1 stream messages so the application can sort
out the policy decisions like the kdbus implementation would.
> Session bus access-control policy
> =================================
>
> In principle, people could configure the session bus to do the same
> elaborate access-control as the system bus. In practice, this is not a
> particularly useful thing to do, because there are many ways for
> processes running under the same uid to subvert each other, particularly
> if a LSM like SELinux or AppArmor is not used.
>
> kdbus does not appear to make any attempt to protect a uid from itself:
> the uid that created a bus is considered to be privileged on that bus. I
> assume this means that the intention is that app sandboxing will use a
> separate Unix uid, like it does on Android?
>
> Unless there's an outcry from people who like LSMs, I'm inclined to say
> that protecting same-uid session processes from each other is doomed to
> failure, and hence that it's OK for DBUS_BUS_SESSION to connect to kdbus
> without special precautions.
I don't understand this domain enough to be able to offer an opinion. I know
that Tizen will want SMACK security applied even between processes of the same
UID. I just don't know whether that maps to what Lennart said about "labels".
> Resource limits
> ===============
[snip]
> In kdbus, each uid may create up to 16 buses. In dbus-daemon there is no
> limit. I do not anticipate this being a problem: the reference
> implementation of kdbus' user-space, in systemd, seems to be using a
> per-uid user bus instead of a session bus. Also, even if we continued to
> use a session bus per login, 16 sounds like a reasonable number.
Agreed. Most people I know that start more than one bus for the same user are
doing it for nested sessions, usually for testing purposes or to keep separate
implementations of equivalent services (early KDE Frameworks 5 releases
recommended being separate from KDE 4, not so any more).
> In kdbus, each uid may connect to each bus up to 256 times. I think this
> is actually somewhat likely to be a practical problem: I currently have
> 46 connections to my session bus, so I'm only an order of magnitude away
> from the session breaking.
> <https://code.google.com/p/d-bus/issues/detail?id=9>
Agreed again.
$ qdbus | grep -c \:
72
That's over 25% of the limit. Can this be made runtime-configurable?
> In kdbus, each connection may own up to 64 well-known names; the system
> dbus-daemon defaults to 512, and the session to 50 000. 64 is *probably*
> enough, but I could potentially see this becoming an issue for services
> that allocate one well-known name per $app_specific_thing, like
> Telepathy (one name per Connection).
This should also be made configurable, but please raise the default to 256 or 512.
> In kdbus, each connection may have 256 bloom filter entries, which AIUI
> are slightly less expressive than match rules (one match rule maps to
> one or more match rules). The system dbus-daemon defaults to allowing
> 512 match rules, and the session to 50 000. Again, I could potentially
> see this being an issue for existing code: Alban's benchmarking for
> GNOME + Telepathy back in 2011 revealed a peak of 81 match rules,
> although admittedly some of those were due to a dbus-glib bug[2]. QtDBus
> adds more / finer-grained match rules than GDBus or dbus-glib, so it
> might get this worse. I've opened a bug with a possible mitigation:
> <https://code.google.com/p/d-bus/issues/detail?id=10>
>
> [1] http://people.collabora.com/~alban/d/2011/02/gnome-telepathy.csv
> [2] https://bugs.freedesktop.org/show_bug.cgi?id=33646
This might be a problem. Right now, QtDBus assumes that any match rule it adds
will be handled successfully. If the resource limit is low enough that an
application could hit it, we'll need to start handling the failure case.
I'm not thinking of a misbehaving application that causes runaway uses of
entries, but a legitimate one that is simply very large or causes a slow leak
of resources over an extended period of time. That would cause hard-to-debug
issues because they'd often happen in production.
> Bi-endian systems
> =================
>
> The reference implementation of kdbus' userland part, in systemd,
> specifically requires that message payloads must be in native endianness
> for simplicity.
>
> It is not clear to me what this would mean for CPU architectures that
> have runtime-switchable endianness, namely arm, powerpc and possibly
> mips. On these architectures, does Linux effectively impose the
> additional restriction that every process must run in the same
> endianness as the kernel, or what?
>
> Similarly, a big-endian per-process emulator like qemu-m68k on a
> little-endian system like x86-64 would have to either not implement the
> kdbus ioctls (resulting in emulated processes falling back to
> stream-based D-Bus), or byteswap the message. It presumably already has
> to know how to byteswap struct contents.
I agree with other replies here to punt this problem to the people who are
really affected by it. Since the kernel does not care about content, there
should be no trouble with compatibility. Only the userspace bits would need
updating.
> Message ordering
> ================
>
> The D-Bus Specification still doesn't define message ordering, which is
> a significant omission. However, dbus-daemon has always imposed a "total
> order" on messages: if two connections A and B both observe messages M1
> and M2, and A observes M1 before M2, then B cannot observe M2 before M1,
> unless it uses a library API that "jumps the queue" by using
> pseudo-blocking[3].
I'm not sure we need to keep the total ordering like you described. We can
probably relax it a bit to simply ensure causality. In the above case, you
didn't say what sent those two messages.
If M1 and M2 are sent from the same connection, then all receivers should
receive M1 and M2 in order.
If M2 is sent by a connection in response to receiving M1 in the first place,
then all receivers should also observe M1 and M2 in the same order (M1 first).
Other than that, if M1 and M2 are sent by two independent connections with no
causal relationship, then the order in which receivers observe them is
arbitrary.
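The rules above can be sketched as a check on the order in which one receiver saw its messages. This is only an illustration in Python: the Message record, its seq and cause fields, and respects_causality() are names made up for this example, not anything kdbus or dbus-daemon provides.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Message:
    name: str                    # label for the message, e.g. "M1"
    sender: str                  # unique name of the sending connection
    seq: int                     # hypothetical per-sender send counter
    cause: Optional[str] = None  # name of the message this one responds to


def respects_causality(observed):
    """Return True if one receiver's observed order obeys both rules."""
    position = {m.name: i for i, m in enumerate(observed)}
    last_seq = {}
    for m in observed:
        # Rule 1: messages from the same sender arrive in send order.
        if m.seq < last_seq.get(m.sender, -1):
            return False
        last_seq[m.sender] = m.seq
        # Rule 2: a response is never seen before the message that caused it.
        if m.cause in position and position[m.cause] > position[m.name]:
            return False
    return True
```

Messages from unrelated senders with no cause link are accepted in any order, which is the relaxation compared to a total order.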
Now, the question is how to implement that, especially given the causality
requirement. The simplest way is to have a single queue, like the dbus-daemon
currently does. I don't know the internals of kdbus well enough to know whether
it provides this guarantee, but it'll have to.
A couple of other items to discuss:
== DBUS_xxxx_BUS_ADDRESS ==
We probably discussed this. Should we specify that the address in the
environment variable should be of the form:
kdbus:path=/sys/fs/kdbus/xxxx,uuid=<uuid from hello>[;fall back addresses]
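For illustration, an address of that form can be split the same way as any other D-Bus address: entries separated by ";", each one a transport name followed by comma-separated key=value pairs. The sketch below assumes exactly that layout and ignores percent-escaping in values; parse_bus_address() is a made-up helper name, and the uuid and fallback path in the usage lines are placeholders.

```python
def parse_bus_address(address):
    """Split a D-Bus style address into (transport, params) entries."""
    entries = []
    for entry in address.split(";"):
        if not entry:
            continue
        # "kdbus:path=...,uuid=..." -> transport "kdbus", rest "path=...,uuid=..."
        transport, _, rest = entry.partition(":")
        params = dict(kv.split("=", 1) for kv in rest.split(",") if "=" in kv)
        entries.append((transport, params))
    return entries


# Placeholder values; "xxxx" is kept as in the proposed form above.
example = "kdbus:path=/sys/fs/kdbus/xxxx,uuid=0123abcd;unix:path=/tmp/fallback-socket"
for transport, params in parse_bus_address(example):
    print(transport, params)
```

A client would try the entries in order and use the first transport it supports, so an application that does not know kdbus would skip straight to the fallback.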
== org.freedesktop.DBus connection ==
Will systemd-kdbus provide that name on the bus so that applications that make
calls directly will be able to continue working? I imagine the following methods
would be interesting to have:
org.freedesktop.DBus.GetAdtAuditSessionData
org.freedesktop.DBus.GetConnectionCredentials
org.freedesktop.DBus.GetConnectionSELinuxSecurityContext
org.freedesktop.DBus.GetConnectionUnixProcessID
org.freedesktop.DBus.GetConnectionUnixUser
org.freedesktop.DBus.GetId
org.freedesktop.DBus.GetNameOwner
org.freedesktop.DBus.ListActivatableNames
org.freedesktop.DBus.ListNames
org.freedesktop.DBus.ListQueuedOwners
org.freedesktop.DBus.NameHasOwner
org.freedesktop.DBus.ReloadConfig
org.freedesktop.DBus.StartServiceByName
org.freedesktop.DBus.UpdateActivationEnvironment
Most of those would just be conveniences for other, existing kdbus low-level
calls, but ReloadConfig and UpdateActivationEnvironment are not available
anywhere else. It's true that there's nothing stopping more CAP_IPC_OWNER
connections from installing more activators, but the question is whether
systemd will provide those for the activations it holds.
== Kernel API ==
=== Custom endpoints ===
The docs say "To create a custom endpoint, use the KDBUS_CMD_ENDPOINT_MAKE
ioctl". On what file descriptor? The one for the control file? Or can it be sent
on any kdbus endpoint? I'm asking because I'm not sure what the permissions of
the control file will be -- will any process be allowed to open it and create
endpoints?
Will custom endpoints have IDs too? The documentation for the UUID is in the
bus section (5.1), not the endpoint section (5.2). Or is the UUID of the
endpoints shared with the bus endpoint's ID?
Aside from the policy rules, there's no discussion on what a custom endpoint
can do. Given that, I assume that custom endpoints are fully capable and,
therefore, can be used for custom application buses (i.e., multiple
applications, owning names, etc.).
But if that's the case, how would one implement a peer-to-peer connection? Or
should it simply be a convention that P2P connections are really regular
buses, except that no one owns any names, there are no policy restrictions and
that the only two connections are :1.1 and :1.2?
=== Unique Connection IDs ===
Are 64-bit counters without reuse enough?
Qt used to use a 32-bit (signed) counter for its timer IDs and we used to
think it was enough to always increment it and never reuse. We were wrong and
we had to implement an allocator like the PID allocator in the kernel. Now,
we're talking 33 bits more than the Qt case here, but is it possible that a
long-running bus would run out of bits? Think of a bus running for a couple of
years on a server, like Apache httpd using a bus to communicate with its
workers.
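A back-of-the-envelope calculation helps put numbers on the question. The allocation rates below are pure assumptions picked for illustration (one million new connection IDs per second for the 64-bit case, one thousand timer IDs per second for the 31 usable bits of a signed 32-bit counter).

```python
SECONDS_PER_DAY = 24 * 3600
SECONDS_PER_YEAR = 365 * SECONDS_PER_DAY


def seconds_until_exhausted(bits, ids_per_second):
    """Time to use up a never-reused counter of the given width."""
    return 2 ** bits / ids_per_second


# Assumed rate: 1,000,000 new connections per second on one bus.
years_64 = seconds_until_exhausted(64, 1_000_000) / SECONDS_PER_YEAR

# Assumed rate: 1,000 new IDs per second with 31 usable bits.
days_31 = seconds_until_exhausted(31, 1_000) / SECONDS_PER_DAY

print(f"64-bit counter: about {years_64:,.0f} years")
print(f"31-bit counter: about {days_31:.1f} days")
```

Under those assumed rates the 64-bit counter lasts on the order of 585,000 years, while the 31-bit one wraps in under a month, so the extra 33 bits change the picture considerably.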
=== Multicasting ===
Another thought that comes to mind: should we reserve the entire highest bit
in connection IDs for broadcasts? It would allow for the existence of
multicast groups in the future.
This could be a neat solution for matching interface names.
=== The "1." in unique connection names ===
It's not really necessary. Just because dbus-daemon does it does not mean that
kdbus needs to. It's not necessary to satisfy the rule that all connection
names contain at least one dot since unique connection names do not pass the
validation anyway (the ":" character is not allowed).
Of course this is a simple convention, but why perpetuate the 1?
=== Activation ===
This is not really well-documented in the kdbus.txt file. Can someone expand on
it, please?
From what I can read, service activation is performed by one or more
connections saying they're an "activator" at their Hello time. Therefore, they
will receive all messages that are directed at a given well-known name.
Supposedly, they will peek (without dropping) messages to find out what the
destination is and then launch a process that implements the service (the
implementor). When the implementor exits, all queued messages to that service
are transferred back to the activator.
How does the hand-off to the implementor happen? Is it automatic that as soon
as a connection requests a name in the bus that was previously held by the
activator, it will start receiving all queued messages?
If that is so, how does the activator read past the activation message to get
to the next one, without dropping it?
=== Matching string parts of the message ===
I'm not really clear how a connection declares its interest in matching the
interface name, object path, member name or even string arguments of the
message. The kdbus_cmd_match structure only seems to have a way of matching
the sender's name. The bloom filter is not really explained.
This is really crucial, since the most common scenario of signal matching is
to match against the interface and member names. Dropping this feature is a
really bad idea, since it would cause applications to listen to every
broadcast a given connection sends, causing unnecessary wakeups.
A secondary use-case is to match a given string in an argument, like the
current NameOwnerChanged signal really requires. I'm guessing this won't be
possible.
=== KDBUS_CMD_BYEBYE ===
The docs say that it only succeeds if there are no more messages, at which
point no further messages will be accepted. There doesn't seem to be a way of
doing a shutdown()-equivalent: stop reception of new messages but still
process the queued ones.
=== kdbus_msg timeout ===
The docs say that the timeout is expressed as a timestamp of the deadline, as
opposed to an actual timeout. I would much prefer it be provided in number of
nanoseconds to wait, since that's the normal use-case (25 seconds). To do it
the way that it's proposed would require a call to clock_gettime() and some
math before every KDBUS_CMD_MSG_SEND in order to calculate the deadline.
Can this be changed?
If this isn't to be changed, please change it at least to be a struct
timespec, so it's easier to calculate it from the output of clock_gettime().
PS: the documentation says that it's on CLOCK_MONOTONIC, but glibc does not
define _POSIX_MONOTONIC_CLOCK to be larger than zero. That implies that there
are Linux systems where no monotonic clock is present. Either kdbus or glibc
needs to be fixed.
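To make the extra step concrete, this is roughly the math every caller would have to do before each send when the field is an absolute deadline. It is a Python sketch rather than the C a binding would use; deadline_from_timeout() and timespec_to_ns() are made-up helper names, and time.clock_gettime_ns() stands in for clock_gettime(CLOCK_MONOTONIC, ...).

```python
import time

NSEC_PER_SEC = 1_000_000_000
DEFAULT_TIMEOUT_NS = 25 * NSEC_PER_SEC  # the 25-second case mentioned above


def timespec_to_ns(tv_sec, tv_nsec):
    """Flatten a struct timespec style (seconds, nanoseconds) pair."""
    return tv_sec * NSEC_PER_SEC + tv_nsec


def deadline_from_timeout(timeout_ns, now_ns=None):
    """Turn a relative timeout into an absolute CLOCK_MONOTONIC deadline."""
    if now_ns is None:
        # One clock read per message sent.
        now_ns = time.clock_gettime_ns(time.CLOCK_MONOTONIC)
    return now_ns + timeout_ns


deadline = deadline_from_timeout(DEFAULT_TIMEOUT_NS)
```

With a relative timeout in the API, the caller would pass DEFAULT_TIMEOUT_NS directly and the clock read would happen once, in the kernel.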
=== KDBUS_CMD_MSG_CANCEL ===
Clarification: the docs say that this call cancels a blocking wait. Does it
mean that this command is to be used from a different thread to cause the
blocking call to return? If it has other purposes, this needs documenting.
=== KDBUS_CMD_NAME_ACQUIRE and RELEASE ===
These calls could allow multiple names to be requested or released in one go.
Low-priority future improvement, I guess.
I also don't see anything specifying how a connection is notified that it got a
name it had queued for. Should it use KDBUS_ITEM_NAME_{ADD,CHANGE} and look
for its own connection ID?
If one requests a name and allows queueing, is there any way to tell from the
return value whether the acquisition happened immediately? Not that important,
I guess, since queueing implies async anyway.
=== kdbus_cmd_match user data ===
One of the problems we have in userspace bindings is to figure out what to do
with a message that was received, especially broadcasts. It would really,
really help us if we could specify some user data in the match rule and have
that be provided by the kernel when the message is received.
A single u64 per match rule would be enough, since we can store a pointer
there (i.e., the cookie that is already there). The received message should
contain a list of user data (cookies) that it matched.
This could be a future improvement, since right now we make do without it.
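A sketch of how a binding could use such cookies, assuming the kernel handed back the list of matched cookies with each message (which it does not do today). MatchDispatcher and its methods are hypothetical names for this example only.

```python
class MatchDispatcher:
    """Maps per-match-rule cookies to the handlers registered for them."""

    def __init__(self):
        self._handlers = {}
        self._next_cookie = 1

    def add_match(self, handler):
        # The returned cookie is what would be stored in the match rule.
        cookie = self._next_cookie
        self._next_cookie += 1
        self._handlers[cookie] = handler
        return cookie

    def remove_match(self, cookie):
        self._handlers.pop(cookie, None)

    def dispatch(self, message, matched_cookies):
        # No re-matching in userspace: go straight from cookie to handler.
        return [self._handlers[c](message)
                for c in matched_cookies if c in self._handlers]
```

The point of the design is that delivery becomes a dictionary lookup per cookie instead of re-evaluating every match rule against the message in the binding.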
=== kdbus_policy_access ===
After reading through the description, you can be sure that there will be a
request to add the ability to match a given SMACK label, not just UID or GID.
Hopefully a patch will be sent alongside it, but please make it possible to
pass labels, not just IDs.
=== Wildcards ===
Are you sure that * not matching a dot is a good idea? What is the rationale
behind it?
Can I suggest using IMAP wildcard matching instead? * matches anything
including dots, % matches everything except dots.
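A minimal sketch of the IMAP-style semantics suggested above, written as a translation to a regular expression. The function names are made up for the example, and this says nothing about how kdbus would implement matching in the kernel.

```python
import re


def imap_wildcard_to_regex(pattern):
    """Compile a pattern where * crosses dots and % stops at them."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")      # any run of characters, dots included
        elif ch == "%":
            parts.append("[^.]*")   # any run of characters without a dot
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts))


def matches(pattern, name):
    return imap_wildcard_to_regex(pattern).fullmatch(name) is not None


print(matches("org.freedesktop.*", "org.freedesktop.DBus.Local"))
print(matches("org.freedesktop.%", "org.freedesktop.DBus.Local"))
```

With both characters available, a policy author can choose between matching a whole subtree of names and matching a single name component.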
=== KDBUS_ATTACH_NAMES ===
Documentation for metadata says that userspace must cope with some metadata
not being delivered. Can we at least require that KDBUS_ATTACH_NAMES be
delivered if requested? If the cookie in the match rule isn't provided in the
message reception, having the source's names would help solve the problem of
the signal delivery.
The timestamp should also be mandatory.
=== KDBUS_PAYLOAD_DBUS ===
The docs say it reads "DBusDBus", but on little-endian systems it actually
reads "suBDsuBD" :-)
Not relevant, though.
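For the curious, the byte order effect is easy to reproduce. The sketch assumes the constant is a 64-bit integer whose most significant byte is 'D', that is, the value spelled "DBusDBus" when written big-endian.

```python
import struct

# Assumed value: the eight ASCII bytes of "DBusDBus" read as a big-endian u64.
PAYLOAD_DBUS = int.from_bytes(b"DBusDBus", "big")

as_big = struct.pack(">Q", PAYLOAD_DBUS)     # in-memory bytes on big-endian
as_little = struct.pack("<Q", PAYLOAD_DBUS)  # in-memory bytes on little-endian

print(hex(PAYLOAD_DBUS), as_big, as_little)
```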
--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Software Architect - Intel Open Source Technology Center
PGP/GPG: 0x6EF45358; fingerprint:
E067 918B B660 DBD1 105C 966C 33F5 F005 6EF4 5358