Compatibility between D-Bus and kdbus

Tue Nov 25 08:11:36 PST 2014

On Mon, 24.11.14 18:40, Thiago Macieira (thiago at kde.org) wrote:

> I'm wondering if the same solution should be applied to the session bus. That 
> would have the unfortunate effect that applications that aren't ported to know 
> about kdbus will always fall back to proxy functionality. It would be 
> unfortunate because the number of applications that need policy decisions on 
> the session bus must be asymptotically close to zero.

I figure this is up to the library implementation. I'd probably
simplify this and avoid duplicating the session bus name too.

> > Session bus access-control policy
> > =================================
> > 
> > In principle, people could configure the session bus to do the same
> > elaborate access-control as the system bus. In practice, this is not a
> > particularly useful thing to do, because there are many ways for
> > processes running under the same uid to subvert each other, particularly
> > if a LSM like SELinux or AppArmor is not used.
> > 
> > kdbus does not appear to make any attempt to protect a uid from itself:
> > the uid that created a bus is considered to be privileged on that bus. I
> > assume this means that the intention is that app sandboxing will use a
> > separate Unix uid, like it does on Android?
> > 
> > Unless there's an outcry from people who like LSMs, I'm inclined to say
> > that protecting same-uid session processes from each other is doomed to
> > failure, and hence that it's OK for DBUS_BUS_SESSION to connect to kdbus
> > without special precautions.
> 
> I don't understand this domain enough to be able to offer an opinion. I know 
> that Tizen will want SMACK security applied even between processes of the same 
> UID. I just don't know whether that maps to what Lennart said about
> "labels".

SMACK and SELinux will have the chance to make stricter decisions
than the baseline policy. The SMACK folks have posted patches that
add the right hooks to kdbus to make this possible.

> That's over 25% of the limit. Can this be made runtime-configurable?

Yes, that's the plan. Though currently they are compiled in.

I have now increased this to 1K.

> > In kdbus, each connection may own up to 64 well-known names; the system
> > dbus-daemon defaults to 512, and the session to 50 000. 64 is *probably*
> > enough, but I could potentially see this becoming an issue for services
> > that allocate one well-known name per $app_specific_thing, like
> > Telepathy (one name per Connection).
> 
> Also be made configurable, but please raise the default to 256 or
> 512.

Same, also increased now. To 256.

https://code.google.com/p/d-bus/source/detail?r=20ce3cfa9f65fc6a0be052ec64d9d796626f6630

> A couple of other items to discuss:
> 
> == DBUS_xxxx_BUS_ADDRESS ==
> 
> We probably discussed this. Should we specify that the address on the 
> environment variable should be of the form:
> 
>  kdbus:path=/sys/fs/kdbus/xxxx,uuid=<uuid from hello>[;fall back addresses]

Well, we don't need any env var really, as we enforce that the UID of
the user is included in the name of their busses, and the busses are
cleaned up when the registrar dies. There is no risk of stale busses
being left around, or of ending up on a bus created by another user,
hence all code can just assume the path to use is
kernel:path=/sys/fs/kdbus/0-system and
kernel:path=/sys/fs/kdbus/$UID-user and all is good, without ever
having to deal with env vars at all.

(of course, if env-vars are set they should be used, but the normal
codepaths in the distros should work without them.)
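
As a rough sketch of that lookup order (env var first, implied path
otherwise), a library could do something like the following. The
helper name bus_address() and its signature are made up for
illustration; only the two kernel:path= strings and the "env vars
win if set" rule come from the text above, and the variable names a
caller would pass are the usual dbus1 ones (DBUS_SYSTEM_BUS_ADDRESS,
DBUS_SESSION_BUS_ADDRESS).

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical helper: write the bus address to use into buf.
 * If the environment variable env_name is set and non-empty it wins,
 * otherwise the path is implied from the bus type and the UID.
 * Returns 0 on success, -1 if buf is too small. */
static int bus_address(const char *env_name, int system_bus, uid_t uid,
                       char *buf, size_t len)
{
    const char *env = getenv(env_name);
    int n;

    if (env && *env)
        n = snprintf(buf, len, "%s", env);
    else if (system_bus)
        n = snprintf(buf, len, "kernel:path=/sys/fs/kdbus/0-system");
    else
        n = snprintf(buf, len, "kernel:path=/sys/fs/kdbus/%lu-user",
                     (unsigned long) uid);

    return (n < 0 || (size_t) n >= len) ? -1 : 0;
}
```

A caller would pass getuid() for the user bus; the UID is ignored for
the system bus.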

> == org.freedesktop.DBus connection ==
> 
> Will systemd-kdbus provide that name on the bus so applications that make 
> calls directly be able to continue working? I imagine the following methods 
> would be interesting to have:

No, this is not supported in the current versions of kdbus
anymore. Emulation of these calls must happen client side if it shall
be supported.

> org.freedesktop.DBus.GetAdtAuditSessionData
> org.freedesktop.DBus.GetConnectionCredentials
> org.freedesktop.DBus.GetConnectionSELinuxSecurityContext
> org.freedesktop.DBus.GetConnectionUnixProcessID
> org.freedesktop.DBus.GetConnectionUnixUser
> org.freedesktop.DBus.GetId
> org.freedesktop.DBus.GetNameOwner
> org.freedesktop.DBus.ListActivatableNames
> org.freedesktop.DBus.ListNames
> org.freedesktop.DBus.ListQueuedOwners
> org.freedesktop.DBus.NameHasOwner
> org.freedesktop.DBus.ReloadConfig
> org.freedesktop.DBus.StartServiceByName
> org.freedesktop.DBus.UpdateActivationEnvironment
> 
> Most of those would be just convenience for other, existing kdbus low-level 
> calls, but ReloadConfig and UpdateActivationEnvironment are not available 
> anywhere else. It's true that there's nothing stopping more CAP_IPC_OWNER 
> connections from installing more activators, but the question is whether 
> systemd will provide those for the activations it holds.

The client side emulation can choose to either forward ReloadConfig
and UpdateActivationEnvironment to the respective systemd calls, or
just return some "not supported" error.

> == Kernel API ==
> === Custom endpoints ===
> 
> The docs say "To create a custom endpoint, use the KDBUS_CMD_ENDPOINT_MAKE 
> ioctl". On what file descriptor? The one for the control file? Or can it be sent 
> on any kdbus endpoint? I'm asking because I'm not sure what the permissions of 
> the control file will be -- will any process be allowed to open it and create 
> endpoints?

If you want to create a new endpoint for an existing bus, then invoke
that ioctl on the bus fd. The control file after all is unrelated to
any bus, and thus wouldn't know which bus you mean if we'd allow
invoking that ioctl on it.

> Will custom endpoints have IDs too? The documentation for the UUID is in the 
> bus section (5.1), not the endpoint section (5.2). Or is the UUID of the 
> endpoints shared with the bus endpoint's ID?

Endpoints are really just a way to apply additional policy, they
aren't distinguishable from the other bus endpoints in any other
way. The bus UUID hence is the same for all its endpoints.

> Aside from the policy rules, there's no discussion on what a custom endpoint 
> can do. Given that, I assume that custom endpoints are fully capable and, 
> therefore, can be used for custom application buses (i.e., multiple 
> applications, owning names, etc.). 

Endpoints are really just about policy, and they are otherwise
equivalent to the default endpoint.

To stress this: the "default" endpoint is actually completely the
same as any "custom" endpoint, except that it exists by default and
is initialized to a pretty liberal policy.

> But if that's the case, how would one implement a peer-to-peer connection? Or 
> should it simply be a convention that P2P connections are really regular 
> buses, except that no one owns any names, there are no policy restrictions and 
> that the only two connections are :1.1 and :1.2?

kdbus is not for peer-to-peer connections. If you want that, use
AF_UNIX.

There's really no need for peer-to-peer connections, at least
performance-wise.

> === Unique Connection IDs ===
> 
> Are 64-bit counters without reuse enough?

Yes, they are.

If you allocate one new unique 64bit ID every 1us, you still have
roughly 584,000 years before an overrun.

> Qt used to use a 32-bit (signed) counter for its timer IDs and we used to 
> think it was enough to always increment it and never reuse. We were wrong and 
> we had to implement an allocator like the PID allocator in the kernel. Now, 
> we're talking 33 bits more than the Qt case here, but is it possible that a 
> long-running bus would run out of bits? Think of a bus running for a couple of 
> years on a server, like Apache httpd using a bus to communicate with its 
> workers.

Yeah, if you allocate one new unique 32bit ID every 1us, you are
depleted after less than 2h...
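
For anyone who wants to check the arithmetic, here is a small sketch
that computes how long a counter of a given width lasts at one ID
per microsecond. The function and macro names are made up for this
mail. Plugging in the numbers, 32 bits last 4294 seconds (about 72
minutes), while 64 bits last roughly 584,000 years.

```c
#include <stdint.h>

#define USEC_PER_SEC     UINT64_C(1000000)
#define SECONDS_PER_YEAR UINT64_C(31557600)   /* 365.25 days */

/* Seconds until a counter of the given width has handed out all of
 * its values, at one new ID per microsecond. */
static uint64_t seconds_until_depleted(unsigned bits)
{
    /* 2^64 does not fit into a uint64_t, so use 2^64 - 1 in that
     * case; the difference is irrelevant at this scale. */
    uint64_t ids = bits >= 64 ? UINT64_MAX : UINT64_C(1) << bits;

    return ids / USEC_PER_SEC;
}
```

Dividing the 64bit result by SECONDS_PER_YEAR gives the number of
years.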

> === The "1." in unique connection names ===
> 
> It's not really necessary. Just because dbus-daemon does it does not mean that 
> kdbus needs to. It's not necessary to satisfy the rule that all connection 
> names contain at least one dot since unique connection names do not pass the 
> validation anyway (the ":" character is not allowed).
> 
> Of course this is a simple convention, but why perpetuate the 1?

Well, we should generate names following the same naming scheme as
dbus1 does. Otherwise implementing the compat proxy is nasty.

I really don't see much of a problem with the weird ":1." prefix. It's
just a prefix, that is all...

> === Activation ===
> 
> This is not really well-documented in the kdbus.txt file. Can someone expand on 
> it, please?
> 
> From what I can read, service activation is performed by one or more
> connections saying they're an "activator" at their Hello
> time. Therefore, they will receive all messages that are directed at
> a given well-known name.  Supposedly, they will peek (without
> dropping) messages to find out what the destination is and then
> launch a process that implements the service (the implementor). When
> the implementor exits, all queued messages to that service are
> transferred back to the activator.

Correct.

> How does the hand-off to the implementor happen? Is it automatic that as soon 
> as a connection requests a name in the bus that was previously held by the 
> activator, it will start receiving all queued messages?

Correct.

> If that is so, how does the activator read past the activation message to get 
> to the next one, without dropping it?

Why would it want that? The idea is that the activator actually
*never* really processes any message. It just waits for POLLIN, then
stops listening for POLLIN and activates the daemon, which then
processes the messages.

Now, more recent kdbus versions have the functionality to peek the
first queued message, which is useful so that the activator logic can
log which client caused a service activation to be triggered. In this
case the message really needs to stay queued though; all the activator
should query is the metadata of the message, in order to show a pretty
log message.
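
The "wait for POLLIN, never read" part can be sketched with plain
poll(). This is only an illustration on an arbitrary file
descriptor: wait_for_activation() is a made-up helper, and the
kdbus-specific parts (registering as activator, peeking at the
metadata, actually starting the daemon) are left out.

```c
#include <errno.h>
#include <poll.h>

/* Hypothetical helper: block until fd becomes readable, without
 * reading anything from it, so that whatever is queued stays queued
 * for the process that gets activated.
 * Returns 1 if POLLIN was signalled, 0 on timeout, -1 on error. */
static int wait_for_activation(int fd, int timeout_ms)
{
    struct pollfd p = { .fd = fd, .events = POLLIN };
    int r;

    do {
        r = poll(&p, 1, timeout_ms);
    } while (r < 0 && errno == EINTR);

    if (r <= 0)
        return r;

    return (p.revents & POLLIN) ? 1 : -1;
}
```

After this returns 1 the activator would drop the fd from its poll
set and start the service. Since nothing was read, a second call on
the same fd returns 1 again right away.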

> === Matching string parts of the message ===
> 
> I'm not really clear how a connection declares its interest in matching the 
> interface name, object path, member name or even string arguments of the 
> message. The kdbus_cmd_match structure only seems to have a way of matching 
> the sender's name. The bloom filter is not really explained.

Well, the reason that this is not documented in kdbus.txt is really
that the kernel doesn't care about the bloom filter much. All it does
is ultimately make an AND check of the bitfields, and that's about
it. How the filter is calculated and what is included in it is
completely up to userspace. Or in other words: the bloom filter
calculation should be documented in the dbus spec, not in the kdbus
docs in the kernel.

Unfortunately that part of the dbus spec is not written yet.

This excerpt shows you what we propose to include in the bloom filter
currently:

http://cgit.freedesktop.org/systemd/systemd/tree/src/libsystemd/sd-bus/bus-kernel.c#n137

It includes the message type, interface and member names, among others.

> This is really crucial, since the most common scenario of signal matching is 
> to match against the interface and member names. Dropping this feature is a 
> really bad idea, since it would cause applications to listen to every 
> broadcast a given connection sends, causing unnecessary wakeups.
> 
> A secondary use-case is to match a given string in an argument, like the 
> current NameOwnerChanged signal really requires. I'm guessing this won't be 
> possible.

String and object path parameters are included in the bloom filter,
hence this *is* supported.

That said, NameOwnerChanged is actually a special kernel message, and
it doesn't use the bloom filter stuff; instead you can have explicit
matches for it. The bloom filter stuff only applies to normal
userspace payload messages.
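
To make the "AND check of the bitfields" a bit more concrete, here
is a minimal sketch. It assumes that the check means "every bit set
in the subscriber's mask must also be set in the message's filter".
The filter size, the number of bits per string and the FNV-1a hash
are stand-ins picked for this example, not what sd-bus or the spec
will use, and the strings fed in by a caller are up to userspace.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define BLOOM_WORDS  8   /* 512 bit filter; the size is an assumption */
#define BLOOM_HASHES 3   /* bits set per string; also an assumption */

/* Seeded FNV-1a, used here only as a stand-in hash function. */
static uint64_t hash_str(const char *s, uint64_t seed)
{
    uint64_t h = UINT64_C(14695981039346656037) ^ seed;

    for (; *s; s++) {
        h ^= (unsigned char) *s;
        h *= UINT64_C(1099511628211);
    }
    return h;
}

/* Userspace side: set BLOOM_HASHES bits for one string. */
static void bloom_add(uint64_t bloom[BLOOM_WORDS], const char *s)
{
    for (uint64_t i = 0; i < BLOOM_HASHES; i++) {
        uint64_t bit = hash_str(s, i) % (BLOOM_WORDS * 64);

        bloom[bit / 64] |= UINT64_C(1) << (bit % 64);
    }
}

/* The AND check: true if all bits of mask are set in filter. */
static bool bloom_match(const uint64_t filter[BLOOM_WORDS],
                        const uint64_t mask[BLOOM_WORDS])
{
    for (size_t i = 0; i < BLOOM_WORDS; i++)
        if ((filter[i] & mask[i]) != mask[i])
            return false;
    return true;
}
```

A sender would bloom_add() each string it wants to be matchable into
the message's filter, and a subscriber would build its mask the same
way from the strings it is interested in. A match can be a false
positive, never a false negative.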

> === KDBUS_CMD_BYEBYE ===
> 
> The docs say that it only succeeds if there are no more messages, at which 
> point no further messages will be accepted. There doesn't seem to be a way of 
> doing a shutdown()-equivalent: stop reception of new messages but still 
> process the queued ones.

What's the precise usecase for this?

> 
> === kdbus_msg timeout ===
> 
> The docs say that the timeout is expressed as a timestamp of the deadline, as 
> opposed to an actual timeout. I would much prefer it be provided in number of 
> nanoseconds to wait, since that's the normal use-case (25 seconds). To do it 
> the way that it's proposed would require a call to clock_gettime() and some 
> math before every KDBUS_CMD_MSG_SEND in order to calculate the deadline.
> 
> Can this be changed?

No. This was actually changed only recently. The reason here is about
restartable syscalls: when the blocking method call ioctl() is used,
and a signal is received, clients must be able to restart the ioctl()
where it left off, and for that relative timestamps are really awful,
as the client side could not just invoke the syscall with the same
args again.

It's actually a strict rule now for userspace interfaces of the
kernel: all timeouts should be absolute, not relative, and we need to
follow here, too.

> If this isn't to be changed, please change it at least to be a struct 
> timespec, so it's easier to calculate it from the output of
> clock_gettime().

Conversion is trivial actually...

> PS: the documentation says that it's on CLOCK_MONOTONIC, but glibc does not 
> define _POSIX_MONOTONIC_CLOCK to be larger than zero. That implies that there 
> are Linux systems where no monotonic clock is present. Either kdbus or glibc 
> needs to be fixed.

No, the monotonic clock is *not* optional on Linux.

> === KDBUS_CMD_MSG_CANCEL ===
> 
> Clarification: the docs say that this call cancels a blocking wait. Does it 
> mean that this command is to be used from a different thread to cause the 
> blocking call to return? If it has other purposes, this needs
> documenting.

Yes, that's the usecase.

> === KDBUS_CMD_NAME_ACQUIRE and RELEASE ===
> 
> These calls could allow multiple names to be requested or released in one go. 
> Low-priority future improvement, I guess.
> 
> I also don't see anything specifying how a connection is notified that it got a 
> name it had queued for. Should it use KDBUS_ITEM_NAME_{ADD,CHANGE} and look 
> for its own connection ID?

Yes. It's basically like watching NameOwnerChanged on old dbus1. 

> If one requests a name and allows queueing, is there any way to tell from the 
> return value whether the acquisition happened immediately? Not that important, 
> I guess, since queueing implies async anyway.

Yes, the flags field of the struct will have the KDBUS_NAME_IN_QUEUE
bit set if the name could not be acquired right away.

> === kdbus_cmd_match user data ===
> 
> One of the problems we have in userspace bindings is to figure out what to do 
> with a message that was received, especially broadcasts. It would really, 
> really help us if we could specify some user data in the match rule and have 
> that be provided by the kernel when the message is received.
> 
> A single u64 per match rule would be enough, since we can store a pointer 
> there (i.e., the cookie that is already there). The received message should 
> contain a list of user data (cookies) that it matched.
> 
> This could be a future improvement, since right now we make do
> without it.

Well, this has been requested before, but this has problems. For
starters this would mean that each receiver would have to receive a
different message, which we currently try to avoid: everybody gets the
same. Also, it means that the kernel would always have to iterate
through all rules that are installed, instead of being able to return
quickly as soon as the first matching rule is found (why? because
there might be two rules that match the same message).

Also note that bloom filters are probabilistic anyway, hence you have
to match in userspace anyway, in order not to get false positives. But
if you do that you can just build the matching data structure so that
you also use it to find the appropriate binding for each
message. sd-bus does that actually pretty neatly by building a
decision tree that solves both problems with the minimal number of
checks.
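
To illustrate the idea of doing the exact match and the lookup of
the binding in one go, here is a deliberately naive sketch. It is a
linear scan, not the decision tree sd-bus builds, and struct
match_rule and match_message() are made up for this example. Each
rule carries the user data a binding wants back, and all rules that
match are reported, since more than one may match the same message.

```c
#include <stddef.h>
#include <string.h>

struct match_rule {
    const char *interface;   /* NULL matches any interface */
    const char *member;      /* NULL matches any member */
    void *userdata;          /* whatever the binding wants back */
};

static int field_matches(const char *want, const char *got)
{
    return want == NULL || (got != NULL && strcmp(want, got) == 0);
}

/* Exact userspace match of one message against all rules. Stores the
 * userdata of every matching rule in out (at most max entries) and
 * returns how many were stored. */
static size_t match_message(const struct match_rule *rules, size_t n_rules,
                            const char *interface, const char *member,
                            void **out, size_t max)
{
    size_t n = 0;

    for (size_t i = 0; i < n_rules && n < max; i++)
        if (field_matches(rules[i].interface, interface) &&
            field_matches(rules[i].member, member))
            out[n++] = rules[i].userdata;

    return n;
}
```

A message that only got through because of a bloom false positive
simply comes back with zero hits and can be dropped.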

> === kdbus_policy_access ===
> 
> After reading through the description, you can be sure that there will be a 
> request to add the ability to match a given SMACK label, not just UID or GID. 
> Hopefully a patch will be sent alongside it, but please make it possible to 
> pass labels not just IDs.

Yes, the security label can be passed as metadata.

> === Wildcards ===
> 
> Are you sure that * not matching a dot is a good idea? What is the rationale 
> behind it?

Hmm, what precisely is this about? Wildcards where?

> Can I suggest using IMAP wildcard matching instead? * matches anything 
> including dots, % matches everything except dots.
> 
> === KDBUS_ATTACH_NAMES ===
> 
> Documentation for metadata says that userspace must cope with some metadata 
> not being delivered. Can we at least require that KDBUS_ATTACH_NAMES be 
> delivered if requested? If the cookie in the match rule isn't provided in the 
> message reception, having the source's names would help solve the problem of 
> the signal delivery.
> 
> The timestamp should also be mandatory.

Yes, they are mandatory. Process credentials might be suppressed
however, for example if they cannot be translated due to namespaces.

> === KDBUS_PAYLOAD_DBUS ===
> 
> The docs say it reads "DBusDBus", but on little-endian systems it actually 
> reads "suBDsuBD" :-)

Yes, but it still looks good in the C code...

Lennart

-- 
Lennart Poettering, Red Hat