[systemd-devel] Compatibility between D-Bus and kdbus

Tue Nov 25 16:25:18 PST 2014

On Tue, 25.11.14 12:01, Thiago Macieira (thiago at kde.org) wrote:

> > Well, we don't need any env var really, as we enforce that the UID of
> > the user is included in the name of their busses, and the busses are
> > cleaned up when the registrar dies. We don't have the risk of leaving
> > old busses around, or even by other users, hence all code can just
> > imply the path to use is kernel:path=/sys/fs/kdbus/0-system and
> > kernel:path=/sys/fs/kdbus/$UID-user and all is good, without ever
> > having to deal with env vars at all.
> > 
> > (of course, if env-vars are set they should be used, but the normal
> > codepaths in the distros should work without them.)
> 
> Thinking of non-system buses here.
> 
> If the variable is empty, I agree that it should have an equivalent of an 
> "autostart" mechanism, but I disagree on the solution and I also disagree that 
> distros should leave it empty.

Oh, no. No autostart please. No such concept exists in kdbus, and
systemd/sd-bus will not support that either. In fact I refuse to support that
even on dbus1 in sd-bus. Autostart is a kludge for systems where dbus
is just an add-on, but that's completely out-of-focus for kdbus,
systemd and sd-bus.

Note that even on systemd we will set $DBUS_SESSION_BUS_ADDRESS,
simply because classic libdbus and gdbus won't work without
it. However, we will actually set it to a fixed value.
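
To make the lookup order discussed above concrete, here is a minimal
Python sketch. It is an illustration only: the function name
default_bus_address and its parameters are made up for this example.
The two kernel: addresses are the ones quoted earlier in this mail, and
the env var names are the ones classic libdbus reads.

```python
import os

SYSTEM_BUS = "kernel:path=/sys/fs/kdbus/0-system"


def default_bus_address(user_bus=True, environ=None, uid=None):
    """Return the bus address: env var if set, else the fixed kdbus path.

    Hypothetical helper; `environ` and `uid` are injectable so the logic
    can be exercised without touching the real process environment.
    """
    environ = os.environ if environ is None else environ
    var = "DBUS_SESSION_BUS_ADDRESS" if user_bus else "DBUS_SYSTEM_BUS_ADDRESS"

    # Compat: if the env var is set, it wins.
    address = environ.get(var)
    if address:
        return address

    # Normal code path: no env var needed, the path is implied.
    if not user_bus:
        return SYSTEM_BUS
    uid = os.getuid() if uid is None else uid
    return f"kernel:path=/sys/fs/kdbus/{uid}-user"
```

With an empty environment, default_bus_address(uid=1000) yields the
$UID-user path for UID 1000, and default_bus_address(user_bus=False)
yields the 0-system path.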

> For one thing, the fallback address is expected to be there if there's a proxy 
> bus running. The current autostart mechanism relies on X being present, so the 
> fallback won't be found unless X is running and something registered the 
> proxy's socket address there.
> 
> For another, it's good practice to have it set and not depend on autostart.
> 
> For a third, hardcoding kernel paths in userspace sounds like a poor idea. The 
> kdbus mountpoint may be elsewhere and whatever is creating buses may not do it 
> per user, but per session or other creation rule it may have.

No, we don't support weird setups where kdbusfs is mounted
elsewhere. This is a new API we introduce here, and we can very much
make decisions where stuff is to be mounted.

Env vars are a hack, due to the awful inheritance logic, and we should
really avoid using them, except where necessary for compat, and that's
precisely the level to which we'll support them in systemd.

> > > would be interesting to have:
> > No, this is not supported in the current versions of kdbus
> > anymore. Emulation of these calls must happen client side if it shall
> > be supported.
> 
> That wouldn't be kdbus, but systemd doing it. Since systemd is the one that 
> opens the bus, it can register the first connection and claim the 
> org.freedesktop.DBus service name, providing compatibility. So this isn't a 
> feature request for kdbus but a feature request for systemd.

We initially tried to support that, but it's awfully racy, since the
driver calls and calls to other services wouldn't be executed in
strict order anymore... We removed this again after figuring that out,
and decided that emulation can only happen client side, synchronous to
the message stream, if we want to guarantee correct ordering.

> By the way, is there a way to ensure that a given connection is the first 
> connection? As soon as the bus creator is able to connect to the /sys/fs/kdbus 
> path, so is another process and therefore this other process could maliciously 
> acquire names it shouldn't.

When creating the bus the creator can pass policy to the kernel so
that there is no time window where the bus is accessible and open to
manipulation from untrusted clients.

> > > org.freedesktop.DBus.ReloadConfig
> > > org.freedesktop.DBus.StartServiceByName
> > > org.freedesktop.DBus.UpdateActivationEnvironment
> > > 
> > > Most of those would be just convenience for other, existing kdbus
> > > low-level calls, but ReloadConfig and UpdateActivationEnvironment are
> > > not available anywhere else. It's true that there's nothing stopping
> > > more CAP_IPC_OWNER connections from installing more activators, but
> > > the question is whether systemd will provide those for the
> > > activations it holds.
> > 
> > The client side emulation can choose to either forward ReloadConfig
> > and UpdateActivationEnvironment to the respective systemd calls, or
> > just return some "not supported" error.
> 
> Can't do that. What if it's a kdbus system that is not systemd?

Well, again, return "not supported" then. I mean, currently there is
no kdbus userspace implementation beyond systemd, we cannot really
discuss something that doesn't exist...

> I don't mind forwarding to a well-known bus name, as long as we establish that 
> there is such a service running on the bus that will accept those calls. But 
> if such a service exists, why can't it claim the
> org.freedesktop.DBus name?

Note that on dbus1 systemd systems we actually never provided
UpdateActivationEnvironment correctly (since services got forked off
PID 1, instead of dbus-daemon but the call would alter dbus-daemon's
env block, not systemd's one), but nobody ever noticed. I really
think you should just return some "not supported" error or make it a
NOP if you don't want to pass this on to systemd.
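
As a rough sketch of what such client-side handling could look like,
the Python below answers the two calls locally in the send path, so the
reply stays in order with the rest of the client's message stream. The
function name call and the forward callable are hypothetical stand-ins
for a real client library; only the bus name, the member names and the
standard NotSupported error name are taken from D-Bus itself.

```python
DRIVER_NAME = "org.freedesktop.DBus"
NOT_SUPPORTED = "org.freedesktop.DBus.Error.NotSupported"

# Driver calls this sketch refuses instead of passing them on.
UNSUPPORTED_MEMBERS = frozenset({"ReloadConfig", "UpdateActivationEnvironment"})


def call(destination, member, forward):
    """Issue a method call, emulating unsupported driver calls locally.

    `forward(destination, member)` is a hypothetical callable that puts
    the message on the real transport and returns its reply.
    """
    if destination == DRIVER_NAME and member in UNSUPPORTED_MEMBERS:
        # Answered synchronously on the client side; nothing hits the bus.
        return ("error", NOT_SUPPORTED)
    return forward(destination, member)
```

A client that prefers the NOP variant would return an empty success
reply at the same place instead of the error tuple.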

> > if you want to create a new endpoint for an existing bus, then invoke
> > that ioctl on the bus fd. The control file after all is unrelated to
> > any bus, and thus wouldn't know which bus you mean if we'd allow
> > invoking that ioctl on it.
> 
> Ok, so any application that connected to the "bus" bus can then create custom 
> endpoints. Correct?

You need privs (either CAP_IPC_OWNER or matching uid) for that.

> How does one get to install policies or activators on this custom bus if the 
> opening connection is a regular, non-privileged process?

The policy you can specify when you open the custom EP... (not sure I
grok the question though).

> > > But if that's the case, how would one implement a peer-to-peer connection?
> > > Or should it simply be a convention that P2P connections are really
> > > regular buses, except that no one owns any names, there are no policy
> > > restrictions and that the only two connections are :1.1 and :1.2?
> > 
> > kdbus is not for peer-to-peer connections. If you want that use
> > AF_UNIX.
> 
> Why?

What's the usecase?

I mean you can fake p2p connections by allocating a bus and only
connecting two peers to it (busses are relatively cheap now), but I am
not sure why.
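
For comparison, the plain AF_UNIX route suggested earlier needs no bus
at all. A minimal Python sketch, assuming both peers live in (or are
forked from) the same process so that socketpair() can be used; the
helper names are made up for this example:

```python
import socket


def recv_exact(sock, size):
    """Read exactly `size` bytes from a stream socket (or fewer on EOF)."""
    chunks = []
    while size > 0:
        chunk = sock.recv(size)
        if not chunk:
            break
        chunks.append(chunk)
        size -= len(chunk)
    return b"".join(chunks)


def p2p_roundtrip(payload):
    """Send `payload` over a direct AF_UNIX link and return what arrives."""
    left, right = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        left.sendall(payload)
        return recv_exact(right, len(payload))
    finally:
        left.close()
        right.close()
```

There are exactly two peers and no names or policy involved, which is
all a direct link offers.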

> > There's really no need for peer-to-peer connections, at least
> > performance-wise.
> 
> The need is that we can avoid loading the code that does AF_UNIX transport if 
> we detect a kdbus-capable bus. It would be nice to use kdbus for P2P
> too.

Well, I doubt the usecase for direct links.

I mean, the reasons for peer-to-peer links I am aware of are:

a) performance 
b) network transparency
c) IPC before dbus-daemon is around

a) and c) don't apply on kdbus anymore. And as for b), kdbus is
inherently not a network transport, hence you have to use AF_INET
there anyway.

> Do you see any reason why we couldn't (ab)use the custom endpoints for P2P? 
> Are the unique connection IDs shared among all custom endpoints of the bus or 
> are they reset to 1?
> 
> Also, is there any way to ask an endpoint to stop accepting new connections 
> without tearing down the existing ones?

You could just take away the access bits.

> > > if that is so, how does the activator read past the activation message to
> > > get to the next one, without dropping it?
> > 
> > Why would it want that? The idea is that the activator actually
> > *never* really processes any message. It just waits for POLLIN, then
> > stops listening for POLLIN and activates the daemon, which then
> > processes the messages.
> 
> Because I thought that the activator may be one process for all possible 
> services. I'm guessing this is not the way you'd envisioned it. Otherwise, if 
> you have 200 activatable services, there are 200 connections by one or more 
> process. There's no bus daemon to run out of fd's here, but they would count 
> towards the user's system-wide file descriptor limit.

Yes, systemd maintains one fd per bus-activatable name, that is
correct. And it bumps the RLIMIT_NOFILE limit to make sure that works.
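
For a process that holds many fds this way, bumping the limit can be
sketched as below. This is not systemd's code, just a Python
illustration using the standard resource module; it only raises the
soft limit up to the existing hard limit, which works without
privileges (raising the hard limit itself needs CAP_SYS_RESOURCE). The
function name is made up for this example.

```python
import resource


def raise_nofile_soft_limit():
    """Raise the soft RLIMIT_NOFILE to the hard limit; return the new pair."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft != hard:
        # Unprivileged processes may raise the soft limit up to the hard one.
        resource.setrlimit(resource.RLIMIT_NOFILE, (hard, hard))
    return resource.getrlimit(resource.RLIMIT_NOFILE)
```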

> > > The docs say that it only succeeds if there are no more messages, at which
> > > point no further messages will be accepted. There doesn't seem to be a way
> > > of doing a shutdown()-equivalent: stop reception of new messages but
> > > still process the queued ones.
> > 
> > What's the precise usecase for this?
> 
> "I've been requested to exit, so I am going to exit now" This tells the kernel 
> to stop sending me messages, so I am able to exit. If there are more after 
> this, they'll be queued for the activator again, if there's one, rejected 
> otherwise.

Well, but you could just process what you want, and not read from the
fd anymore. Then you exit, leaving the messages in the fd unread. The
kernel will then activate the process again and pass the new messages
to it. I am not really sure what the usecase is for telling the kernel
explicitly that you don't want more messages...

> KDBUS_CMD_BYEBYE seems to be something an on-demand service would use after 
> every message it receives. If the call succeeds, it exits; if it fails, it 
> parses more. But that doesn't take into account the request to exit coming 
> from the user, since it could never do that if more messages kept getting 
> received.

Yeah, byebye is for exit-on-idle processes (though I'd really wait for
a while before shutting down, rather than doing that immediately after
each message...)
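
The "wait for a while before shutting down" part can be sketched with a
plain poll() timeout. In the Python below an AF_UNIX datagram socket
stands in for the bus fd, and the function name and parameters are
invented for this example; a real service would issue KDBUS_CMD_BYEBYE
at the point marked in the comment and go back to processing if that
call fails because more messages are queued.

```python
import select
import socket


def serve_until_idle(sock, handle, idle_timeout_ms=200):
    """Process messages until none arrive for `idle_timeout_ms`.

    Returns the number of messages handled. `sock` is a stand-in for
    the bus connection; `handle` is called with each message's bytes.
    """
    poller = select.poll()
    poller.register(sock, select.POLLIN)
    handled = 0
    while True:
        events = poller.poll(idle_timeout_ms)
        if not events:
            # Idle long enough: this is where an exit-on-idle service
            # would say byebye and exit (assumption, see lead-in).
            return handled
        for _fd, revents in events:
            if not revents & select.POLLIN:
                # Hangup or error on the connection: stop serving.
                return handled
            handle(sock.recv(4096))
            handled += 1
```

Usage, with a socketpair standing in for the bus: send two datagrams
into one end, call serve_until_idle() on the other, and it returns 2
once the idle timeout has passed without further traffic.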

> A solution for that is to update the timeout with the remaining time like 
> select(2) does (though glibc hides that). This is a Linux-specific syscall, so 
> there's no POSIX compatibility to take into account.

Well, kernel folks had to hack that into select(), but they really
don't like it, and we have to abide.
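
If the remaining time is not reported back, a caller can still track it
itself by keeping an absolute deadline on CLOCK_MONOTONIC and
recomputing what is left before each retry. A small Python sketch under
that assumption; the helper names are made up, and now_ns is only
injectable to make the arithmetic easy to check:

```python
import time


def monotonic_ns():
    """Current CLOCK_MONOTONIC time in nanoseconds."""
    return time.clock_gettime_ns(time.CLOCK_MONOTONIC)


def deadline_ns(timeout_ns, now_ns=None):
    """Turn a relative timeout into an absolute monotonic deadline."""
    now_ns = monotonic_ns() if now_ns is None else now_ns
    return now_ns + timeout_ns


def remaining_ns(deadline, now_ns=None):
    """Time left until `deadline`, clamped at zero once it has passed."""
    now_ns = monotonic_ns() if now_ns is None else now_ns
    return max(0, deadline - now_ns)
```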

> > > PS: the documentation says that it's on CLOCK_MONOTONIC, but glibc
> > > does not define _POSIX_MONOTONIC_CLOCK to be larger than zero. That
> > > implies that there are Linux systems where no monotonic clock is
> > > present. Either kdbus or glibc needs to be fixed.
> > 
> > No, the monotonic clock is *not* optional on Linux.
> 
> Then glibc should be fixed to have _POSIX_MONOTONIC_CLOCK set to 200809L. That 
> saves us a sysconf() call to verify whether it's present or not.
> 
> http://osxr.org/glibc/source/sysdeps/unix/sysv/linux/bits/posix_opt.h#0093
> http://osxr.org/glibc/source/nptl/sysdeps/unix/sysv/linux/bits/posix_opt.h#0161
> 
> if you know someone influential there to make it happen, it would be most 
> welcome.

File a bug to glibc.

> > > === Wildcards ===
> > > 
> > > Are you sure that * not matching a dot is a good idea? What is the
> > > rationale behind it?
> > 
> > Hmm, what precisely is this about? Wildcards in what?
> 
> Just wondering why the * does not match the dot. I'd assume the more common 
> case is to match a full prefix and that includes matching dots.

Hmm? * in what precisely? missing the context here...

> > > === KDBUS_ATTACH_NAMES ===
> > > 
> > > Documentation for metadata says that userspace must cope with some
> > > metadata not being delivered. Can we at least require that
> > > KDBUS_ATTACH_NAMES be delivered if requested? If the cookie in the
> > > match rule isn't provided in the message reception, having the
> > > source's names would help solve the problem of the signal delivery.
> > > 
> > > The timestamp should also be mandatory.
> > 
> > Yes, they are mandatory. Process credentials might be suppressed
> > however, for example if they cannot be translated due to namespaces.
> 
> Thanks. Could you clarify in the docs?

Daniel, David? Could you add a note about this?

Lennart

-- 
Lennart Poettering, Red Hat