Abstract unix sockets and session socket address

Wed Dec 17 00:40:01 PST 2014

On Tue, 2014-12-16 at 08:41 -0800, Thiago Macieira wrote:
> On Tuesday 16 December 2014 10:39:32 Alexander Larsson wrote:
> > I'm currently working on a desktop "app" system using container
> > technologies, and I'm running into an issue with dbus use of abstract
> > sockets. In the long run I want to do fully sandboxed apps, which
> > implies kdbus. However, at the moment I want to just use the container
> > aspect to ease deployment of apps (use a separate runtime for the app
> > and the host), and as such I want the apps to be able to talk to dbus.
> > 
> > In general, abstract sockets are a bad idea whenever namespaces are
> > involved. Abstract sockets exist in a global namespace for each network
> > namespace in use. This means that you can't have an app in its own
> > network namespace and still talk to the session bus. It also means that
> > if you're sharing the network namespace with the host there is no way to
> > disallow the app access to the session bus (or any other service on the
> > host using abstract sockets).
> 
> I'm not sure I understand you here. If you're in the same network namespace, 
> shouldn't you be allowed to access all the networking resources of that 
> namespace? Conversely, if you have a different network namespace, resources may 
> or may not be available depending on how the namespacing is done.
> 
> That said, the session bus socket is a network resource. If an app is in a 
> different namespace, it stands to reason it may not be allowed to access other 
> namespaces' resources.

You're not wrong. In the Linux kernel the above is how abstract sockets
work. However, if you think of the various container facilities in the
kernel as building blocks to implement the kind of containment you want,
then this is often problematic.

For instance, you commonly want to limit an application so that it can't
send or receive anything on the network (and by this you generally mean
"the internet"). The way you do this is to create a new network
namespace and then set that up however you want (only loopback, bridge
to host, NAT, firewall setup, etc).

However, if you do this you are automatically limiting your ability to
use abstract sockets to talk to the host. One casualty is dbus, which on
Linux *always* uses abstract sockets for "unix:tmpdir=" addresses such
as the default session bus. So, there is *no* way in the kernel to limit
the network access and yet still allow you to talk to the default
session bus.
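
To make the distinction concrete, here is a rough sketch (not code from
dbus itself) of how a container setup tool could check which kind of
unix socket a bus address points at. The function names and the example
addresses are made up for illustration; the only thing assumed from
dbus is the general address syntax of "transport:key=value,key=value"
entries separated by ";", with %XX escapes in values.

```python
from urllib.parse import unquote


def parse_dbus_address(address):
    """Split a D-Bus address string into a list of (transport, params)."""
    entries = []
    for entry in address.split(";"):
        if not entry:
            continue
        transport, _, params = entry.partition(":")
        kv = {}
        for pair in params.split(","):
            if pair:
                key, _, value = pair.partition("=")
                kv[key] = unquote(value)  # values may contain %XX escapes
        entries.append((transport, kv))
    return entries


def socket_kind(address):
    """Classify the first entry: 'abstract', 'path', or 'other'."""
    transport, kv = parse_dbus_address(address)[0]
    if transport != "unix":
        return "other"
    if "abstract" in kv:
        return "abstract"  # tied to the network namespace
    if "path" in kv:
        return "path"      # a file that can be bind-mounted
    return "other"
```

With something like this, a launcher could inspect
DBUS_SESSION_BUS_ADDRESS and tell up front whether the bus is reachable
from a new network namespace at all.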

Conversely, you may want to allow full network access to an app, but
*not* allow it access to the global namespace of all abstract sockets
(with no permissions checks). But this is also impossible.
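
As a small Linux-only illustration of what "no permissions checks"
means in practice: an abstract socket is just a name starting with a
NUL byte, and any process in the same network namespace that knows the
name can connect to it. The helper names below are made up for this
sketch.

```python
import socket


def bind_abstract(name):
    """Listen on a Linux abstract socket (address starts with a NUL byte)."""
    srv = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    srv.bind(b"\0" + name.encode())
    srv.listen(1)
    return srv


def connect_abstract(name):
    """Connect by name alone; there is no file and no mode bits to check."""
    cli = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    cli.connect(b"\0" + name.encode())
    return cli
```

Nothing shows up in the filesystem for such a socket, so there is
nothing to chmod, hide, or bind-mount.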

So, basically, if you want to play nicely with contained applications,
using abstract socket addresses is likely to get in your way.

> > Regular non-abstract sockets are a much better fit for this. Since they
> > exist in the regular filesystem tree they are naturally namespaced via
> > the filesystem namespace, and you can easily "transplant" any particular
> > socket from one namespace to the other using things like bind mounts. It
> > also allows filesystem permission checks on the sockets.
> 
> Unless you have a different filesystem namespace, in which case it will break. I 
> don't see how one is different from the other here.

This, however, is not true. If you are in a different filesystem namespace
you can still access a regular unix socket, as long as you can access
the file and it has the right file permissions. When you create a new
filesystem namespace you get your own *view* of the files, but they are
still there. You can still access all the same unix sockets. 
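
For comparison, a minimal sketch of the filesystem-based case: the
socket is an ordinary directory entry with ordinary mode bits, and a
client reaches it by path. The file name "bus.sock" and the helper
names are hypothetical, and a real server would also want to avoid the
window between bind() and chmod().

```python
import os
import socket
import stat
import tempfile


def make_path_socket(directory, mode=0o600):
    """Listen on a filesystem socket and restrict it with file permissions."""
    path = os.path.join(directory, "bus.sock")
    srv = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    srv.bind(path)          # creates the socket file
    os.chmod(path, mode)    # normal permission bits apply to it
    srv.listen(1)
    return srv, path


def is_socket_file(path):
    """True if the path is a unix socket as seen through the filesystem."""
    return stat.S_ISSOCK(os.stat(path).st_mode)
```

Whoever can see that path (and passes the permission check) can
connect, regardless of which network namespace they are in.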

The natural next step of the setup is to modify the new namespace,
setting up new mounts and at the end pivot-rooting into a new root.
During this setup you can choose exactly which unix sockets from the host
should be visible in the app fs namespace, by using bind-mounts. This
makes filesystem-based sockets much more flexible when doing various
kinds of containment.

So, if the session bus is a regular socket then it's up to the container
setup whether the app has access to it or not, and this is completely
independent of the network isolation you choose for the app.