[systemd-devel] Why does nspawn need two child processes?

Lennart Poettering mzxreary at 0pointer.de
Wed Jun 7 08:04:49 UTC 2017

On Wed, 31.05.17 20:40, Luke Shumaker (lukeshu at lukeshu.com) wrote:

> So my question becomes: what has to be done *after* unsharing the
> mount namespace, but *before* unsharing the PID namespace?

The various types of namespaces are not orthogonal, even though they
are exposed as supposedly independent bits in the clone() flags
parameter: if a new namespace (in particular a file system namespace,
CLONE_NEWNS, or a PID namespace, CLONE_NEWPID) is created at the same
time as a CLONE_NEWUSER user namespace, then that namespace will be
"owned" by the user namespace. That has various effects, in particular
on who may mount/umount mount points in that namespace and on what is
exposed in /proc. There are some mounts we never want the host to see,
but which the container itself must not be able to modify either, for
example the container's root directory (which is mounted to a
temporary subdirectory of /tmp). Hence we set them up in a new file
system namespace that is neither the host's nor the container's own,
but is inherited into the container: i.e. between the two CLONE_NEWNS.
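
As a rough illustration of that ordering, here is a minimal sketch. It
is not nspawn's actual code: it uses unshare() instead of clone() to
stay short, and the names OUTER_FLAGS, INNER_FLAGS, try_unshare() and
run_stages() are made up for this example. It only attempts the two
steps in a throwaway child and reports the errno of each, since without
sufficient privileges one or both of them are expected to fail.

```python
import ctypes
import errno
import os

# Flag values as defined in <linux/sched.h>.
CLONE_NEWNS = 0x00020000
CLONE_NEWUSER = 0x10000000
CLONE_NEWPID = 0x20000000

# Stage 1 (hypothetical name): a file system namespace only. It is
# created without CLONE_NEWUSER, so it stays owned by the caller's
# user namespace rather than by the container's.
OUTER_FLAGS = CLONE_NEWNS

# Stage 2 (hypothetical name): the user namespace together with the
# namespaces it should own. The kernel creates the user namespace
# first, and the other new namespaces are then owned by it.
INNER_FLAGS = CLONE_NEWUSER | CLONE_NEWPID | CLONE_NEWNS


def try_unshare(flags):
    """Call unshare(2) through libc; return 0 on success, else errno."""
    try:
        libc = ctypes.CDLL(None, use_errno=True)
        unshare = libc.unshare
    except (OSError, AttributeError):
        return errno.ENOSYS  # no usable unshare() on this platform
    unshare.argtypes = [ctypes.c_int]
    if unshare(flags) == 0:
        return 0
    return ctypes.get_errno()


def run_stages():
    """Attempt both stages in a child so the caller stays untouched.

    Returns {"outer": errno, "inner": errno} (0 means success), or an
    empty dict if the child could not report back.
    """
    rfd, wfd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: report the results and never return to the caller.
        try:
            os.close(rfd)
            outer = try_unshare(OUTER_FLAGS)
            # A container manager would do its protected mounts here,
            # between the two steps (nothing is mounted in this sketch).
            inner = try_unshare(INNER_FLAGS)
            # Note: unshare(CLONE_NEWPID) only affects children created
            # afterwards, the calling process keeps its PID namespace.
            os.write(wfd, f"{outer} {inner}".encode())
        finally:
            os._exit(0)
    os.close(wfd)
    with os.fdopen(rfd) as pipe:
        fields = pipe.read().split()
    os.waitpid(pid, 0)
    if len(fields) != 2:
        return {}
    return {"outer": int(fields[0]), "inner": int(fields[1])}


if __name__ == "__main__":
    print(run_stages())
```

Anything mounted between the two calls lives in a namespace the host
does not see, while the namespace created by the second call starts out
as a copy of it and is the one owned by the new user namespace.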

I hope that makes sense?


Lennart Poettering, Red Hat
