[systemd-devel] Docker vs PrivateTmp

Lars Kellogg-Stedman lars at redhat.com
Mon Jan 19 08:33:42 PST 2015


On Sat, Jan 17, 2015 at 11:02:01PM -0500, Lars Kellogg-Stedman wrote:
> The TL;DR is that restarting a service with PrivateTmp=true appears to
> preserve references to any mounts in the parent mount namespace that
> were active at the time the service was started.  If these mounts are
> later unmounted in the parent namespace, the reference persists in the
> child mount namespace, which means among other things that the
> mountpoint cannot be deleted ("Device or resource busy")...

While I think we've probably identified the solution, I'm still trying
to understand how we get into this situation in the first place.

With neither `MountFlags` nor `PrivateTmp` specified in my docker.service,
starting a container results in the following mount visible in the global mount
namespace:

    global# grep /mnt /proc/self/mountinfo
    685 433 253:22 / /var/lib/docker/devicemapper/mnt/297bf7ae64bd5cf552b45b098b22df85a49deeadb2d71b330e2f866dac95a448 rw,relatime - ext4 /dev/mapper/docker-253:6-98310-297bf7ae64bd5cf552b45b098b22df85a49deeadb2d71b330e2f866dac95a448 rw,context="system_u:object_r:svirt_sandbox_file_t:s0:c138,c268",discard,stripe=16,data=ordered

If I create a new mount namespace (as a child of the global namespace) with
`unshare -m`, I can as expected see the same mount:

    unshare# grep /mnt /proc/self/mountinfo
    805 804 253:22 / /var/lib/docker/devicemapper/mnt/297bf7ae64bd5cf552b45b098b22df85a49deeadb2d71b330e2f866dac95a448 rw,relatime - ext4 /dev/mapper/docker-253:6-98310-297bf7ae64bd5cf552b45b098b22df85a49deeadb2d71b330e2f866dac95a448 rw,context="system_u:object_r:svirt_sandbox_file_t:s0:c138,c268",discard,stripe=16,data=ordered

If I attempt to stop that container, the mount disappears from the global
namespace:

    global# grep /mnt /proc/self/mountinfo
    global#

But it is still visible in the mount namespace I created with unshare:

    unshare# grep /mnt /proc/self/mountinfo 
    805 804 253:22 / /var/lib/docker/devicemapper/mnt/297bf7ae64bd5cf552b45b098b22df85a49deeadb2d71b330e2f866dac95a448 rw,relatime - ext4 /dev/mapper/docker-253:6-98310-297bf7ae64bd5cf552b45b098b22df85a49deeadb2d71b330e2f866dac95a448 rw,context="system_u:object_r:svirt_sandbox_file_t:s0:c138,c268",discard,stripe=16,data=ordered

What is causing this behavior? I have tried to replicate it by hand through a
combination of `mount` and `unshare`, and the only way I can get a mount to
persist in the unshare namespace after being unmounted in the global namespace
is by explicitly calling `mount --make-rprivate /` *inside* the unshare
namespace, which is obviously not happening in the above Docker example.
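
Since the question comes down to mount propagation, it may help to compare
the propagation state of each mount in both namespaces. As documented in
proc(5), a mountinfo line carries zero or more optional fields between the
mount options (field 6) and the `-` separator, such as `shared:N`,
`master:N`, `propagate_from:N` or `unbindable`; a mount with none of them is
private. The following is only a minimal sketch for listing that state. The
helper name `propagation` is made up for illustration and is not part of any
existing tool.

```shell
#!/bin/sh
# propagation: hypothetical helper (the name is an assumption, not a real tool).
# Reads lines in /proc/<pid>/mountinfo format on stdin and prints
# "<mount point> <propagation>" for each line. The propagation column is the
# comma-joined list of optional fields (field 7 up to the "-" separator),
# or "private" when the line has no optional fields.
propagation() {
    awk '{
        tags = ""
        sep = ""
        for (i = 7; i <= NF && $i != "-"; i++) {
            tags = tags sep $i
            sep = ","
        }
        if (tags == "")
            tags = "private"
        print $5, tags
    }'
}

# Example (run once in each namespace and compare the results):
#   propagation < /proc/self/mountinfo | grep /mnt
```

Running it in the global namespace and again in the unshare namespace should
show whether a given mount is shared, a slave, or private in each place.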

Thanks,

-- 
Lars Kellogg-Stedman <lars at redhat.com> | larsks @ {freenode,twitter,github}
Cloud Engineering / OpenStack          | http://blog.oddbit.com/
