[systemd-devel] Docker vs PrivateTmp

Lennart Poettering lennart at poettering.net
Thu Jan 22 19:02:07 PST 2015


On Sat, 17.01.15 23:02, Lars Kellogg-Stedman (lars at redhat.com) wrote:

> See the `devicemapper` mountpoint created by Docker for the container:
> 
>     # grep devicemapper/mnt /proc/mounts
>     /dev/mapper/docker-253:6-98310-e68df3f45d6151259ce84a0e467a3117840084e99ef3bbc654b33f08d2d6dd62
>     /var/lib/docker/devicemapper/mnt/e68df3f45d6151259ce84a0e467a3117840084e99ef3bbc654b33f08d2d6dd62
>     ext4
>     rw,context="system_u:object_r:svirt_sandbox_file_t:s0:c261,c1018",relatime,discard,stripe=16,data=ordered
>     0 0

I am not sure why docker makes these mounts visible in the host
namespace at all. This smells like a bug.

> Watch Docker fail to destroy the container because it is unable to remove the mountpoint directory:
> 
>     Jan 17 22:43:03 pk115wp-lkellogg docker-1.4.1-dev[18239]:
>     time="2015-01-17T22:43:03-05:00" level="error" msg="Handler for DELETE
>     /containers/{name:.*} returned error: Cannot destroy container e68df3f45d61:
>     Driver devicemapper failed to remove root filesystem
>     e68df3f45d6151259ce84a0e467a3117840084e99ef3bbc654b33f08d2d6dd62: Device is
>     Busy"

This smells as if Docker incorrectly sets the mount propagation bits
on its own mounts.

It would be good to check /proc/self/mountinfo inside and outside of
docker's own namespace, and to see how the propagation bits are set
for the individual mounts. It's a bit hard to read, but the
interesting bits are in the 7th column of that file (the optional
fields, such as "shared:N" or "master:N").
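
To make that column easier to read, here is a minimal sketch in Python
that pulls the mount point and the propagation tags out of
mountinfo-formatted text. The function name parse_mountinfo is made up
for illustration. It assumes the layout documented in proc(5): zero or
more optional fields starting at the 7th column, terminated by a
single "-". A mount with no optional fields is reported as "private".

```python
def parse_mountinfo(text):
    """Return (mount_point, propagation) pairs from mountinfo-formatted text.

    parse_mountinfo is a hypothetical helper name, not part of any tool.
    """
    result = []
    for line in text.splitlines():
        fields = line.split()
        if "-" not in fields:
            continue  # blank or malformed line
        sep = fields.index("-")        # separator that ends the optional fields
        mount_point = fields[4]        # 5th column: mount point
        optional = fields[6:sep]       # 7th column onwards: propagation tags
        # No tags at all means the mount is private.
        result.append((mount_point, " ".join(optional) or "private"))
    return result
```

Run once on the host and once inside the container's namespace, for
example with parse_mountinfo(open("/proc/self/mountinfo").read()), and
compare the two lists.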

In general: docker should do the equivalent of "mount --make-rslave /"
as the first thing after opening its mount namespace, so that from that
point on mounts and especially *un*mounts propagate from the host into
the container, but not vice versa.
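
For reference, a rough sketch of what "the equivalent" could look like
at the syscall level, written in Python with ctypes. This is not
Docker's code. The function name enter_slave_mount_namespace is
invented for the example, the flag values are copied from the Linux
headers, and the call needs CAP_SYS_ADMIN to succeed.

```python
import ctypes
import os

# Flag values from <linux/mount.h> and <sched.h>.
MS_REC = 16384            # apply to the whole subtree (the "r" in rslave)
MS_SLAVE = 1 << 19        # receive propagation from the master, send none back
CLONE_NEWNS = 0x00020000  # unshare(2): create a new mount namespace


def enter_slave_mount_namespace():
    """Hypothetical helper: new mount namespace, then make "/" rslave."""
    libc = ctypes.CDLL(None, use_errno=True)
    libc.mount.argtypes = [ctypes.c_char_p, ctypes.c_char_p, ctypes.c_char_p,
                           ctypes.c_ulong, ctypes.c_void_p]

    if libc.unshare(CLONE_NEWNS) != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))

    # Same effect as: mount --make-rslave /
    # Source, filesystem type and data are ignored for propagation changes.
    if libc.mount(None, b"/", None, MS_REC | MS_SLAVE, None) != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))
```

Everything the process mounts after this call stays inside its own
namespace, while unmounts done on the host still arrive.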

If they do not invoke that, then the propagation will stay at
"shared", which means the mounts will appear in the host and vice
versa, which is certainly undesired.
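
One way to spot this state, sketched in Python: collect the
"shared:N" peer group IDs from the host's mountinfo and from the
container's, and list the container mounts whose group also exists on
the host. The helper names are made up for illustration, and the
sketch relies on peer group IDs being comparable across namespaces, as
described in mount_namespaces(7).

```python
def shared_peer_groups(mountinfo_text):
    """Map peer group ID -> list of mount points for every shared mount."""
    groups = {}
    for line in mountinfo_text.splitlines():
        # Everything before " - " holds the fixed columns plus optional fields.
        fields = line.split(" - ", 1)[0].split()
        if len(fields) < 6:
            continue  # blank or malformed line
        for tag in fields[6:]:
            if tag.startswith("shared:"):
                group_id = int(tag[len("shared:"):])
                groups.setdefault(group_id, []).append(fields[4])
    return groups


def still_shared(host_text, container_text):
    """Container mount points that sit in a peer group the host also has."""
    host = shared_peer_groups(host_text)
    container = shared_peer_groups(container_text)
    return sorted(mount_point
                  for group_id, mount_points in container.items()
                  if group_id in host
                  for mount_point in mount_points)
```

An empty result from still_shared() is what you would expect after a
recursive make-rslave. Any entry it returns is a mount that still
propagates in both directions.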

Also, they should not use "mount --make-rprivate /", as that means
anything the host mounted will stay mounted in the container forever,
which is a problem.

Also, they really need to make this recursive, so that all mount
points they have access to are detached from the host!

Lennart

-- 
Lennart Poettering, Red Hat

