[systemd-devel] Docker vs PrivateTmp

Lars Kellogg-Stedman lars at redhat.com
Sat Jan 17 20:02:01 PST 2015


Hello all,

With systemd 216 on Fedora 21 (kernel 3.17.8), I have run into an odd
behavior concerning the PrivateTmp directive, and I am looking for
help classifying it as one of the following:

- Everything Is Working As Designed, Citizen
- A bug in Docker (some mount flag is being set incorrectly?)
- A bug in systemd's PrivateTmp behavior
- Something Completely Different

The TL;DR is that restarting a service with PrivateTmp=true appears to
preserve, inside the service's own mount namespace, references to any
mounts that were active in the parent mount namespace at the time the
service was started.  If these mounts are later unmounted in the parent
namespace, the references persist in the child mount namespace, which
means, among other things, that the mountpoint cannot be deleted
("Device or resource busy").

This seems to be approximately the same issue described in
https://bugzilla.redhat.com/show_bug.cgi?id=851970, but that bug is
two years old and closed.
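
Regarding the mount-flag possibility listed above: whether an unmount
in the parent namespace reaches a child namespace depends on the
propagation type of the mount, which the kernel exposes as the optional
fields (shared:N, master:N, ...) of /proc/PID/mountinfo.  Below is a
minimal sketch for printing those fields for a single mountpoint.  The
function name `mount_propagation` and its optional second argument (a
mountinfo file to read, handy for trying it on a saved copy) are made up
for illustration; a mount with no optional fields is reported as
"private".

```shell
# Print the propagation tags of MOUNTPOINT as recorded in a mountinfo file.
# mount_propagation is a hypothetical helper name, not part of any tool.
# $1: mountpoint path, $2: mountinfo file (defaults to /proc/self/mountinfo)
mount_propagation() {
    mountpoint=$1
    mountinfo=${2:-/proc/self/mountinfo}
    awk -v mp="$mountpoint" '
        # Field 5 is the mountpoint; optional fields start at field 7
        # and run until the "-" separator.
        $5 == mp {
            flags = ""
            for (i = 7; i <= NF && $i != "-"; i++)
                flags = flags (flags == "" ? "" : " ") $i
            print (flags == "" ? "private" : flags)
        }' "$mountinfo"
}
```

Comparing the output for the Docker mountpoint in /proc/self/mountinfo
and in /proc/PID/mountinfo of the restarted service might show whether
the two copies of the mount are still linked for propagation.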

Here's how I encountered the problem:

Assuming that your Docker is configured to use the `devicemapper`
storage driver, start a Docker container.  Any container will do, e.g.:

    # cid=$(docker run -d larsks/thttpd)

See the `devicemapper` mountpoint created by Docker for the container:

    # grep devicemapper/mnt /proc/mounts
    /dev/mapper/docker-253:6-98310-e68df3f45d6151259ce84a0e467a3117840084e99ef3bbc654b33f08d2d6dd62 /var/lib/docker/devicemapper/mnt/e68df3f45d6151259ce84a0e467a3117840084e99ef3bbc654b33f08d2d6dd62 ext4 rw,context="system_u:object_r:svirt_sandbox_file_t:s0:c261,c1018",relatime,discard,stripe=16,data=ordered 0 0

Now restart a service -- any service! -- that has "PrivateTmp=true":

    # systemctl restart systemd-machined

Get the PID for that service:

    # systemctl status systemd-machined | grep PID
     Main PID: 18698 (systemd-machine)

And see that the Docker "devicemapper" mount is visible inside the
mount namespace for this process:

    # grep devicemapper/mnt /proc/18698/mounts
    /dev/mapper/docker-253:6-98310-e68df3f45d6151259ce84a0e467a3117840084e99ef3bbc654b33f08d2d6dd62 /var/lib/docker/devicemapper/mnt/e68df3f45d6151259ce84a0e467a3117840084e99ef3bbc654b33f08d2d6dd62 ext4 rw,context="system_u:object_r:svirt_sandbox_file_t:s0:c261,c1018",relatime,discard,stripe=16,data=ordered 0 0
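
To confirm that the service really is running in a mount namespace of
its own, the namespace links under /proc can be compared.  This is only
a sketch: `same_mount_ns` is a hypothetical helper name, and the third
argument (an alternate proc root) exists purely so the function can be
tried against a fixture directory.

```shell
# Succeed (exit 0) if two PIDs share a mount namespace, fail (exit 1) if
# they do not, and return 2 if either namespace link cannot be read.
# same_mount_ns is a hypothetical helper name.
# $1, $2: PIDs to compare, $3: proc root (defaults to /proc)
same_mount_ns() {
    proc_root=${3:-/proc}
    # Each link reads like "mnt:[4026531840]"; equal targets mean the
    # two processes are in the same mount namespace.
    a=$(readlink "$proc_root/$1/ns/mnt") || return 2
    b=$(readlink "$proc_root/$2/ns/mnt") || return 2
    [ "$a" = "$b" ]
}
```

Run as root, `same_mount_ns 1 18698` would be expected to fail with
status 1 for the service restarted above, since PrivateTmp places it in
a separate mount namespace.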

Attempt to destroy the container:

    # docker rm -f $cid

Watch Docker fail to destroy the container because it is unable to
remove the mountpoint directory:

    Jan 17 22:43:03 pk115wp-lkellogg docker-1.4.1-dev[18239]:
    time="2015-01-17T22:43:03-05:00" level="error" msg="Handler for DELETE
    /containers/{name:.*} returned error: Cannot destroy container e68df3f45d61:
    Driver devicemapper failed to remove root filesystem
    e68df3f45d6151259ce84a0e467a3117840084e99ef3bbc654b33f08d2d6dd62: Device is
    Busy"

The removal fails because, although that mount is gone from the global
namespace:

    # grep devicemapper/mnt /proc/mounts

It still exists inside the mount namespace for the service we restarted:

    # grep devicemapper/mnt /proc/18698/mounts
    /dev/mapper/docker-253:6-98310-e68df3f45d6151259ce84a0e467a3117840084e99ef3bbc654b33f08d2d6dd62 /var/lib/docker/devicemapper/mnt/e68df3f45d6151259ce84a0e467a3117840084e99ef3bbc654b33f08d2d6dd62 ext4 rw,context="system_u:object_r:svirt_sandbox_file_t:s0:c261,c1018",relatime,discard,stripe=16,data=ordered 0 0
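
The restarted service is not necessarily the only process holding such
a reference.  A rough way to list every process whose mount table still
contains the mountpoint is to repeat the grep above across all of
/proc.  In this sketch, `pids_holding_mount` is a hypothetical helper
name, and the optional second argument (an alternate proc root) is
there only so it can be exercised against a fixture directory.

```shell
# Print the PID of every process whose mounts file lists a mountpoint
# containing the substring PATTERN.
# pids_holding_mount is a hypothetical helper name.
# $1: substring to look for, $2: proc root (defaults to /proc)
pids_holding_mount() {
    pattern=$1
    proc_root=${2:-/proc}
    for mounts in "$proc_root"/[0-9]*/mounts; do
        # Skip processes that exited or that we may not inspect.
        [ -r "$mounts" ] || continue
        # Field 2 of each line in a mounts file is the mountpoint.
        if awk -v pat="$pattern" \
            'index($2, pat) { found = 1 } END { exit !found }' \
            "$mounts" 2>/dev/null; then
            pid=${mounts%/mounts}
            echo "${pid##*/}"
        fi
    done
}
```

Run as root after the failed `docker rm`, `pids_holding_mount
devicemapper/mnt` should print the PID of each process that still sees
the mount, which identifies the services that would need a restart.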

The only solution I have found is to restart the service holding these
references:

    # systemctl restart systemd-machined

Now the mountpoint can be deleted.

Thanks,

-- 
Lars Kellogg-Stedman <lars at redhat.com> | larsks @ {freenode,twitter,github}
Cloud Engineering / OpenStack          | http://blog.oddbit.com/
