[systemd-devel] systemd shutdown vs ostree

Wed Jul 24 06:19:16 PDT 2013

On Sat, Jul 20, 2013 at 06:50:13PM -0400, Colin Walters wrote:
> So OSTree sets up systemd inside a chroot - /usr is a read-only bind
> mount, and /var is a bind mount outside the root to a shared location.
> Furthermore, /sysroot points to the real root.
> 
> Since last time we discussed this:
> http://lists.freedesktop.org/archives/systemd-devel/2012-September/006668.html
> I now use this service inside dracut:
> https://git.gnome.org/browse/ostree/tree/src/dracut/ostree-prepare-root.service
> Which executes:
> https://git.gnome.org/browse/ostree/tree/src/switchroot/ostree-prepare-root.c
> 
> Then finally we do dracut's normal systemctl switch-root, and everything
> continues as normal.  I haven't had to patch the systemd codebase at all
> for this.
> 
> The problem is that on shutdown, systemd will synthesize usr.mount and
> var.mount from /proc/self/mountinfo, but it can't really unmount them
> until the same point as the rootfs.  Because these units fail to
> unmount, the normal shutdown process wedges.
> 
> I can shutdown fine with systemctl --force poweroff, but then I don't
> get plymouth integration etc.
> 
> One way to fix this might be to somehow tell systemd to just ignore
> these mount points during shutdown.  Or possibly, switch back to the
> initramfs and unmount them from there.
> 
> The ugly thing about switching back to the initramfs is that it requires
> unpacking it from the cpio blob again, which requires /boot to be
> mounted, only to run a few unmount syscalls, and then finally power off.
> 
> But if there was a way to tell systemd to just ignore the mounts, then
> we'd drop into the final poweroff SIGTERM/SIGKILL/umount spree like
> sysvinit did, and things would work.
> 
> Anyone else doing bind mount tricks like this?

A while back I had a similar kind of problem with LXC: the original FS
had something mounted at, say, /foo/bar/wizz, and then libvirt bind
mounted something at /foo, making /foo/bar/wizz inaccessible. systemd
would still see these over-mounted mounts and fail to unmount them at
shutdown. I fixed libvirt LXC to remove all sub-mounts before bind
mounting the new thing at /foo, so I'm not sure whether the problems I
saw with systemd still exist.
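
To make the over-mount case concrete, here is a rough sketch of how one
might spot such hidden mounts from mountinfo-style text. It is only an
illustration: the helper names (parse_mountinfo, find_shadowed) and the
sample data are made up for this example, and the heuristic (another
mount that is not an ancestor sits at the same path or at a parent path)
is a simplification, not what systemd or libvirt actually do.

```python
import re
from typing import NamedTuple


class Mount(NamedTuple):
    mount_id: int
    parent_id: int
    mount_point: str


def _unescape(field):
    # mountinfo encodes spaces and similar characters as octal, e.g. \040
    return re.sub(r"\\([0-7]{3})", lambda m: chr(int(m.group(1), 8)), field)


def parse_mountinfo(text):
    """Parse mountinfo-style lines into Mount records (hypothetical helper)."""
    mounts = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 5:
            continue
        # Field 1: mount ID, field 2: parent ID, field 5: mount point
        mounts.append(Mount(int(fields[0]), int(fields[1]), _unescape(fields[4])))
    return mounts


def _is_under(path, prefix):
    # True if path equals prefix or lies below it in the directory tree
    if prefix == "/":
        return True
    return path == prefix or path.startswith(prefix + "/")


def find_shadowed(mounts):
    """Return mounts hidden by a non-ancestor mount at the same or a parent path."""
    by_id = {m.mount_id: m for m in mounts}

    def ancestors(m):
        seen = set()
        cur = by_id.get(m.parent_id)
        while cur is not None and cur.mount_id not in seen:
            seen.add(cur.mount_id)
            cur = by_id.get(cur.parent_id)
        return seen

    shadowed = []
    for a in mounts:
        anc = ancestors(a)
        for b in mounts:
            if b.mount_id == a.mount_id or b.mount_id in anc:
                continue
            if _is_under(a.mount_point, b.mount_point):
                shadowed.append(a)
                break
    return shadowed


# Made-up sample: /foo/bar/wizz was mounted first, then /foo was bind
# mounted over its parent directory.
SAMPLE = """\
20 1 8:1 / / rw - ext4 /dev/sda1 rw
21 20 8:2 / /foo/bar/wizz rw - ext4 /dev/sda2 rw
22 20 8:3 /data /foo rw - ext4 /dev/sda3 rw
23 20 0:5 / /proc rw - proc proc rw
"""

print([m.mount_point for m in find_shadowed(parse_mountinfo(SAMPLE))])
# -> ['/foo/bar/wizz']
```

On a live system the same functions could be fed the contents of
/proc/self/mountinfo instead of the sample string.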

A change was also proposed yesterday for kernel namespaces that would
make it possible to stop a process inside a container from unmounting
things that weren't originally mounted inside the namespace. If that
is merged, systemd inside a container won't be able to assume it has
the privileges to unmount every filesystem it can see.

Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|