[systemd-devel] Failure to umount /var at shutdown
mztabzr at 0pointer.de
Thu Oct 23 02:39:38 PDT 2014
On Thu, 23.10.14 11:27, Daniele Nicolodi (daniele at grinta.net) wrote:
> I have a Debian sid system where there is a problem with the unmounting
> of the /var filesystem that causes a delay in the shutdown process:
> > ott 21 10:08:46 nautilus virtualbox: Stopping VirtualBox kernel modules.
> > ott 21 10:08:46 nautilus systemd: Received SIGRTMIN+24 from PID 28503 (kill).
> > ott 21 10:08:46 nautilus systemd: Received SIGRTMIN+24 from PID 28500 (kill).
> > ott 21 10:08:46 nautilus systemd: pam_unix(systemd-user:session): session closed for user lele
> > ott 21 10:08:46 nautilus systemd: pam_unix(systemd-user:session): session closed for user lightdm
> > ott 21 10:10:16 nautilus systemd: user@117.service stop-sigterm timed out. Killing.
> > ott 21 10:10:16 nautilus systemd: Unit user@117.service entered failed state.
> > ott 21 10:10:16 nautilus systemd-udevd: Network interface NamePolicy= disabled on kernel commandline, ignoring.
> > ott 21 10:10:16 nautilus networking: Deconfiguring network interfaces...done.
> > ott 21 10:10:16 nautilus lvm: 5 logical volume(s) in volume group "system" unmonitored
> > ott 21 10:10:16 nautilus umount: umount: /var: target is busy
> > ott 21 10:10:16 nautilus umount: (In some cases useful info about processes that
> > ott 21 10:10:16 nautilus umount: use the device is found by lsof(8) or fuser(1).)
> > ott 21 10:10:16 nautilus systemd: var.mount mount process exited, code=exited status=32
> > ott 21 10:10:16 nautilus systemd: Failed unmounting /var.
> > ott 21 10:10:17 nautilus systemd: Shutting down.
> > ott 21 10:10:17 nautilus systemd-journal: Journal stopped
> As you can see, the umount for /var fails because the filesystem is in
> use and this apparently makes systemd wait for what seems to be a
> 90-second timeout before proceeding with the shutdown.
This is journald's fault, it keeps the log files open and runs until
the very end. It's a known issue. We should fix this by synchronously
moving logging back to /run right before we want to unmount
/var. While this will make this error go away, the logs from that
point on will effectively be lost as /run is of course flushed on
shutdown.
The current behaviour is mostly a cosmetic problem though, as in the
final killing spree journald will be killed after all, and we will do
another unmounting round which gets rid of /var, too. Hence data loss
will not occur.
> First, how can I debug what is going on, namely how can I see which
> process is keeping /var busy? Second, where does the 90 seconds timeout
> come from? Does it make sense to wait for a timeout if the un-mounting
> of a partition fails at shutdown?
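As for seeing what keeps /var busy: lsof(8) or fuser(1), as the umount
message itself suggests, are the usual tools. The same idea can be
sketched in a few lines of Python by walking /proc/<pid>/fd. This is
only an illustration, not something systemd provides; the function name
find_open_files and its proc_root parameter are made up for the
example. It only catches open file descriptors, not working directories
or memory mappings, and it needs root to see other users' processes.

```python
import os


def find_open_files(mount_point, proc_root="/proc"):
    """Return {pid: [paths]} for open file descriptors under mount_point.

    Hypothetical helper for illustration. It reads the fd symlinks that
    Linux exposes in /proc/<pid>/fd and keeps those whose target lies
    below the given mount point.
    """
    prefix = mount_point.rstrip("/") + "/"
    holders = {}
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "sys"
        fd_dir = os.path.join(proc_root, entry, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            continue  # process already exited, or permission denied
        for fd in fds:
            try:
                target = os.readlink(os.path.join(fd_dir, fd))
            except OSError:
                continue  # descriptor was closed in the meantime
            if target == mount_point or target.startswith(prefix):
                holders.setdefault(int(entry), []).append(target)
    return {pid: sorted(paths) for pid, paths in holders.items()}
```

Calling find_open_files("/var") as root shortly before shutdown should
list journald with its files below /var/log/journal, if the explanation
above applies to your system.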
The timeout is unrelated, it's probably an indication of lost cgroup
events. We shifted around a few things about that a while back, please
make sure to check the current git version before reporting back on
this one (release is going to be soon, hopefully).
Lennart Poettering, Red Hat