[systemd-devel] systemd update "forgets" ordering for shutdown

Frank Steiner fsteiner-mail1 at bio.ifi.lmu.de
Mon May 18 09:48:08 UTC 2020


Hi Andrei, hi Michael,

Andrei Borzenkov wrote:

> I cannot reproduce it using trivial service definition on openSUSE Leap
> 15.1 which should have the same systemd as SLE 15 SP1. 

indeed. Maybe I have some other strange dependency in my system that
causes this problem.

> So possibilities are
> 
> 1. Something in unit definition triggers some bug in systemd. In this
> case exact full unit definition is needed. Also shutdown log with debug
> log level will certainly be useful.

How should I provide the shutdown log? I could fetch it with SOL,
or would "journalctl -b -1" with persistent logging work?
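
In case persistent logging plus debug log level is the way to go, this
is roughly what I would set up. It is only a sketch: the drop-in file
names are made up, while the options themselves are the ones documented
in journald.conf(5) and systemd-system.conf(5). After the next reboot,
the log of the previous boot should then be available via
"journalctl -b -1".

```ini
# /etc/systemd/journald.conf.d/persistent.conf  (hypothetical file name)
[Journal]
# keep the journal in /var/log/journal so it survives the reboot
Storage=persistent

# /etc/systemd/system.conf.d/debug.conf  (hypothetical file name)
[Manager]
# let PID 1 log at debug level, including during shutdown
LogLevel=debug
```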

> 2. ExecStop command is not synchronous, it forks and continues in
> background. In this case probably systemd assumes unit is stopped and
> continues. Again, log with debug log level would confirm it.

I can exclude this because I replaced the stop command with "sleep 60",
so the full unit file is as below, and still there are other units
stopped in parallel with this instead of waiting for it.

However, it's not always the same. After just calling "systemctl
daemon-reexec" and rebooting, the unit is no longer stopped first
(delaying all others, as it would be without the reexec), but it is
still stopped before the local and remote fs. A reboot done after some
weeks of uptime, on the other hand, sometimes shuts down local and
remote fs before the unit stops. reboot2.txt shows the shutdown when
done after "systemctl daemon-reexec"; look for the Stopping/Stopped
lines of "halt.local" (the "Running addon script..." lines are the
scripts that are executed in the stop command of that unit).

reboot1.txt was the log that I recorded some days ago when the
server had been up for 4 weeks. Here the first nfs mount is
stopped even before the "Stopping halt.local" line, and you can
also see the unmounting of the local fs /mnt/raidproj failing
because there is still an LSB process running on this fs that the
halt.local script should have shut down.
For comparison, reboot3.txt shows a working shutdown with the correct
ordering. No idea why the Stopping/Stopped lines for halt.local
aren't visible here.

So there might be something else that disturbs the ordering even more
than the daemon-reexec does. Anyway, both cases are failures, because
the unit would normally delay the shutdown of every other unit, as
in reboot3.txt.

cu,
Frank


Unit file:

#  This file is part of systemd.
#
#  systemd is free software; you can redistribute it and/or modify it
#  under the terms of the GNU General Public License as published by
#  the Free Software Foundation; either version 2 of the License, or
#  (at your option) any later version.

[Unit]
Description=***** halt.local *****

# start after multi-user target => stop before multi-user-target is stopped
# => we are stopped as one of the first units.
After=multi-user.target

[Service]
# "oneshot" doesn't harm at boot, because there is no "Before=", so we don't delay anything,
# and our startup immediately finishes.
# But on reboot, the "After=multi-user.target" turns into a "Before=multi-user.target"
# and thus indeed stops the reboot until we are done.
Type=oneshot

# the /bin/true makes sure we get up without problems and
# because of the RemainAfterExit we are considered running and
# executed during boot.
# Otherwise, e.g. if we used boot.final.bio for ExecStart, a failure
# there would prevent the execution on shutdown!
ExecStart=/bin/true
ExecStop=/usr/bin/sleep 60
# allow 20 minutes because we hold the shutdown while
# autorpm is running to avoid corrupt installations
TimeoutSec=1200

# this is necessary to make sure we are stopped on reboot.
RemainAfterExit=yes
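
One thing that might be worth trying (an assumption on my part, not
something confirmed in this thread): instead of relying only on
"After=multi-user.target", order the unit explicitly after the file
systems it needs, since units are stopped in the reverse of their start
order. Below is a sketch as a drop-in. The unit name halt.local.service
and the drop-in file name are hypothetical, and /mnt/raidproj is the
local fs whose unmount fails in reboot1.txt.

```ini
# /etc/systemd/system/halt.local.service.d/ordering.conf  (hypothetical names)
[Unit]
# started after these targets => stopped before they are stopped
After=local-fs.target remote-fs.target
# adds Requires= and After= on the mount unit(s) needed for this path,
# so this unit should be stopped before /mnt/raidproj is unmounted
RequiresMountsFor=/mnt/raidproj
```

Note that RequiresMountsFor= also means the unit is not started if that
mount fails, which may or may not be wanted here.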

                                               

-- 
Dipl.-Inform. Frank Steiner   Web:  http://www.bio.ifi.lmu.de/~steiner/
Lehrstuhl f. Bioinformatik    Mail: http://www.bio.ifi.lmu.de/~steiner/m/
LMU, Amalienstr. 17           Phone: +49 89 2180-4049
80333 Muenchen, Germany       Fax:   +49 89 2180-99-4049
* You can only understand recursion once you have understood recursion. *

