[systemd-devel] [systemd-commits] units/basic.target units/poweroff.target units/reboot.target

Zbigniew Jędrzejewski-Szmek zbyszek at in.waw.pl
Mon Nov 10 19:48:31 PST 2014


On Mon, Nov 10, 2014 at 10:53:46PM +0100, Lennart Poettering wrote:
> On Thu, 06.11.14 14:44, Zbigniew Jędrzejewski-Szmek (zbyszek at in.waw.pl) wrote:
> 
> > On Thu, Nov 06, 2014 at 02:28:12PM +0100, Lennart Poettering wrote:
> > > On Thu, 06.11.14 12:45, Patrick Häcker (pat_h at web.de) wrote:
> > > 
> > > > > > However, this one appears bogus to me. Is there any such software
> > > > > > around that really does this? And if so, this really appears weird to
> > > > > > me to support. Delaying shutdown for more than 30min is just wrong.
> > > > > Isn't this what the various "download updates and reboot" gnome-y
> > > > > things are doing?
> > > > At least unattended-upgrades from Debian/Ubuntu/... can be configured to 
> > > > install updates on shutdown (without any special mode or something). And, 
> > > > yes, this can run for more than 30 minutes, which I could already observe in 
> > > > its default mode (installing during normal system activities), so I see no 
> > > > reason why this should not happen when configured to install during shutdown. 
> > > > The reason is that unattended-upgrades can basically update the whole 
> > > > distribution to the next version, which naturally can take a lot of time.
> > > > 
> > > > It's questionable if this is a sane setup, but I can think of setups where 
> > > > this might be useful, e.g. having two identically configured servers for 
> > > > redundancy reasons where one server would be enough. Then it might make sense 
> > > > to update one system during shutdown while the other one takes over. This has 
> > > > the advantage that normally running servers either have the old or the new 
> > > > state, but never some intermediate state during the update. The shutdown time 
> > > > does not really matter in this case and a watchdog killing the system 
> > > > wouldn't be welcome. But all in all this seems like an exotic use
> > > > case.
> > > 
> > > Is "unattended-upgrades" a package of its own? If so, I'd probably ask
> > > the packagers to include drop-ins for reboot.target to override the
> > > timeout. That way, as soon as you install it the shutdown timeouts are
> > > disabled.
> >
> > That is suboptimal. There really should be a way to do this dynamically, like saying:
> > I'm a long-running job, I need more time, but everything is still fine. This
> > type of status should require periodic pings, watchdog style. Let's say that
> > the backup job run during shutdown hangs because there's no network; we want
> > to shut down at some point anyway.
> 
> So, we always had per-unit timeouts in place, and they are opt-out
> (with the exception of Type=oneshot services where they are
> opt-in).  Hence adding a second level of opt-out timeouts doesn't
> sound particularly attractive to me.
Agreed.
 
> The reason I added the system-wide startup/shutdown timeouts was
> really to be a safety net, so that the individual per-unit timeouts
> and the opted-out exceptions don't add up beyond bounds.
I guess that this is part of the issue: it is hard to define what
"without bounds" means. A fsck, selinux relabel, package
installation and probably many other things are effectively unbounded.
And they might happen together at the same boot. So any kind of
fixed limit is unlikely to work in the general case.

[snip Yoga case]
Sure, it solves this specific problem, but it causes significant
problems in other configurations. It seems that we're trying to solve
the problem in the wrong place. Even with the current JobTimeout
configured for basic.target there's a big window of opportunity for
the system to hang before systemd-logind.service is
started. systemd-logind.service has After=nss-user-lookup.target, and
I can imagine things going wrong there, especially with custom
configurations. It would be nice if the guard we put in place would
cover this too.
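
For readers following along, here is a minimal sketch of how the JobTimeout on a target can be overridden locally with a drop-in, which is also the mechanism Lennart suggested packagers could use for reboot.target. The drop-in file name is hypothetical, and the value that disables the timeout depends on the systemd version, so treat this as an illustration and check systemd.unit(5) before relying on it:

```ini
# /etc/systemd/system/basic.target.d/50-job-timeout.conf
# (hypothetical file name; any *.conf file in the basic.target.d/ directory is read)
[Unit]
# Assumption: current systemd accepts "infinity" here to disable the job timeout.
# Consult systemd.unit(5) for the value your version expects.
JobTimeoutSec=infinity
```

After adding such a file, `systemctl daemon-reload` makes systemd pick it up. Note that this only lifts the limit statically; it does not provide the dynamic "I need more time" signal discussed earlier in the thread.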

> Now, the question is what we can do now about this:
> 
> a) we could move logind into early boot. This has multiple problems
>    though: it would need to track system state as gettys on other ttys
>    should only be started in multi-user mode, not in early boot. Also,
>    the behaviour would probably not be ideal: I think it would be
>    preferable if the system shuts down rather than suspend if we hang
>    during boot.
> 
> b) specifically do something about LUKS prompt timeouts: when a very
>    long timeout is hit for essential devices we could simply turn off
>    the machine again. This would fix my immediate problem, but I am
>    not sure I like it too much, I think other hangs should really be
>    covered too...
> 
> c) we can come up with a scheme that explicitly excludes fsck, selinux
>    relabel and so on from the overall-timeout. Sounds messy and
>    non-obvious given that they all have individual timeouts
>    anyway... Two layers of opting out of timeouts sounds suspicious?
No good ideas so far. But whatever we do, I think we should treat
portable and non-portable devices differently. The trade-offs are
simply different.  Otherwise, we could simply make this opt-in. After
all, designing the power button so that it can be pushed
accidentally is a special feat of design that does not happen too
often. (*)

Zbyszek

(*) I remember one server machine with the power button in the wrong
place which ended with the owner taking a screwdriver and ripping the
damn thing out.

