[systemd-devel] offline updates

Zbigniew Jędrzejewski-Szmek zbyszek at in.waw.pl
Mon Jul 27 12:37:33 PDT 2015


On Mon, Jul 27, 2015 at 08:48:08PM +0200, Lennart Poettering wrote:
> On Tue, 21.07.15 03:27, Zbigniew Jędrzejewski-Szmek (zbyszek at in.waw.pl) wrote:
> 
> > [resending with the right systemd-devel address, sorry for that]
> > 
> > Here are some thoughts on offline updates resulting from testing
> > the new dnf fedup plugin developed by Will Woods
> > [https://github.com/wgwoods/dnf-plugin-fedup].
> > 
> > I ran an update using dnf fedup and it works (or would have worked, if
> > stuff didn't happen), which is already great for something so simple,
> > but it exposes some shortcomings in the Offline Update spec itself
> > [http://www.freedesktop.org/wiki/Software/systemd/SystemUpdates/].
> > 
> > The main issues are:
> > - what happens when multiple offline mechanisms are present
> > - how is failure handled
> > 
> > On my test system, I had packagekit-offline-update.service already
> > present when I installed the plugin and fedup-system-upgrade.service.
> > After running 'dnf fedup download ...' and 'dnf fedup reboot'
> > I saw something like this:
> 
> It appears to be quite wrong to have a distro that is updated in two
> ways. Pick one. Or if you really want two alternative
> implementations of such a thing (which I find crazy), then make them
> handle the fall-out, and ensure that one kicks out the other.
> 
> In general I would say: it would be a good idea if the upgrade tools
> would:
> 
> a) when enabling /system-update, check if it exists first. If so, print
>    a warning of "upgrade is already scheduled, refusing", or so...
> 
> b) after the reboot, when initializing, make a quick check where
>    /system-update points. Become active only if it points where you
>    placed it. If it points anywhere else, assume somebody else changed
>    it, and log about this, and exit cleanly, so that no error is
>    triggered.
> 
> Both these rules appear to be generally recommended for robustness
> reasons. We should probably add this to the wiki.
OK, this matches fairly exactly what I wrote in "version 2" later in the thread:
http://lists.freedesktop.org/archives/systemd-devel/2015-July/033623.html
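
For illustration, rules a) and b) above could be sketched roughly as
follows. This is a minimal sketch in Python, not code from fedup or
PackageKit: the function names and the idea of passing the symlink path
as a parameter are assumptions made for the example. Only the
/system-update path and the warning text come from the discussion.

```python
import os

SYSTEM_UPDATE = "/system-update"  # trigger symlink named in the spec


def schedule_update(target, link=SYSTEM_UPDATE):
    """Rule a): refuse to schedule if an upgrade is already pending.

    Hypothetical helper. os.symlink() fails with EEXIST if the link
    already exists, so check and creation happen in a single step.
    """
    try:
        os.symlink(target, link)
    except FileExistsError:
        print("upgrade is already scheduled, refusing")
        return False
    return True


def should_run(expected_target, link=SYSTEM_UPDATE):
    """Rule b): become active only if the link points where we put it.

    Hypothetical helper. Returning False means "log and exit cleanly",
    so that no failure is triggered for the unit.
    """
    try:
        actual = os.readlink(link)
    except OSError:
        # Link is missing or is not a symlink: nothing scheduled by us.
        return False
    if actual != expected_target:
        print(f"{link} points to {actual}, not ours; exiting cleanly")
        return False
    return True
```

A tool would call schedule_update() when the user requests the upgrade
and should_run() first thing after the reboot, before doing any work.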

> > Also, which is a minor thing, but related: OnFailure=reboot.target
> > seems inferior to FailureAction=reboot. IIRC, the second one uses
> > irreversible transaction and should be more robust. It also is a
> > higher level setting in some sense.  OnFailure=reboot.target is taken
> > directly from the spec, so should be changed there first.
> 
> I think I agree.
> 
> > Also, another related issue: packagekit-offline-update.service has
> > Type=simple. (In the log above it is "started" almost immediately, so
> > system-update.target could be reached while it is still running.) This
> > should be Type=oneshot.
> 
> Probably, yes.
> 
> > It seems that failure handling is already shaky, but I think there are more
> > failure modes. Let's say that 'dnf fedup upgrade' didn't work for some
> > reason (missing ConditionPathExists file, dnf installation problem, whatever).
> > Then nothing would remove the /system-update link, and we would reboot,
> > and run system-update.target again, and reboot, and run
> > system-update.target.
> 
> I figure that's a general problem: we need some scheme how we can
> count unsuccessful boots, with some form of roll-back if some limit is
> reached. But I think this is material for another discussion and needs
> support in the boot loader (there has been work to add this to
> sd-boot/gummiboot).
Ack.

> > In general, creating /system-update without a working update service
> > is enough to enter an infinite reboot loop.
> 
> Well, it's how UNIX works...
> 
> That said, if fedup wants to avoid the risk of this it might choose to
> remove the symlink before starting its actual work...
It removes the symlink right now when it is launched.
Yeah, that should be good enough.
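
As a rough sketch of that ordering (remove the trigger first, then do
the work), assuming a hypothetical run_offline_update() wrapper and a
do_upgrade callable that are not part of any real tool:

```python
import os


def run_offline_update(do_upgrade, link="/system-update"):
    """Remove the trigger symlink, then run the actual upgrade.

    Hypothetical wrapper. Because the link is gone before any work
    starts, a failing or crashing upgrade cannot send the machine back
    into system-update.target on the next boot.
    """
    try:
        os.unlink(link)
    except FileNotFoundError:
        pass  # already removed; nothing to clean up
    return do_upgrade()
```

The trade-off is that a failed upgrade is not retried automatically;
the user has to schedule it again.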

> 
> > To summarize, following changes to the spec are proposed:
> > - use Condition* or similar to conditionalize whether a specific
> >   upgrade mechanism should run
> 
> I'd really recommend actually comparing the symlink target and doing
> that in the C code of the upgrade tool.
> 
> > - use FailureAction=reboot
> > - use Type=oneshot
> 
> Both sound right.
> 
> > - check that logind.Reboot() is not called on failure by the service
> 
> I figure, too.
> 
> > - services should not look for /system-update symlink,
> >   and the symlink should be removed by tmpfiles before we even get to
> >   the upgrade.
> 
> I disagree, see above.

OK, I think we're pretty much in agreement. I'd like to take the opportunity
to convert the wiki page to a man page. It would be easier to discuss
and track changes then. OK?

Zbyszek

