[systemd-devel] offline updates

Lennart Poettering lennart at poettering.net
Mon Jul 27 11:48:08 PDT 2015


On Tue, 21.07.15 03:27, Zbigniew Jędrzejewski-Szmek (zbyszek at in.waw.pl) wrote:

> [resending with the right systemd-devel address, sorry for that]
> 
> Here are some thoughts on offline updates resulting from testing
> the new dnf fedup plugin developed by Will Woods
> [https://github.com/wgwoods/dnf-plugin-fedup].
> 
> I ran an update using dnf fedup and it works (or would have worked, if
> stuff didn't happen), which is already great for something so simple,
> but it exposes some shortcomings in the Offline Update spec itself
> [http://www.freedesktop.org/wiki/Software/systemd/SystemUpdates/].
> 
> The main issues are:
> - what happens when multiple offline mechanisms are present
> - how is failure handled
> 
> On my test system, I had packagekit-offline-update.service already
> present when I installed the plugin and fedup-system-upgrade.service.
> After running 'dnf fedup download ...' and 'dnf fedup reboot'
> I saw something like this:

It appears to be quite wrong to have a distro that is updated in two
ways. Pick one. Or if you really want two alternative
implementations of such a thing (which I find crazy), then make them
handle the fall-out, and ensure that one kicks out the other.

In general I would say: it would be a good idea if the upgrade tools
would:

a) when enabling /system-update check if it exists first. If so, print
   a warning of "upgrade is already scheduled, refusing", or so...

b) after the reboot, when initializing, make a quick check where
   /system-update points. Become active only if it points where you
   placed it. If it points anywhere else, assume somebody else changed
   it, and log about this, and exit cleanly, so that no error is
   triggered.

Both these rules appear to be generally recommended for robustness
reasons. We should probably add this to the wiki.
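
Rules a) and b) could be sketched roughly as follows. This is only an
illustration in Python (a real tool would more likely do this in C), and
the function names and the link/target parameters are made up here: a
real tool would pass /system-update and its own staging directory.

```python
import os

def schedule_update(link, target):
    """Rule a): refuse if something already exists at `link`.

    os.symlink() fails atomically when the path exists, so there is no
    window between "check" and "create".
    """
    try:
        os.symlink(target, link)
    except FileExistsError:
        print("upgrade is already scheduled, refusing")
        return False
    return True

def should_run(link, target):
    """Rule b): become active only if `link` points where we placed it."""
    try:
        current = os.readlink(link)
    except OSError:
        # Missing, or not a symlink at all: nothing for us to do.
        return False
    if current != target:
        # Somebody else changed it: log and exit cleanly, no error.
        print(f"{link} points to {current}, not to us, exiting cleanly")
        return False
    return True
```

The caller would treat a False result from should_run() as a clean
exit, so that no failure handling is triggered for the unit.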

> Also, which is a minor thing, but related: OnFailure=reboot.target
> seems inferior to FailureAction=reboot. IIRC, the second one uses
> irreversible transaction and should be more robust. It also is a
> higher level setting in some sense.  OnFailure=reboot.target is taken
> directly from the spec, so should be changed there first.

I think I agree.

> Also, another related issue: packagekit-offline-update.service has
> Type=simple. (In the log above it is "started" almost immediately, so
> system-update.target could be reached while it is still running.) This
> should be Type=oneshot.

Probably, yes.

> It seems that failure handling is already shaky, but I think there are more
> failure modes. Let's say that 'dnf fedup upgrade' didn't work for some
> reason (missing ConditionPathExists file, dnf installation problem, whatever).
> Then nothing would remove the /system-update link, and we would reboot,
> and run system-update.target again, and reboot, and run
> system-update.target.

I figure that's a general problem: we need some scheme for counting
unsuccessful boots, with some form of roll-back if some limit is
reached. But I think this is material for another discussion and needs
support in the boot loader (there has been work to add this to
sd-boot/gummiboot).

> In general, creating /system-update without a working update service
> is enough to enter an infinite reboot loop.

Well, it's how UNIX works...

That said, if fedup wants to avoid the risk of this it might choose to
remove the symlink before starting its actual work...
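
A minimal sketch of that idea, again in Python purely for illustration.
The name claim_update() and its parameters are hypothetical: the link
would be /system-update, and the expected target whatever the tool
created when it scheduled the upgrade.

```python
import os

def claim_update(link, expected_target):
    """Take ownership of a scheduled update before doing any work.

    The symlink is removed first, so that a crash or failure during the
    actual upgrade cannot send the machine back into the update target
    on the next boot.
    """
    try:
        if os.readlink(link) != expected_target:
            # Scheduled by some other tool: leave it alone.
            return False
    except OSError:
        # No symlink present: nothing was scheduled.
        return False
    os.unlink(link)
    return True
```

The trade-off is that a failed upgrade is not retried automatically on
the next boot; the tool has to report the failure some other way.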

> To summarize, following changes to the spec are proposed:
> - use Condition* or similar to conditionalize whether a specific
>   upgrade mechanism should run

I'd really recommend actually comparing the symlink target and doing
that in the C code of the upgrade tool.

> - use FailureAction=reboot
> - use Type=oneshot

Both sound right.

> - check that logind.Reboot() is not called on failure by the service

I figure so, too.

> - services should not look for /system-update symlink,
>   and the symlink should be removed by tmpfiles before we even get to
>   the upgrade.

I disagree, see above.

Lennart

-- 
Lennart Poettering, Red Hat

