[systemd-devel] offline updates
Zbigniew Jędrzejewski-Szmek
zbyszek at in.waw.pl
Mon Jul 27 12:37:33 PDT 2015
On Mon, Jul 27, 2015 at 08:48:08PM +0200, Lennart Poettering wrote:
> On Tue, 21.07.15 03:27, Zbigniew Jędrzejewski-Szmek (zbyszek at in.waw.pl) wrote:
>
> > [resending with the right systemd-devel address, sorry for that]
> >
> > Here are some thoughts on offline updates resulting from testing
> > the new dnf fedup plugin developed by Will Woods
> > [https://github.com/wgwoods/dnf-plugin-fedup].
> >
> > I ran an update using dnf fedup and it works (or would have worked, if
> > stuff didn't happen), which is already great for something so simple,
> > but it exposes some shortcomings in the Offline Update spec itself
> > [http://www.freedesktop.org/wiki/Software/systemd/SystemUpdates/].
> >
> > The main issues are:
> > - what happens when multiple offline mechanisms are present
> > - how is failure handled
> >
> > On my test system, I had packagekit-offline-update.service already
> > present when I installed the plugin and fedup-system-upgrade.service.
> > After running 'dnf fedup download ...' and 'dnf fedup reboot'
> > I saw something like this:
>
> It appears to be quite wrong to have a distro that is updated in two
> ways. Pick one. Or if you really want two alternative
> implementations of such a thing (which I find crazy), then make them
> handle the fall-out, and ensure that one kicks out the other.
>
> In general I would say: it would be a good idea if the upgrade tools
> would:
>
> a) when enabling /system-update check if it exists first. If so, print
> a warning of "upgrade is already scheduled, refusing", or so...
>
> b) after the reboot, when initializing, make a quick check where
> /system-update points. Become active only if it points where you
> placed it. If it points anywhere else, assume somebody else changed
> it, and log about this, and exit cleanly, so that no error is
> triggered.
>
> Both these rules appear to be generally recommended for robustness
> reasons. We should probably add this to the wiki.
OK, this matches fairly exactly what I wrote in "version 2" later in the thread:
http://lists.freedesktop.org/archives/systemd-devel/2015-July/033623.html
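
For illustration, rules a) and b) could look roughly like this in a
Python-based tool. This is only a sketch: the function names, the `link`
parameter and the idea of passing the tool's data directory as `data_dir`
are made up for the example and are not taken from dnf-plugin-fedup or
PackageKit.

```python
import os

SYSTEM_UPDATE_LINK = "/system-update"


def schedule_update(data_dir, link=SYSTEM_UPDATE_LINK):
    """Rule a): refuse if somebody has already scheduled an update."""
    try:
        # symlink() fails atomically if the link already exists,
        # so there is no window between "check" and "create".
        os.symlink(data_dir, link)
    except FileExistsError:
        print("upgrade is already scheduled, refusing")
        return False
    return True


def should_run(data_dir, link=SYSTEM_UPDATE_LINK):
    """Rule b): become active only if the link points where we placed it."""
    try:
        target = os.readlink(link)
    except OSError:
        # Missing, or not a symlink at all: nothing for us to do.
        return False
    if os.path.normpath(target) != os.path.normpath(data_dir):
        # Somebody else changed it: log and let the caller exit cleanly.
        print("%s points to %s, not ours, skipping" % (link, target))
        return False
    return True
```

A tool following this would call schedule_update() when the user asks for
the upgrade, and after the reboot exit with success without touching
anything when should_run() returns False, so that no error is triggered.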
> > Also, which is a minor thing, but related: OnFailure=reboot.target
> > seems inferior to FailureAction=reboot. IIRC, the second one uses
> > irreversible transaction and should be more robust. It also is a
> > higher level setting in some sense. OnFailure=reboot.target is taken
> > directly from the spec, so should be changed there first.
>
> I think I agree.
>
> > Also, another related issue: packagekit-offline-update.service has
> > Type=simple. (In the log above it is "started" almost immediately, so
> > system-update.target could be reached while it is still running.) This
> > should be Type=oneshot.
>
> Probably, yes.
>
> > It seems that failure handling is already shaky, but I think there are more
> > failure modes. Let's say that 'dnf fedup upgrade' didn't work for some
> > reason (missing ConditionPathExists file, dnf installation problem, whatever).
> > Then nothing would remove the /system-update link, and we would reboot,
> > and run system-update.target again, and reboot, and run
> > system-update.target.
>
> I figure that's a general problem: we need some scheme for how we can
> count unsuccessful boots, with some form of roll-back if some limit is
> reached. But I think this is material for another discussion and needs
> support in the boot loader (there has been work to add this to
> sd-boot/gummiboot).
Ack.
> > In general, creating /system-update without a working update service
> > is enough to enter an infinite reboot loop.
>
> Well, it's how UNIX works...
>
> That said, if fedup wants to avoid the risk of this it might choose to
> remove the symlink before starting its actual work...
It removes the symlink right now when it is launched.
Yeah, that should be good enough.
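
To make the ordering concrete, here is a hedged sketch of "remove the
symlink first, then do the work", combined with the target check
recommended above. The name claim_update() and its parameters are
hypothetical; this is not the actual fedup code.

```python
import os


def claim_update(data_dir, link="/system-update"):
    """Return True if the pending update is ours, removing the link first.

    Because the link is gone before any real work starts, a failure later
    on cannot send the machine back into system-update.target on the next
    boot.
    """
    try:
        target = os.readlink(link)
    except OSError:
        # No link (or not a symlink): no update is pending.
        return False
    if os.path.normpath(target) != os.path.normpath(data_dir):
        # Belongs to some other mechanism: leave it alone.
        return False
    os.unlink(link)
    return True
```

The service would then only start the actual upgrade when claim_update()
returns True.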
>
> > To summarize, following changes to the spec are proposed:
> > - use Condition* or similar to conditionalize whether a specific
> > upgrade mechanism should run
>
> I'd really recommend actually comparing the symlink target and doing
> that in the C code of the upgrade tool.
>
> > - use FailureAction=reboot
> > - use Type=oneshot
>
> Both sound right.
>
> > - check that logind.Reboot() is not called on failure by the service
>
> I figure, too.
>
> > - services should not look for /system-update symlink,
> > and the symlink should be removed by tmpfiles before we even get to
> > the upgrade.
>
> I disagree, see above.
OK, I think we're pretty much in agreement. I'd like to take the opportunity
to convert the wiki page to a man page. It would be easier to discuss
and track changes that way. Does that sound OK?
Zbyszek