[systemd-devel] the need for a discoverable sub-volumes specification

Topi Miettinen toiwoton at gmail.com
Tue Nov 9 17:48:43 UTC 2021


On 8.11.2021 17.32, Lennart Poettering wrote:
> Besides the GPT auto-discovery where versioning is implemented the way
> I mentioned, there's also the sd-boot boot loader which does roughly
> the same kind of OS versioning with the boot entries it discovers. So
> right now, you can already choose whether:
> 
> 1. you want to do OS versioning on the boot loader entry level: name
>     your EFI binary fooos-0.1.efi (or fooos-0.1.conf, as defined by the
>     boot loader spec) and similar and the boot loader automatically
>     picks it up, makes sense of it and boots the newest version
>     installed.
> 
> 2. you want to do OS versioning on the GPT partition table level: name
>     your partitions "fooos-0.1" and similar, with the right GPT type,
>     and tools such as systemd-nspawn, systemd-dissect, portable
>     services, RootImage= in service unit files all will be able to
>     automatically pick the newest version of the OS among the ones in
>     the image.
> 
> and now:
> 
> 3. If we implement what I propose above then you could do OS versioning
>     on the file system level too.
> 
> (Or you could do a combination of the above, if you want — which is
> highly desirable I think in case you want a universal image that can
> boot on bare metal and in nspawn in a nice versioned way.)
> 
> Now, in sd-boot's versioning logic we implement an automatic boot
> assessment logic on top of the OS versioning: if you add a "+x-y"
> string into the boot entry name we use it as x=tries-left and
> y=tries-done counters. i.e. fooos-0.1+3-0.efi is semantically the same
> as fooos-0.1.efi, except that there are 3 attempts left and 0 done
> yet. On each boot attempt the boot loader decreases x and increases
> y. i.e. fooos-0.1+3-0.efi → fooos-0.1+2-1.efi → fooos-0.1+1-2.efi →
> fooos-0.1+0-3.efi. If a boot succeeds the two counters are dropped
> from the filename, i.e. → fooos-0.1.efi.
> 
> For details see: https://systemd.io/AUTOMATIC_BOOT_ASSESSMENT.
> 
> Now, why am I mentioning all this? Right now this assessment counter
> logic is only implemented for the OS versioning as implemented by
> sd-boot. But I think it would make a ton of sense to implement the
> same scheme for the GPT partition table OS versioning, and then also
> for the fs-level OS versioning as proposed in this thread.
> 
> Or to say this explicitly: we could define the spec to say that if
> we encounter:
> 
>     /@auto/root-x86-64:fedora_36.0+3-0
> 
> on first boot attempt we'd rename it:
> 
>     /@auto/root-x86-64:fedora_36.0+2-1
> 
> and so on. Until boot succeeds in which case we'd rename it:
> 
>     /@auto/root-x86-64:fedora_36.0
> 
> i.e. we'd drop the counting suffix.
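
To make the renaming rule quoted above concrete, here is a minimal Python sketch of the counter handling for names without a file extension (the sub-volume case). The function names next_attempt() and mark_good() are made up for illustration, and the handling of a missing "-y" part and of an exhausted counter are my assumptions, not something the proposal or sd-boot defines in this thread.

```python
import re

# Optional "+LEFT" or "+LEFT-DONE" suffix at the very end of the name.
_COUNTER = re.compile(r"^(?P<base>.*?)\+(?P<left>\d+)(?:-(?P<done>\d+))?$")


def next_attempt(name):
    """Return the name to use for the next boot attempt.

    Decrements tries-left and increments tries-done. Names without a
    counter suffix, or with no tries left, are returned unchanged
    (assumption: the caller decides what to do with exhausted entries).
    """
    m = _COUNTER.match(name)
    if m is None:
        return name
    left = int(m["left"])
    done = int(m["done"] or 0)
    if left == 0:
        return name
    return f"{m['base']}+{left - 1}-{done + 1}"


def mark_good(name):
    """Drop the counter suffix once a boot has succeeded."""
    m = _COUNTER.match(name)
    return m["base"] if m else name


name = "root-x86-64:fedora_36.0+3-0"
name = next_attempt(name)   # first boot attempt
print(name)
print(mark_good(name))      # after a successful boot
```

The same two helpers would apply to any of the three levels (boot entry, partition label, sub-volume) as long as the suffix sits at the end of the name; boot entry file names carry the suffix before ".efi", which this sketch does not handle.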

Could this automatic versioning scheme also be extended to service
RootImages and RootDirectories? That would allow A/B testing of
RootImages, with automatic fallback to the last known good version.

In my setup, all services use either a RootImage= or RootDirectory= (for 
early boot services). Most of them don't care about kernel version, so 
the services use a shared drop-in (LVM logical volume 'levy'):

[Service]
RootImage=/dev/levy/%p-all.squashfs

The device path will then be for example 
/dev/levy/systemd-networkd-all.squashfs.

For udev and systemd-modules, the path includes the kernel version 
(/usr/local/lib/rootimages/systemd-udevd-5.14.0-2-amd64.dir), so these 
services use this drop-in:

[Service]
RootDirectory=/usr/local/lib/rootimages/%p-%v.dir
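
For readers less familiar with the specifiers, here is a rough Python sketch of how the two drop-ins above end up as concrete paths. It only models %p (the unit name prefix), %v (the kernel release, as in "uname -r") and %%; expand() and unit_prefix() are hypothetical helpers for illustration, not systemd code.

```python
import re


def unit_prefix(unit_name):
    """Unit name without its type suffix and without any "@instance" part."""
    stem = unit_name.rsplit(".", 1)[0]   # drop ".service" and the like
    return stem.split("@", 1)[0]


def expand(template, unit_name, kernel_release):
    """Expand %p, %v and %% in a path template; leave anything else alone."""
    values = {"p": unit_prefix(unit_name), "v": kernel_release, "%": "%"}
    return re.sub(r"%(.)",
                  lambda m: values.get(m.group(1), m.group(0)),
                  template)


print(expand("/dev/levy/%p-all.squashfs",
             "systemd-networkd.service", "5.14.0-2-amd64"))
print(expand("/usr/local/lib/rootimages/%p-%v.dir",
             "systemd-udevd.service", "5.14.0-2-amd64"))
```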

Instead of (or in addition to) /@auto/ paths inside the RootImage= / 
RootDirectory=, the version could be made available as a placeholder in 
part of the device or directory path name, for example:

[Service]
RootImage=/dev/levy/%p-all-@auto.squashfs

or

[Service]
RootImage=/usr/local/lib/rootimages/%p-%v-@auto.squashfs

Maybe %a instead of @auto.

This would then match 
/dev/levy/systemd-networkd-all-2021-11.09.0.squashfs as the highest 
version, but if that refuses to start, PID1 would try to start 
/dev/levy/systemd-networkd-all-2021-11.08.2.squashfs instead.
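
A minimal Python sketch of the matching and fallback I have in mind, under stated assumptions: the placeholder is the literal string "@auto", versions are ordered by comparing their numeric components (real systemd version comparison is more involved), and try_start() is a hypothetical callback standing in for PID1 attempting to start the service with a given image.

```python
import re


def version_key(version):
    """Order versions by their numeric components (simplifying assumption)."""
    return tuple(int(n) for n in re.findall(r"\d+", version))


def candidates(pattern, names, placeholder="@auto"):
    """Return the names matching the pattern, newest version first."""
    prefix, suffix = pattern.split(placeholder, 1)
    found = []
    for name in names:
        if (name.startswith(prefix) and name.endswith(suffix)
                and len(name) > len(prefix) + len(suffix)):
            version = name[len(prefix):len(name) - len(suffix)]
            found.append((version_key(version), name))
    found.sort(reverse=True)
    return [name for _, name in found]


def start_with_fallback(ordered, try_start):
    """Try each image in order; return the first that starts, else None."""
    for name in ordered:
        if try_start(name):
            return name
    return None


images = [
    "/dev/levy/systemd-networkd-all-2021-11.08.2.squashfs",
    "/dev/levy/systemd-networkd-all-2021-11.09.0.squashfs",
    "/dev/levy/systemd-resolved-all-2021-11.10.0.squashfs",
]
ordered = candidates("/dev/levy/systemd-networkd-all-@auto.squashfs", images)
# Pretend the newest image refuses to start.
print(start_with_fallback(ordered, lambda name: "11.09.0" not in name))
```

Combined with the tries-left/tries-done counters from the quoted mail, the "refuses to start" decision could be tracked in the image name itself instead of in a callback.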

-Topi

