[systemd-devel] the need for a discoverable sub-volumes specification

Ludwig Nussel ludwig.nussel at suse.de
Tue Dec 21 13:57:17 UTC 2021


Chris Murphy wrote:
> On Tue, Nov 9, 2021 at 8:48 AM Ludwig Nussel <ludwig.nussel at suse.de> wrote:
>> Lennart Poettering wrote:
>>> Or to say this explicitly: we could define the spec to say that if
>>> we encounter:
>>>
>>>    /@auto/root-x86-64:fedora_36.0+3-0
>>>
>>> on first boot attempt we'd rename it:
>>>
>>>    /@auto/root-x86-64:fedora_36.0+2-1
>>>
>>> and so on. Until boot succeeds in which case we'd rename it:
>>>
>>>    /@auto/root-x86-64:fedora_36.0
>>>
>>> i.e. we'd drop the counting suffix.
>>
>> Thanks for the explanation and pointer!
>>
>> Need to think aloud a bit :-)
>>
>> That method basically works for systems with a read-only root, i.e. where
>> the next OS to boot is in a separate snapshot, e.g. MicroOS.
>> A traditional system with a rw / on btrfs would stay on the same subvolume
>> though, i.e. the "root-x86-64:fedora_36.0" volume in the example. In
>> openSUSE, package installation automatically leads to ro snapshot
>> creation. In order to fit in, I suppose those could then be named e.g.
>> "root-x86-64:fedora_36.N+0" with increasing N. Due to the +0 the
>> subvolume would never be booted.
> 
> Yeah the N+0 subvolumes could be read-only snapshots, their purpose is
> only to be used as an immutable checkpoint from which to produce
> derivatives, read-write subvolumes. But what about the case where you
> are in a preboot environment, have no way (yet) to rename or create a
> new snapshot to boot, and need to boot one of these read-only
> snapshots? What if the bootloader was smart enough to add the proper
> volatile overlay arrangement anytime an N+0 subvolume is chosen for
> boot? Is that plausible and useful?

The initrd would have to make those arrangements. AFAICT, openSUSE
systems so far just boot into such a RO environment without any
preparation: fully read-only, with just enough working to run snapper
and create a usable snapshot again.
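
For reference, the counter suffix discussed in this thread can be
sketched roughly as below. This is only an illustration of the renaming
steps quoted above ("+3-0" becomes "+2-1", and the suffix is dropped on
success), not anything the spec defines. The function names are made up,
and treating "+0" as "never selected automatically" is the assumption
from the quoted text.

```python
import re
from typing import NamedTuple, Optional

# "<base>+<tries left>-<tries done>"; the "-<done>" part and the whole
# suffix are optional, as in the names used in this thread.
_NAME = re.compile(r"^(?P<base>.+?)(?:\+(?P<left>\d+)(?:-(?P<done>\d+))?)?$")


class Counted(NamedTuple):
    base: str
    left: Optional[int]  # None: no counter suffix at all
    done: int


def parse(name: str) -> Counted:
    m = _NAME.match(name)
    if m is None:
        raise ValueError(f"cannot parse {name!r}")
    left = m.group("left")
    done = m.group("done")
    return Counted(
        m.group("base"),
        None if left is None else int(left),
        int(done) if done is not None else 0,
    )


def name_for_attempt(name: str) -> str:
    """Name the subvolume would get when a boot attempt starts."""
    c = parse(name)
    if c.left is None or c.left == 0:
        return name  # no counter, or no tries left: nothing to rename
    return f"{c.base}+{c.left - 1}-{c.done + 1}"


def name_after_success(name: str) -> str:
    """Name after a successful boot: the counting suffix is dropped."""
    return parse(name).base


def auto_bootable(name: str) -> bool:
    """Assumption: a '+0' name is skipped by automatic selection."""
    c = parse(name)
    return c.left is None or c.left > 0
```

With these helpers, a snapshot named "root-x86-64:fedora_36.N+0" parses
with zero tries left, so automatic selection would skip it. Booting it
would need an explicit choice, or a rename.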

>> Anyway, let's assume the ro case and both efi partition and btrfs volume
>> use this scheme. That means each time some packages are updated we get a
>> new subvolume. After reboot the initrd in the efi partition would try to
>> boot that new subvolume. If it reaches systemd-bless-boot.service the
>> new subvolume becomes the default for the future.
>>
>> So far so good. What if I discover later that something went wrong
>> though? Some convenience tooling to mark the current version bad again
>> would be needed.
>>
>> But then having Tumbleweed in mind it needs some capability to boot any
>> old snapshot anyway. I guess the solution here would be to just always
>> generate a bootloader entry, independent of whether a kernel was
>> included in an update. Each entry would then have to specify kernel,
>> initrd and the root subvolume to use.
> 
> The part I'm having a hard time separating is the implicit case (use
> some logic to assemble the correct objects), versus explicit (the
> bootloader snippet points to a root and the root contains an fstab -
> nothing about assembly is assumed). And should both paradigms exist
> concurrently in an installed system, and how to deconflict?

Not sure there is a conflict. The discovery logic is well defined after
all. Also I assume normal operation wouldn't mix the two. Package
management or whatever installs updates would automatically do the right
thing suitable for the system at hand.
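
To make the explicit case concrete, here is a rough sketch of what the
update tooling could write for each snapshot. It assumes a Boot Loader
Specification style entry (the title/version/linux/initrd/options keys)
and the btrfs "rootflags=subvol=" kernel option. The function name and
the paths and UUID used to call it are purely hypothetical.

```python
def render_entry(title: str, version: str, linux: str, initrd: str,
                 root_uuid: str, subvol: str) -> str:
    """Render one boot entry that names its root subvolume explicitly."""
    # The root subvolume is pinned on the kernel command line, so
    # nothing has to be discovered or assembled at boot time.
    options = f"root=UUID={root_uuid} rootflags=subvol={subvol}"
    lines = [
        f"title {title}",
        f"version {version}",
        f"linux {linux}",
        f"initrd {initrd}",
        f"options {options}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # All values below are placeholders for illustration only.
    print(render_entry(
        title="Example OS (snapshot 42)",
        version="42",
        linux="/example/42/linux",
        initrd="/example/42/initrd",
        root_uuid="00000000-0000-0000-0000-000000000000",
        subvol="@auto/root-x86-64:fedora_36.0",
    ))
```

An entry like this carries everything needed to boot one particular
snapshot, which is what booting any old Tumbleweed snapshot would need.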

> Further, (open)SUSE tends to define the root to boot via `btrfs
> subvolume set-default` which is information in the file system itself,
> neither in the bootloader snippet nor in the naming convention. It's
> neat, but also not discoverable. If users are trying to

The way btrfs is used in openSUSE is based on systems from ten years
ago. A lot has changed since then. Now with the idea to have /usr on a
separate read-only subvolume the current model doesn't really work very
well anymore IMO. So I think there's a window of opportunity to change
the way openSUSE does things :-)

cu
Ludwig

-- 
 (o_   Ludwig Nussel
 //\
 V_/_  http://www.suse.com/
SUSE Software Solutions Germany GmbH, GF: Ivo Totev
HRB 36809 (AG Nürnberg)