[RFC] initoverlayfs - a scalable initial filesystem

Tue Dec 12 20:34:47 UTC 2023

Hi, I have only been following this thread passively so far, but I
wanted to chime in as well.

> (The main reason why sd-stub doesn't actually support erofs-initrds,
> is that sd-stub also generates initrd cpios on the fly, to pass
> credentials and system extension images to the kernel, and you can't
> really mix erofs and cpio initrds into one)

What prevents one from mixing the two (especially given that the
hypothetical erofs initrd support does not yet exist)?
Or are you talking about mixing this with your memmap+root=/dev/pmem suggestion?

> The try to optimize the initrd a bit by making it an erofs/memmap
> thing and so on. And make sure the initrd only contains stuff you
> always need, so that reading it all into memory is necessary anyway,
> and hence any approach that tries to run even the initrd off a disk
> image won't be necessary because you need to read everything anyway.

Ensuring that the initrd is as small as possible is definitely no easy
task.
Furthermore, unless one has total control over the devices, and even
if there are only a few hardware revisions, parts of the initrd might
not be used on a given device.
Even if all devices are identical, there are code paths which might
not be taken during usual operation. An example would be services
similar to the new systemd-bsod, which are only triggered in
emergencies.
Having these in the cpio means that they will always be read and
decompressed.
Using sysexts also has the drawback that each and every one of them
has to be decompressed. I might be mistaken, but I expect that this
will be the case even if the extension-release in the sysext results
in it being discarded, which is obviously another big drawback.

Regardless, even if every single file within the cpio archive (and
potential sysexts) is used, erofs still has a distinct advantage over
cpio!
With cpio everything has to be decompressed and read up front. With
erofs this is not the case.
Only the fs header has to be read at first as files are decompressed on demand.
This means that critical stuff can be started earlier as it does not
have to wait for decompression of stuff only needed later on.
For example, an initrd-only (i.e. not pivoting root), graphical system
could start all background services long before the UI starts and
accesses large asset files.
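
To make the difference concrete, here is a toy Python model. It is
only a sketch and not how either format is actually implemented: the
"stream" archive stands in for a compressed cpio, and the "chunked"
image stands in for erofs. Compressing each file independently is a
simplification of erofs' cluster-based compression. All file names and
sizes are made up for illustration.

```python
import zlib

# Hypothetical initrd contents (names and sizes are invented).
FILES = {
    "bin/init": b"I" * 4_000,
    "lib/storage.ko": b"S" * 20_000,
    "usr/share/ui-assets.bin": bytes(range(256)) * 400,
}


def build_stream_archive(files):
    """cpio-like model: concatenate every file, compress as one stream."""
    index = {}
    blob = b""
    for name, data in files.items():
        index[name] = (len(blob), len(data))  # offset and size in the blob
        blob += data
    return zlib.compress(blob), index


def read_from_stream(archive, index, name):
    """Return (file data, bytes inflated). The whole stream is inflated."""
    blob = zlib.decompress(archive)
    offset, size = index[name]
    return blob[offset:offset + size], len(blob)


def build_chunked_image(files):
    """erofs-like model: every file is compressed on its own."""
    return {name: zlib.compress(data) for name, data in files.items()}


def read_from_chunked(image, name):
    """Return (file data, bytes inflated). Only the requested file is inflated."""
    data = zlib.decompress(image[name])
    return data, len(data)


if __name__ == "__main__":
    archive, index = build_stream_archive(FILES)
    image = build_chunked_image(FILES)
    _, stream_cost = read_from_stream(archive, index, "bin/init")
    _, chunked_cost = read_from_chunked(image, "bin/init")
    print("bytes inflated to reach bin/init (stream): ", stream_cost)
    print("bytes inflated to reach bin/init (chunked):", chunked_cost)
```

In this model, reaching bin/init in the stream archive costs inflating
everything, including the large UI assets, while the chunked image
only pays for the file that is actually opened.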

I agree that splitting this up into another micro-initrd just for some
storage stuff etc. (which I still have not grokked completely) does
not seem to offer any advantages over what we have today. *However*, I
certainly think that standardizing and supporting some kind of
erofs-based initrd would bring some advantages.

On the other hand, this feels like going back to an old ramdisk again.
This goes beyond my knowledge, but based on the kernel docs, most
drawbacks of ramdisks would not apply to an approach with erofs. Also,
maybe the more flexible loopback devices could be used(?), which might
alleviate some problems. The drawbacks listed in the kernel docs, each
followed by my comment:

-- This block device was of fixed size, so the filesystem mounted on
it was of fixed size.
   -> Should not be of concern as it is readonly anyhow.
-- Using a ram disk also required unnecessarily copying memory from
the fake block device into the page cache (and copying changes back
out), as well as creating and destroying dentries.
   -> (?) This one I am actually not too sure about, as it exceeds my
knowledge of tmpfs, vfs (and its cache layers), erofs caching, and
loopback devices.
-- Plus it needed a filesystem driver (such as ext2) to format and
interpret this data.
   -> erofs is already included in most initrds (and is not too big if
it is not)

Regards, Nils