[RFC] initoverlayfs - a scalable initial filesystem

Eric Curtin ecurtin at redhat.com
Mon Dec 11 12:48:10 UTC 2023


On Mon, 11 Dec 2023 at 11:51, Lennart Poettering <lennart at poettering.net> wrote:
>
> On Mo, 11.12.23 11:28, Eric Curtin (ecurtin at redhat.com) wrote:
>
> > > > For the items listed above I think you can find different solutions
> > > > which do not necessarily compromise security as much.
> > > >
> > > > So, in the list above you could address the latter three like this:
> > > >
> > > > 2. Use an erofs rather than a packed cpio as initrd. Make the boot
> > > >    loader load the erofs into contiguous memory, then use memmap=X!Y on
> > > >    the kernel cmdline to synthesize a block device from that, which
> > > >    you then mount directly (without any initrd) via
> > > >    root=/dev/pmem0. This means your boot loader will still load the
> > > >    whole image into memory, but only decompress the bits actually
> > > >    needed. (It also has some other nice benefits I like, such as an
> > > >    immutable rootfs, which tmpfs-based initrds don't have.)
> >
> > What I am unsure about here is the "make the bootloader load the
> > erofs into contiguous memory" part. I wonder whether we could use the
> > existing initramfs data as is.
>
> Today's initrds are packed cpio archives of an OS file system
> hierarchy. What I proposed means you'd have to put the OS file system
> hierarchy into an erofs image instead. Which is a trivial operation,
> just unpack and repack.
>
> Note that there are two concepts of "initrd" out there.
>
> a) from the kernel perspective an initrd/initramfs (both of which are
>    badly named, because it's a tmpfs these days) is that packed cpio
>    archive that is unpacked into a tmpfs, and then jumped into.
>
> b) from systemd's perspective an initrd is an OS image that carries an
>    /etc/initrd-release file. If that file exists then systemd will not
>    boot up the system regularly, but instead just prepare everything
>    that it can transition into some other root fs.
>
> In real life, initrds currently most often qualify under both
> definitions, but there's no reason to always do this. You can also
> have images the kernel would consider an initrd, but systemd does not,
> which is something we use in the "USI" concept, i.e. "unified system
> images", which are basically UKIs (large UKIs) with a complete rootfs
> that is the main system of the OS. And you can also do it the other
> way round, which is potentially what I am suggesting to you here: use
> an erofs image that would not be considered an initrd by the kernel,
> but that systemd would consider one, and transition out of.
>
> > I dunno if
> > bootloaders make many assumptions about the format of that data; worst
> > case scenario, we could encapsulate erofs in initramfs, cpio-looking
> > data.
>
> Boot loaders generally don't bother with the cpio, it's just "data"
> for them. Compression algorithms have changed in the past, and it only
> mattered that the kernel could decompress it; the boot loader doesn't care.
>
> > Alternatively, teach the kernel not to decompress and process the
> > whole thing, and to mount it like an erofs instead. Does this sound
> > crazy or reasonable?
>
> You are re-inventing the traditional "initrd" logic of the kernel
> which was a ramdisk (i.e. a block device /dev/ram0), that was filled
> with some fs of your choice loaded by the boot loader.

Sort of, yes, but preferably using that __initramfs_start /
initrd_start buffer as is, without copying any bytes anywhere else and
without teaching the bootloaders to do things.
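
To illustrate what "using the buffer as is" would need to decide, here
is a rough userspace sketch in Python (not kernel code) that classifies
such a buffer by its magic bytes. The function name sniff_initrd_payload
is made up for this example. The constants are the usual ones as far as
I know: "070701"/"070702" for newc cpio, and a little-endian 0xE0F5E1E2
at byte offset 1024 for the erofs superblock. A compressed initrd would
show up as "unknown" here.

```python
import struct

# newc cpio archives start with an ASCII magic ("070702" is the CRC variant).
CPIO_NEWC_MAGICS = (b"070701", b"070702")
# The erofs superblock sits 1024 bytes into the image, magic stored as le32.
EROFS_SUPER_OFFSET = 1024
EROFS_SUPER_MAGIC = 0xE0F5E1E2


def sniff_initrd_payload(buf: bytes) -> str:
    """Hypothetical helper: return "cpio", "erofs" or "unknown" for a raw buffer."""
    if buf[:6] in CPIO_NEWC_MAGICS:
        return "cpio"
    if len(buf) >= EROFS_SUPER_OFFSET + 4:
        (magic,) = struct.unpack_from("<I", buf, EROFS_SUPER_OFFSET)
        if magic == EROFS_SUPER_MAGIC:
            return "erofs"
    return "unknown"
```

The point is only that the two formats are cheap to tell apart up
front, so the bootloader could keep treating the blob as plain "data".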

The "memmap=" approach you suggested sounds like what we are thinking,
but do you think we could do this without teaching bootloaders to do
new things?
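
For reference, a small sketch of what the resulting kernel command line
would look like, assuming X in "memmap=X!Y" is the size of the region
and Y its start address. The helper name pmem_cmdline and the size and
address in the usage line are made up for illustration; real values
would have to come from wherever the image actually ends up in memory.

```python
def pmem_cmdline(size: int, start: int, device: str = "/dev/pmem0") -> str:
    """Hypothetical helper: build the memmap=/root= fragment for an in-memory image.

    size and start are in bytes; both are written in hex with a 0x prefix.
    """
    return f"memmap={size:#x}!{start:#x} root={device}"


# Illustrative values only: a 64 MiB image placed at the 4 GiB mark.
print(pmem_cmdline(64 * 1024 * 1024, 4 * 1024**3))
```

The open question remains who picks and reserves that region, which is
exactly the part that today would need bootloader cooperation.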

That said, the nice thing about a storage-init-like approach is that
there are basically zero copies up front. storage-init is trying to be
a tool that just calls the systemd storage pieces, without also
inheriting the whole systemd stack.

>
> Lennart
>
> --
> Lennart Poettering, Berlin
>


