[systemd-devel] [RFC] the chopping block

Christian Seiler christian at iwakd.de
Sat Feb 13 13:12:52 UTC 2016


On 02/13/2016 01:01 PM, Lennart Poettering wrote:
> On Sat, 13.02.16 00:10, Christian Seiler (christian at iwakd.de) wrote:
> 
>> On 02/12/2016 10:34 PM, Lennart Poettering wrote:
>>> On Fri, 12.02.16 17:49, Simon McVittie (simon.mcvittie at collabora.co.uk) wrote:
>>>
>>>> On 11/02/16 17:06, Lennart Poettering wrote:
>>>>> 5) Here's the controversial one I think: support for booting up
>>>>>    without /var. We have kludges at quite a few places because we
>>>>>    cannot access /var early during boot.
>>>>
>>>> I don't think /var is really the same thing as /usr: for a start, it has
>>>> to be read/write, whereas /usr and / can be read-only for at least the
>>>> early stages of boot.
>>>>
>>>> On stateless systems with a read-only / and /etc, requiring /var to be
>>>> mounted from the initramfs would mean that the mechanism for setting up
>>>> /var (NFS or tmpfs or whatever) would have to move into the
>>>> initramfs.
>>>
>>> Since initrds tend to cover root-on-nfs, root-on-iscsi and so on
>>> anyway, that sounds like no change in behaviour really..
>>
>> Well, kind-of. The root-on-nfs and root-on-iscsi are dumbed-down
>> versions of what's possible once a system is booted.
> 
> Well, to my understanding dracut and stuff makes pretty much all
> storage tech that is available during the normal system also available
> in the initrd, with the same software, to make sure testing stays
> manageable.

Ok, so I just checked.

So the very latest version of dracut (released 3 months ago) supports
starting iscsid in the initramfs, but only if systemd is installed
in the initramfs image; older versions do not support it at all. If
there's no systemd in the initramfs or an older version is used,
dracut uses the iscsistart binary as I explained. So support for
this is very, very new.

NFS (latest dracut git master): idmapping for NFSv4 is supported to
some extent, but there are a lot of assumptions going in: it assumes
that nsswitch is used for idmapping (which is probably the most
common case on clients, but it need not be the only possibility), it
tries to install all nsswitch modules found in /etc/nsswitch.conf,
but doesn't know anything about their configuration.
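
To illustrate what can and cannot be read off /etc/nsswitch.conf, here
is a minimal Python sketch. It is not dracut's actual code (dracut is
written in shell); the function names and the sample file are made up
for illustration. It collects the service names listed in the file and
maps them to glibc module file names of the form libnss_<service>.so.2.
It learns nothing about how each module is configured, which is
exactly the gap described above.

```python
import re


def nss_services(nsswitch_text):
    """Return the sorted service names referenced in nsswitch.conf text."""
    services = set()
    for line in nsswitch_text.splitlines():
        # Strip comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line or ":" not in line:
            continue
        _database, sources = line.split(":", 1)
        # Drop action items such as [NOTFOUND=return].
        sources = re.sub(r"\[[^\]]*\]", " ", sources)
        services.update(sources.split())
    return sorted(services)


def library_names(services):
    """Map service names to glibc NSS module file names."""
    return [f"libnss_{name}.so.2" for name in services]


# Hypothetical sample configuration, for illustration only.
SAMPLE = """\
# sample nsswitch.conf
passwd: files ldap
group:  files ldap
hosts:  files dns [NOTFOUND=return] mdns4
"""

if __name__ == "__main__":
    for lib in library_names(nss_services(SAMPLE)):
        print(lib)
```

For the sample above this finds the services dns, files, ldap and
mdns4. Where the LDAP server is, or which credentials to use, is not
something this kind of scan can discover.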

What's not supported at all: NFS with Kerberos support. I should
note that NFS w/ Kerberos doesn't work out of the box for /var on a
running system either, because you still need to fetch a Kerberos
ticket for all the system users that access /var. But it's relatively
easy to set something like that up: you just need to write a simple
systemd service for that and run it at boot, which most admins will
be able to do. Modifying the initramfs is a lot more complicated
(especially because the documentation is sparse, and the criticism
of many people that initramfs are black magic is there for a
reason).
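
For illustration, a service that fetches a ticket at boot could look
roughly like the template unit below. This is only a sketch under
assumptions: the unit name krb5-ticket@.service, the keytab directory
/etc/krb5-keytabs, the kinit path and the use of the user name as the
principal are all hypothetical, and the keytabs have to live somewhere
that is available before /var is. It also does not renew tickets,
which a real setup would need to handle.

```ini
# /etc/systemd/system/krb5-ticket@.service (hypothetical name)
[Unit]
Description=Fetch Kerberos ticket for system user %i
Wants=network-online.target
After=network-online.target

[Service]
Type=oneshot
RemainAfterExit=yes
User=%i
# -k: authenticate from a keytab, -t: keytab file to use.
# Keytab path and principal name are assumptions for this sketch.
ExecStart=/usr/bin/kinit -k -t /etc/krb5-keytabs/%i.keytab %i

[Install]
WantedBy=multi-user.target
```

One instance would be enabled per system user, and the daemons that
need /var would then be ordered after their instance.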

So basically: it's probably going to work in many cases as long as
you don't want a Kerberized setup, but not everything will work out
of the box, even if you don't use Kerberos.

NBD: dracut (current git master) currently doesn't even support
/usr on nbd if / isn't on it (and even then only if on the same
device).

And then there are still other setups that I had mentioned.
DRBD with cluster filesystem: not supported in initramfs. sshfs:
doesn't work in initramfs. FUSE filesystems in general: I only
know that some people have experimented with zfs-fuse for the
rootfs, and it was very fiddly at best from what I remember.
And while most people who want to use ZFS on Linux nowadays use
the kernel module (where binaries can't be distributed for legal
reasons), other FUSE filesystems still have the same issue.

The problem is that storage is complicated. You need to special-case
each different storage type and add specific code to make that
work in the initramfs. If you look at the dracut source code,
every different storage solution is special-cased. You need to
make sure that the programs required for that storage work in an
initramfs environment and can either be terminated before
switch_root and then restarted in the running system, or
alternatively support being kept around during switch_root. You
need to have glue code that copies the right files into the
initramfs.

For rootfs there are certain limitations that have been widely
accepted, but if you now start to say that lots of common mount
points, especially some parts of /var, are required to be
mounted in the initramfs, you'll make life a lot more complicated
for a lot of people.

A _lot_ of different projects will have to make sure that their
software now works in an initramfs context, even if it was never
intended to be used for the root filesystem. That's a lot of
work for other people, especially since coordination between
different projects is required, just to save yourself from a bit
of complexity in systemd.

> I think you are extrapolating from limitations of one specific initrd
> implementation, no?

No. dracut has had some improvements in this regard very recently,
but the main point that more complicated storage setups are not
automatically supported in initramfs still stands.

> My main goal here is really about having read-access to things, and
> being able to schedule stuff based on configuration and existence of
> things, even if that stuff cannot be written to yet. If /var is
> mounted this late it makes things a lot more complex.

So could you perhaps provide a list of things you need *exactly*
and what for? Because maybe there's a better solution for that
than requiring /var to be present at switch_root time.

Regards,
Christian
