[systemd-devel] Hosts without /etc/machine-id on boot

Didier Roche didrocks at ubuntu.com
Thu Nov 20 08:23:32 PST 2014


On 20/11/2014 13:45, Lennart Poettering wrote:
> On Wed, 19.11.14 09:45, Didier Roche (didrocks at ubuntu.com) wrote:
>
>> Hey,
>>
>> Another topic related to the "empty /etc" discussions: when preparing
>> generic distro images, we want to ensure that every new
>> instance gets a different /etc/machine-id file.
>> As part of the empty /etc at boot, we first thought that removing
>> /etc/machine-id would be sufficient. However, the instance then doesn't
>> generate a new machine-id file and complains heavily.
>>
>> The new debug message of systemd 216+ helped shed some light on it: http://cgit.freedesktop.org/systemd/systemd-stable/diff/src/core/machine-id-setup.c?h=v216-stable&id=896050eeb3acbf4106d71204a5173b4984cf1675,
>> and adding a debug statement in machine_id_setup() from
>> src/core/machine-id-setup.c just before "open(etc_machine_id,
>> O_RDWR|O_CREAT|O_CLOEXEC|O_NOCTTY, 0444)" shows what /proc/mounts
>> contains at that point:
>>
>> [    2.119041] systemd[1]: rootfs / rootfs rw
>> [    2.126775] systemd[1]: /dev/disk/by-uuid/ec8166e5-d5ed-45ec-b350-6cf5773904ac / ext4 ro,relatime,data=ordered
>>
>>
>> It's clear then that at this stage of the boot process / is readonly.
>> The error message (and code) will say that in this case, what is supported
>> is an empty /etc/machine-id. After reboot, the consequence is that
>> /etc/machine-id is mounted as a tmpfs:
> Yes, generation of the machine ID is done very very early at boot,
> before we fork off the first non-PID1 userspace process, and hence
> before any file system could be remounted writable.

That makes sense.
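
For anyone who wants to reproduce the check, here is a minimal sketch (my
own illustration, not code from systemd) of reading a /proc/mounts-style
table and deciding whether / is read-only. The function name
root_is_read_only is hypothetical. It assumes that when several entries
share the mount point /, the last one listed is the one in effect, as with
the rootfs and ext4 entries in the debug output above.

```python
def root_is_read_only(mounts_text):
    """Return True if the last mount listed for "/" carries the "ro" option."""
    options = None
    for line in mounts_text.splitlines():
        fields = line.split()
        # Fields: device, mount point, fs type, options (later fields ignored).
        if len(fields) >= 4 and fields[1] == "/":
            options = fields[3].split(",")  # a later entry shadows an earlier one
    if options is None:
        raise ValueError("no entry for / found")
    return "ro" in options


# The two entries from the debug output, with the kernel log prefix removed.
sample = (
    "rootfs / rootfs rw\n"
    "/dev/disk/by-uuid/ec8166e5-d5ed-45ec-b350-6cf5773904ac / ext4 "
    "ro,relatime,data=ordered\n"
)
print(root_is_read_only(sample))  # prints True
```

On a running system the same function could be fed the contents of
/proc/mounts, but by then / has usually been remounted read-write.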
>
>> tmpfs on /etc/machine-id type tmpfs (ro,relatime,size=204948k,mode=755)
>>
>> However, this means that each boot of this instance will result in a
>> different machine-id, which isn't what is desired in the empty /etc case
>> after a factory reset. I know that there is the
>> systemd-machine-id-setup utility, which we run in the systemd postinst in
>> Debian/Ubuntu, but that doesn't cover the factory reset case.
>>
>> Is there anything obvious that I'm missing to cover that case or anything in
>> the pipe?
> You have a couple of options:
>
> a) make /etc writable before systemd is invoked. If you use an initrd
>     this is without risk, given that the initrd should really invoke
>     fsck on the root disk anyway, and there's hence little reason to
>     transition to a read-only root, rather than just doing rw
>     right-away.

Interesting, I'll run that by our kernel team. However, we run fsck a
little later in the boot process so that we can pipe the output to
plymouth.
I'm not sure we should then have two code paths:
- one running fsck from the initrd if /etc/machine-id is empty (like after a
factory reset), showing the results and any failures to the user in
some way
- and then, the general use case: running fsck through
systemd-fsck-root.service before local-fs.target and piping the
result to plymouth

>
> b) pre-initialize the machine ID before you boot, at build time.
>
> c) live with random ids

Those two are actually nice features, but they don't apply to a machine
that we factory reset with an empty /etc.
>
> d) pass in the id to use via $container_uuid (if you use a container
>     manager), or via the DMI uuid field (if you use kvm). Then, create
>     /etc/machine-id as an empty file, and systemd will initialize it to
>     this ID rather than a random one.
>
> Usually, option d) is preferable in cloud setups I guess since it
> allows seeding the machine id from some externally used UUID, the way
> many container/virtualization managers define one anyway.

Fully agreed (that's not the case I described here, but it's nice to know
we can pass a pre-generated UUID to containers/VMs).
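
To make option d) concrete, here is a small sketch of how an externally
supplied UUID could map onto the machine-id format (32 lowercase
hexadecimal characters, no dashes). This is my own illustration and assumes
the 128 bits of the UUID are used unchanged. The helper name
machine_id_from_uuid and the sample UUID are made up for the example.

```python
import uuid


def machine_id_from_uuid(text):
    """Turn a textual UUID into the 32-character lowercase hex form."""
    value = uuid.UUID(text)  # raises ValueError on malformed input
    if value.int == 0:
        # An all-zero ID is treated as unusable here.
        raise ValueError("all-zero UUID cannot be used as a machine ID")
    return value.hex


# Hypothetical UUID, as a container manager might pass in $container_uuid.
print(machine_id_from_uuid("12345678-1234-5678-1234-567812345678"))
# prints 12345678123456781234567812345678
```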
>
> I'd be open to add another option on top of this:
>
> e) boot up with /etc read-only and /etc/machine-id empty, so that the
>     usual logic of c) generates a random machine id and overmounts
>     /etc/machine-id with it. But then, add a tiny new bootup service,
>     that runs shortly after local-fs.target (i.e. the point where /etc/
>     has been made writable if it's supposed to be made writable
>     according to fstab), and that syncs the random one used so far back
>     to disk, so that at the next boot-up it is fully initialized. This
>     tiny service should be properly conditioned so that it only runs if
>     /etc/machine-id is overmounted and /etc writable
>     (i.e. ConditionPathIsMountPoint=/etc/machine-id and
>     ConditionPathIsReadWrite=/etc). Special care should be taken so
>     that replacement of the mount by the normal file is
>     race-free. (This probably means the tool should open a new mount
>     namespace temporarily, unmount /etc/machine-id there, update the
>     file underneath and then return to the original mount namespace
>     and unmount the file there too, so that at no time the file is
>     invalid).
>
> The guarantee with /etc/machine-id is really that it is valid at *any*
> time, in early boot and late boot and all the time in between.
I think I will take that path. It is an interesting one and matches some
of my own thoughts. Thanks for the guidance and for documenting the
right approach to achieve this race-free! I'll work on something along
those lines and propose a patch.
This should bring us one step closer (even if it will require an empty
/etc/machine-id file for now) to the factory reset (with an empty /etc)
case.
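
As a first rough sketch of the two conditions from option e), written in
Python only to pin down the logic (the real thing would be a unit using
ConditionPathIsMountPoint= and ConditionPathIsReadWrite=): the function
name should_commit_machine_id is hypothetical, and the sample line is the
tmpfs overmount shown earlier, rewritten in /proc/mounts format. The
race-free replacement through a temporary mount namespace is deliberately
not covered here.

```python
def should_commit_machine_id(mounts_text, etc_writable):
    """Return True if the transient machine ID should be written to disk.

    mounts_text:  contents of a /proc/mounts-style table
    etc_writable: whether /etc can currently be written to
    """
    overmounted = False
    for line in mounts_text.splitlines():
        fields = line.split()
        # The second field of each entry is the mount point.
        if len(fields) >= 2 and fields[1] == "/etc/machine-id":
            overmounted = True
    # Both conditions must hold, mirroring the two Condition*= settings.
    return overmounted and etc_writable


sample = "tmpfs /etc/machine-id tmpfs ro,relatime,size=204948k,mode=755 0 0\n"
print(should_commit_machine_id(sample, etc_writable=True))   # prints True
print(should_commit_machine_id(sample, etc_writable=False))  # prints False
```

A caller could approximate etc_writable with os.access("/etc", os.W_OK),
although that is not necessarily the same test systemd applies.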

> Hope this makes sense?

It completely does, thanks again for your detailed answer!

Cheers,
Didier

