[systemd-devel] Incorrect logic when /etc/machine-id is missing

Lennart Poettering lennart at poettering.net
Thu Oct 2 08:00:51 PDT 2014


On Mon, 22.09.14 11:27, Jan Synacek (jsynacek at redhat.com) wrote:

> Hello,
> 
> I believe that the following code is not correct:
> 
> src/core/machine-id-setup.c:188-190:
> 
> mkdir_parents(etc_machine_id, 0755);
> fd = open(etc_machine_id, O_RDWR|O_CREAT|O_CLOEXEC|O_NOCTTY, 0444);
> if (fd >= 0)
>         writable = true;
> else {
>         fd = open(etc_machine_id, O_RDONLY|O_CLOEXEC|O_NOCTTY);
>         if (fd < 0) {
>                 log_error("Cannot open %s: %m", etc_machine_id);
>                 return -errno;
>         }
> 
>         writable = false;
> }
> 
> src/core/machine-id-setup.c:218-249:
> 
> r = generate(id, root);
> if (r < 0)
>         return r;
> 
> if (S_ISREG(st.st_mode) && writable) {
>         lseek(fd, 0, SEEK_SET);
> 
>         if (loop_write(fd, id, 33, false) == 33)
>                 return 0;
> }
> 
> fd = safe_close(fd);
> 
> /* Hmm, we couldn't write it? So let's write it to
>  * /run/machine-id as a replacement */
> 
> RUN_WITH_UMASK(0022) {
>         r = write_string_file(run_machine_id, id);
> }
> if (r < 0) {
>         log_error("Cannot write %s: %s", run_machine_id, strerror(-r));
>         unlink(run_machine_id);
>         return r;
> }
> 
> /* And now, let's mount it over */
> r = mount(run_machine_id, etc_machine_id, NULL, MS_BIND, NULL);
> if (r < 0) {
>         log_error("Failed to mount %s: %m", etc_machine_id);
>         unlink_noerrno(run_machine_id);
>         return -errno;
> }
> 
> If /etc/machine-id is missing on the system, the first open() call
> should probably handle that case. That's actually not true (at least on
> my system), because the underlying filesystem is read-only at that
> time. The second open() call fails as well, because there's no
> /etc/machine-id, resulting in a boot failure.

Yes, correct. We only support booting up with either:

a) /etc/machine-id existing and populated, possibly read-only
b) /etc/machine-id existing and empty, possibly read-only (in this
   case it is overmounted with a randomized file in /run)
c) /etc/machine-id missing, and /etc writable (in which case it is
   initialized on first boot.

We explicitly do not support booting up with the file missing and /etc
being read-only. The machine-id is something we wan to guarantee
existance of, apps should be able to read it during early boot, and be
able to rely on that it won't fail. 

> Now I was determined to fix this bug, however I'm left clueless as to
> how this is actually supposed to work. Is the entire logic in this piece
> of code wrong, or am I missing something? How is the
> (re)generating/mounting of machine-id supposed to work?

I am pretty sure we *should* fail to boot if /etc/machine-id cannot be
initialized because the root dir is read-only. However, we probably
should print a good messages in this case, explaining the rationale.

Lennart

-- 
Lennart Poettering, Red Hat


More information about the systemd-devel mailing list