Ideas for completing WIP "add unit to switch back to initrd at shutdown" etc

Tue Jun 12 10:50:01 UTC 2018

Hi, Ray and anyone else.

I see the WIP commits which included "add unit to switch back to initrd 
at shutdown" etc have been reverted for now.  I'm still interested in a 
resolution to the unclean unmounts triggered by plymouth.  Or triggered 
by systemd not supporting plymouth's behaviour... whichever one you want 
to blame.[1]

So I started writing an email, and now I think I understand what it 
would take to finish the WIP, and cover all cases.  This is the 
alternative to what I think was Lennart's suggestion, of plymouth 
implementing the equivalent of `systemctl daemon-reexec` (serializing 
and de-serializing any needed process state).  Of course I'd be 
interested to hear why the WIPs were reverted.  Did the reexec approach 
come to seem more promising? Or was the WIP just taking too long to finish?

* I was unhappy with the WIP originally, because I didn't want my system 
to be left with a frozen graphic (kept up by plymouth-drm-escrow).  I 
would want the option to see the text console messages, like I can with 
the escape key in current plymouth.  But I withdraw this objection: I 
think the design still allows this feature.  It seems straightforward, 
particularly if drm-escrow doesn't bother to implement switching *back* 
to graphics mode :).

* We can have plymouth-switch-root-shutdown.service ordered 
`Before=final.target` and `After=shutdown.target umount.target` in the 
shutdown sequence. That would allow plymouth to animate during the 
shutdown of almost all services.

* When starting plymouthd outside the initramfs, we can allow plymouthd 
to be killed by `systemd-shutdown`.  By not marking it exempt, we would 
make the system more robust.  (E.g. systemd-shutdown may be triggered 
after the default 30 minute timeout on reboot.target, but while 
plymouthd is still running).

   It's a bit hairy to auto-detect whether we're inside the initramfs, 
in case we can't rely on the systemd suggestion of checking for 
/etc/initrd-release.  I think the alternative is to define that 
plymouth-switch-root.service is the event that causes plymouthd to 
exempt itself from being killed by systemd.
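
   A minimal sketch of what "exempt itself" could look like, assuming 
the convention from the systemd RootStorageDaemons document (a process 
whose argv[0] starts with '@' is skipped by the killing spree).  The 
function names are made up for illustration; they are not existing 
plymouth code.

```c
#include <stddef.h>

/* Hypothetical helpers, not existing plymouth functions. */

/* First byte of argv[0] before it was overwritten; '\0' if untouched. */
static char saved_first_byte;

/* Ask to be skipped by the systemd killing spree.  Assumption: this is
 * called when plymouth-switch-root.service signals plymouthd, and argv
 * is the original vector passed to main(), so /proc/PID/cmdline changes. */
void exempt_from_killing_spree(char **argv)
{
    if (argv == NULL || argv[0] == NULL)
        return;
    if (argv[0][0] == '@' || argv[0][0] == '\0')
        return;
    saved_first_byte = argv[0][0];
    argv[0][0] = '@';
}

/* Undo the exemption, so systemd-shutdown may kill the process again. */
void allow_killing_spree(char **argv)
{
    if (argv == NULL || argv[0] == NULL)
        return;
    if (argv[0][0] != '@' || saved_first_byte == '\0')
        return;
    argv[0][0] = saved_first_byte;
}
```

   The point of tying this to an explicit event is that plymouthd never 
has to guess whether it is running from the initramfs.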

* On systems which *don't* switch to a shutdown initrd, plymouth still 
wants to show the splash until the very end.  This sounds like we have 
two different cases to handle, but we can simply unify them.  If we get 
to the point in the shutdown sequence where plymouth wants to switch to 
the initrd, but we don't see any `/run/initramfs/shutdown`, then exec 
the drm-escrow program from the rootfs instead of from the initrd.

   Note, it is not acceptable to handle the one case by exempting 
plymouthd from being killed by systemd.  1) Current plymouthd keeps 
/var/log/boot.log open for writing.  2) In case plymouthd is upgraded, 
the running process pins a deleted file, and remounting root read-only 
fails with EBUSY.  This is because the deleted file would need to be 
de-allocated (hence writing to the FS) when plymouthd finished.

   In either case, if plymouthd fails to launch drm-escrow, it must 
simply shut down.
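
   The unified fallback could be sketched roughly as below.  The 
install paths of the escrow binary and the function names are 
assumptions for illustration; only `/run/initramfs/shutdown` comes from 
the existing shutdown-initrd convention.

```c
#include <stdio.h>
#include <unistd.h>

/* Hypothetical install locations; the real paths depend on packaging. */
#define ESCROW_IN_INITRD "/run/initramfs/usr/libexec/plymouth/plymouth-drm-escrow"
#define ESCROW_IN_ROOTFS "/usr/libexec/plymouth/plymouth-drm-escrow"

/* Pick the escrow binary.  shutdown_helper would normally be
 * "/run/initramfs/shutdown"; it is a parameter so the choice is testable. */
const char *pick_escrow_binary(const char *shutdown_helper)
{
    if (access(shutdown_helper, X_OK) == 0)
        return ESCROW_IN_INITRD;   /* a shutdown initrd is present */
    return ESCROW_IN_ROOTFS;       /* no shutdown initrd: use the rootfs copy */
}

/* Replace the current process with the escrow program.  Returns -1 only
 * if exec fails, in which case the caller must simply shut down. */
int exec_escrow(const char *shutdown_helper)
{
    const char *path = pick_escrow_binary(shutdown_helper);

    execl(path, path, (char *)NULL);
    perror(path);
    return -1;
}
```

   Either way plymouthd itself goes away at this point, so it stops 
pinning its own binary and the log file.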

* plymouth-switch-root-shutdown.service will be synchronous.  I.e. it 
will include `plymouth --wait` or equivalent after it signals plymouthd 
to exec drm-escrow.

* We don't need to worry that this configuration doesn't support 
plymouth on X, because the systemd killing sprees would be killing X anyway.

and then

* These still leave plymouth breaching the letter and spirit of the 
systemd RootStorageDaemons feature.  plymouthd started from the 
initramfs asks not to be killed by systemd-shutdown, but it opens files 
on the rootfs which systemd-shutdown needs to unmount.  And in principle 
systemd-shutdown may happen at literally any time, e.g. by the user 
pressing ctrl+alt+del 7 times in quick succession.

   I think plymouthd can fix this fairly simply.  When plymouthd needs 
to read from or write to the filesystem, it can fork a process, 
equivalent to `popen("cat >\"$FILE\"", "w")`.  (Let's not use popen(), 
but it shows the idea).  The child process can arrange to be killable by 
systemd-shutdown, before it touches the FS.  To log specific errors when 
opening a file, the child process can write to /dev/kmsg.

* This still leaves a potential for plymouth to pin /var/log when 
systemd wants to cleanly unmount /var.  E.g. if you press ctrl+alt+del 
during startup and after the logfile open, systemd will try to 
de-activate var.mount.  plymouth-read-write.service needs to gain 
`RemainAfterExit=yes` and `ExecStop=-/usr/bin/plymouth update-root-fs 
--no-write`. plymouthd will respond to this new command by closing the 
open log file.  (It will close the pipe FD, and then wait for the child 
process to exit.  See previous point).

Regards
Alan

[1] "dracut fails to disassemble device-mapper devices" 
https://bugzilla.redhat.com/show_bug.cgi?id=1575376  This was 
simultaneously reassigned to plymouth, and gained a dracut workaround. 
So I think it's a bit messy to read, sorry.