[systemd-devel] more verbose debug info than systemd.log_level=debug?

Tue Apr 11 02:20:54 UTC 2017

On Mon, Apr 10, 2017 at 4:44 AM, Lennart Poettering
<lennart at poettering.net> wrote:
> On Mon, 10.04.17 19:07, Michael Chapman (mike at very.puzzling.org) wrote:
>
>> > So no, "freeze" is not an option. That sounds like a recipe to make
>> > shutdown hang. We need a sync() that actually does what is documented
>> > and sync the file system properly.
>>
>> sync() is never going to work the way you want it to work. Let's make
>> systemd work correctly for the systems we have today, not some hypothetical
>> system of the future.
>
> It works the way I want on vfat, ext2. The problem you are having is
> specific to XFS, no?

ext3 and ext4 are also dirty after doing updates; it just doesn't
cause a boot failure. Instead, fsck fixes things during startup.

Btrfs doesn't complain, but btrfs-debug-tree immediately after the
offline update reboot (without mounting), compared to btrfs-debug-tree
following a mount (but not booting, reading, or modifying anything)
shows considerable changes are made to the file system just due to the
mount. So something was left stale, and I'm guessing it was sync()
causing things to get stuffed into the log tree; which is then cleaned
up at next mount. It's not corruption, it's not even really dirty in
Btrfs semantics, but functionally I guess you'd say it was fixing
itself back up, per design.

And BTW, this is in the XFS list thread, but it's not merely the
grub.cfg that's missing in action. It's a large pile of files
including the kernel and initramfs. None of those new files exist yet
from the perspective of the bootloader.

>
>> The filesystem developers have good reasons for sync()'s current behaviour.
>> I can only point out again that the way they've designed it does *not* lose
>> or corrupt data: all synced data is available as soon as the filesystem
>> journals have been flushed. We have to explicitly flush the journals
>> ourselves, one way or another, to ensure that GRUB and other
>> not-fully-Linux-compatible filesystem implementations work correctly.
>
> The data *is* lost from the perspective of a boot loader. And given
> that /boot is pretty much exclusively about boot loading, that's kinda
> major.

Right. So let's play the blame game for a sec:

1. The kernel update package is most responsible for the change in
boot state. It's changing kernel, modules, initramfs, and the
bootloader configuration file. So it could be argued, this is the
thing that should do freeze/thaw to make certain the bootloader will
still be happy at next boot.

2. The bootloader has no fallback. The bootloader configuration is
modified in a non-atomic way. In a sense, we should have
bootloader.old and bootloader.new, preferring the new one but falling
back to the old (unmodified) one if the new one isn't found. At the
least, we'd get a normal boot with the old configuration and kernel,
the kernel code cleans up the file system, and then the next boot has
the updated kernel and bootloader config.

3. Blame the thing that prevents umount and remount-ro: in the example
case it's plymouth.

4. systemd, for not limiting the kill exemption to programs running
from the initramfs, i.e. it should ignore the kill exemption if the
program is running from anywhere other than the initramfs.

5. The OS installer. It might very well be we've passed the point
where it's safe for /boot to be a directory on rootfs. If almost
anything can someday pin the file system and prevent umount or
remount-ro, and thereby make kernel, initramfs, and bootloader config
file changes invisible to the bootloader - that's a good reason to
separate those files from a pinned file system.
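
To make item 1 a bit more concrete, here's a rough sketch of what a
freeze/thaw step could look like if the update tooling did it itself.
It's only an illustration: it assumes the FIFREEZE/FITHAW ioctls from
<linux/fs.h> with the generic ioctl number encoding (as on x86 and
ARM; some architectures encode the direction bits differently), it
needs root, and freeze_thaw() is a name I made up, not something any
package manager ships. The idea is that the freeze forces the journal
to be written back, and the thaw follows immediately so nothing stays
blocked.

```python
import fcntl
import os


def _iowr(type_char, nr, size):
    """Generic _IOWR() encoding from <asm-generic/ioctl.h> (assumed arch)."""
    return (3 << 30) | (size << 16) | (ord(type_char) << 8) | nr


# From <linux/fs.h>: _IOWR('X', 119, int) and _IOWR('X', 120, int).
FIFREEZE = _iowr("X", 119, 4)
FITHAW = _iowr("X", 120, 4)


def freeze_thaw(mountpoint):
    """Hypothetical helper: freeze the file system, then thaw it right away."""
    fd = os.open(mountpoint, os.O_RDONLY | os.O_DIRECTORY)
    try:
        fcntl.ioctl(fd, FIFREEZE, 0)
        try:
            pass  # nothing to do while frozen; the freeze itself is the point
        finally:
            fcntl.ioctl(fd, FITHAW, 0)
    finally:
        os.close(fd)
```

Something like freeze_thaw("/boot") would be the last step after the
kernel, initramfs, and bootloader config are written. Writers to that
file system block while it's frozen, which is exactly the hang risk
raised earlier in the thread.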

This bug is interesting because all of these are valid to blame. But
which is the most convincing? It's sort of difficult. And in the end,
it might be that the one least to blame is in the best position to
just clobber the problem, preventing it from happening for all use
cases.
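
For what it's worth, the fallback in item 2 is simple logic. Here's a
sketch in Python, purely to show the selection order; a real
bootloader would do this in its own environment, and pick_config()
and the bootloader.new/bootloader.old file names are just the
hypothetical ones from above, not anything GRUB implements.

```python
import os


def pick_config(boot_dir, new_name="bootloader.new", old_name="bootloader.old"):
    """Return the path of the first usable config, preferring the new one.

    A config counts as usable here if it exists as a regular file and is
    not empty; returns None if neither file qualifies.
    """
    for name in (new_name, old_name):
        path = os.path.join(boot_dir, name)
        if os.path.isfile(path) and os.path.getsize(path) > 0:
            return path
    return None
```

If the new file isn't visible yet, the old one gets used and the boot
proceeds with the old kernel, which is the "at the least" outcome
described in item 2.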

>
> Note that these weird XFS semantics are not only a problem on systemd
> btw: they are much worse on sysvinit and other simpler init systems,
> since they generally don't have the kill/umount/remount/detach loop we
> have, and don't support transitioning back into the initrd for
> complete detaching/umounting of the root fs either.
>
> Hence, any claims by the xfs folks that systemd doesn't disassemble
> things the right way is very wrong: systemd is certainly the one
> implementation that has a better chance to keep xfs sane than any
> other...

Yes, I think that assertion made on the XFS list by one developer is
unconvincing.

-- 
Chris Murphy