[systemd-devel] more verbose debug info than systemd.log_level=debug?

Michael Chapman mike at very.puzzling.org
Sun Apr 9 00:11:56 UTC 2017


On Sun, 9 Apr 2017, Chris Murphy wrote:
> On Tue, Apr 4, 2017 at 11:55 AM, Andrei Borzenkov <arvidjaar at gmail.com> wrote:
>> 03.04.2017 07:56, Chris Murphy wrote:
>>> On Thu, Mar 30, 2017 at 6:07 AM, Michael Chapman <mike at very.puzzling.org> wrote:
>>>
>>>> I am not a filesystem developer (IANAFD?), but I'm pretty sure they're going
>>>> to say "the metadata _is_ synced, it's in the journal". And it's hard to
>>>> argue that. After all, the filesystem will be perfectly valid the next time
>>>> it is mounted, after the journal has been replayed, and it will contain all
>>>> data written prior to the sync call. It did exactly what the manpage says it
>>>> does.
>>>
>>> That's their position.
>>>
>>> Also, the same file system dirtiness and journal replay is needed on
>>> ext4. The sample size is too small to say categorically that the same
>>> problem can't happen on ext4 in the same situation. Maybe the grub.cfg
>>> is readable, but maybe the kernel isn't, or the initramfs, or
>>> something else.
>>>
>>
>> Yes, I have seen the same on ext4 which prompted me to play with journal
>> replay code. Unfortunately I do not know how to reliably trigger this
>> condition.
>
> I can reliably trigger a dirty ext4 or XFS file system 100% of the
> time with all recent Fedora installations when doing an offline
> update. What's very non-deterministic is how this dirtiness will
> manifest. Filesystem folks basically live in an alternate reality
> where the farther in time a file system is from mkfs time, the more
> non-deterministically the file system behaves. *shrug*

They don't expect their filesystems to be used except through their own 
filesystem code. It is perfectly deterministic behaviour when their 
filesystem code is used. Their logic seems _very_ reasonable to me.

Don't forget, they've provided an interface for software to use if it 
needs more than the guarantees provided by sync. Informally speaking, the 
FIFREEZE ioctl is intended to place a filesystem into a "fully consistent" 
state, not just a "fully recoverable" state. (Formally it's all a bit 
hazy: POSIX really doesn't guarantee anything with sync.)
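
For illustration, here is a minimal sketch of what using that interface 
looks like from C. FIFREEZE and FITHAW are the real ioctl requests from 
<linux/fs.h>; the function name freeze_and_thaw is made up for this 
example, and the immediate thaw is only there so that trying the sketch 
does not leave a filesystem frozen. It needs CAP_SYS_ADMIN and a 
filesystem that supports freezing, and it is not what systemd does today.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <linux/fs.h>   /* FIFREEZE, FITHAW */
#include <sys/ioctl.h>
#include <unistd.h>

/*
 * Hypothetical helper: freeze the filesystem mounted at `mountpoint`,
 * then thaw it again straight away.
 * Returns 0 on success or a negative errno value on failure.
 */
int freeze_and_thaw(const char *mountpoint)
{
    assert(mountpoint != NULL);

    /* Any file descriptor on the filesystem will do; use the mount point. */
    int fd = open(mountpoint, O_RDONLY);
    if (fd < 0)
        return -errno;

    int ret = 0;
    if (ioctl(fd, FIFREEZE, 0) < 0) {
        /* e.g. EPERM (not privileged), EOPNOTSUPP, EBUSY (already frozen) */
        ret = -errno;
    } else if (ioctl(fd, FITHAW, 0) < 0) {
        ret = -errno;
    }

    close(fd);
    return ret;
}
```

A caller would pass a mount point, for example freeze_and_thaw("/boot"), 
and treat any negative return value as "could not freeze".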

Currently systemd calls sync at shutdown. It doesn't need to do that; 
it could have just assumed all other software is written correctly. It 
calls sync as a courtesy to that other software.

I really do think systemd ought to freeze the filesystem at the same time, 
for _exactly the same reasons_. That will solve this Plymouth problem, but 
it will also solve the same problem for any other software that somebody 
might run (possibly accidentally, possibly not) during late shutdown.

This problem doesn't just affect GRUB; it could affect users of other 
operating systems too. I was speaking to somebody who runs OpenBSD. 
Apparently OpenBSD doesn't have an Ext3 driver, only an Ext2 one, so it is 
somewhat common practice to use an Ext3 filesystem on Linux but mount it 
as Ext2 on OpenBSD. That can only work correctly if the filesystem's 
journal is completely flushed. systemd is the only thing that can do this 
reliably, since it's the only thing running just before the reboot call.
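
As a rough way to see whether a journal still needs replaying, one can 
look at the "needs_recovery" incompat feature flag in the ext2/3/4 
superblock. The sketch below assumes the caller has already read the 
1024-byte superblock (which starts at byte offset 1024 on the device) 
into a buffer. The field offsets, magic number and flag value are the 
standard on-disk ones; the function and macro names are made up for this 
example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EXT_SB_SIZE          1024u   /* size of the on-disk superblock */
#define EXT_SB_MAGIC_OFF     0x38u   /* s_magic, 16-bit little-endian */
#define EXT_SB_INCOMPAT_OFF  0x60u   /* s_feature_incompat, 32-bit little-endian */
#define EXT_MAGIC            0xEF53u
#define EXT_INCOMPAT_RECOVER 0x0004u /* "needs_recovery": journal must be replayed */

/* Read little-endian integers byte by byte, independent of host endianness. */
static uint16_t le16(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Hypothetical helper: inspect a superblock buffer.
 * Returns 1 if the journal needs recovery, 0 if it does not,
 * and -1 if the buffer is too short or is not an ext2/3/4 superblock.
 */
int ext_needs_recovery(const unsigned char *sb, size_t len)
{
    assert(sb != NULL);

    if (len < EXT_SB_SIZE)
        return -1;
    if (le16(sb + EXT_SB_MAGIC_OFF) != EXT_MAGIC)
        return -1;

    return (le32(sb + EXT_SB_INCOMPAT_OFF) & EXT_INCOMPAT_RECOVER) ? 1 : 0;
}
```

A result of 1 means the filesystem was not cleanly flushed, which is the 
state an Ext2-only driver cannot be expected to handle.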

