[systemd-devel] Automatic journal check?

Lennart Poettering lennart at poettering.net
Tue Dec 2 11:13:39 PST 2014


On Thu, 13.11.14 22:22, Nikolaus Rath (Nikolaus at rath.org) wrote:

> Hello,
> 
> My journal gets corrupted on pretty much a daily basis. I typically
> notice this because things like "systemctl -n 3" take ages to run. When
> I then run "journalctl --verify", I get output like this:

Corrupted journals do not really slow things down. Badly fragmented
journals (which are common on btrfs, due to btrfs' limitations) do.

Running "journalctl --verify" on a set of journal files that are online
is like running fsck on a file system you are writing to, and will
of course mean you will run into issues.

> Invalid data object at hash entry 3944 of 233016░░░░░░░░░░░░░░░░░░░  49%
> File corruption detected at /var/log/journal/b865c77cc176b5ef3b69390a0000000d/user-1000@0005065350521a47-17e420d2d51ab126.journal~:000000 (of 8388608 bytes, 0%).
> Data object references invalid entry at 5182040███░░░░░░░░░░░░░░░░░  75%
> File corruption detected at /var/log/journal/b865c77cc176b5ef3b69390a0000000d/system@00050713408c0d34-e40e6aa5c35eb139.journal~:000000 (of 8388608 bytes, 0%).
> FAIL: /var/log/journal/b865c77cc176b5ef3b69390a0000000d/system@00050713408c0d34-e40e6aa5c35eb139.journal~ (Bad message)
> Data number mismatch██████░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░  39%
> File corruption detected at /var/log/journal/b865c77cc176b5ef3b69390a0000000d/system@000507165d32850c-5b4cd09ceb6b2ea6.journal~:000000 (of 16777216 bytes, 0%).
> Invalid tail monotonic timestamp░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░  48%
> File corruption detected at /var/log/journal/b865c77cc176b5ef3b69390a0000000d/user-65534@763da377eefc4369ad61af34c4a5a1c6-00000000000263f0-000504a444037da7.journal:000000 (of 8388608 bytes, 0%).
> 
> 
> This corruption is probably caused by me hard-rebooting the computer
> recently to debug some other issues.

Yes, if you hard-reset your system, journal files might stay
half-written. However, on the next startup journald will notice that
and move the affected journal files away. This worked correctly in your
case, as you can see from the "~" suffix the files acquired.
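
To make the "~" convention concrete, here is a minimal Python sketch that
sorts a list of journal file names into regular ones and ones that were
moved away. It only looks at the suffix, nothing else. The helper name
split_journal_files and the sample name "system.journal" are assumptions
made for illustration; this is not code from journald itself.

```python
from pathlib import PurePath


def split_journal_files(names):
    """Split file names into (regular, set_aside) by their suffix.

    Hypothetical helper: a name ending in ".journal~" is treated as a file
    that was moved away, a name ending in ".journal" as a regular journal
    file. Anything else is ignored.
    """
    regular, set_aside = [], []
    for name in names:
        base = PurePath(name).name  # drop any leading directory part
        if base.endswith(".journal~"):
            set_aside.append(base)
        elif base.endswith(".journal"):
            regular.append(base)
    return regular, set_aside


if __name__ == "__main__":
    sample = [
        "system.journal",  # assumed example name, not from the report above
        "system@00050713408c0d34-e40e6aa5c35eb139.journal~",
        "user-1000@0005065350521a47-17e420d2d51ab126.journal~",
    ]
    regular, set_aside = split_journal_files(sample)
    print("regular:  ", regular)
    print("set aside:", set_aside)
```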

> However, I think it's quite unfortunate that journald isn't able to
> recover from this on its own.

It is. "journalctl --verify" is very strict. It will show you every
single issue with a file, including half-written entries. And journald
rotates a file if it detects that it has half-written entries. Also,
journalctl handles half-written entries gracefully when reading: it
simply shows as many entries as it can, leaving out only the
incompletely written ones.

> Is there a reason why the journal doesn't have a "clean" flag like
> regular file systems? This would allow an automatic --verify run when
> the journal has not been closed properly, and would save people like me
> the trouble of monitoring this manually.

There's really no need to ever invoke --verify except to verify
sealing, and even then you only want to invoke it offline.
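
If you do want to check sealing, a rough Python sketch of "only offline"
could look like the following. It builds "journalctl --verify --file=..."
command lines and skips files that may still be written to. The rule used
here (a ".journal" name without "@" is an active file) is an assumption
about the naming scheme, and the function names are made up for this
example. The sketch only builds the commands; it does not run them.

```python
import os


def is_active(name):
    # Assumption: files journald is still writing to end in ".journal"
    # and have no "@" in their name (e.g. "system.journal").
    return name.endswith(".journal") and "@" not in name


def offline_verify_commands(paths):
    """Return one journalctl command line per journal file that is not active."""
    commands = []
    for path in paths:
        name = os.path.basename(path)
        if not name.endswith((".journal", ".journal~")):
            continue  # not a journal file at all
        if is_active(name):
            continue  # skip files that may be written to concurrently
        commands.append(["journalctl", "--verify", "--file=" + path])
    return commands


if __name__ == "__main__":
    paths = [
        "/var/log/journal/example/system.journal",  # hypothetical path
        "/var/log/journal/example/system@00050713408c0d34-e40e6aa5c35eb139.journal~",
    ]
    for command in offline_verify_commands(paths):
        print(" ".join(command))
```

Each returned list can be handed to subprocess.run() as-is if you actually
want to execute the check.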

Lennart

-- 
Lennart Poettering, Red Hat

