[systemd-devel] Journal message timestamps

Mark Corbin mark at dibsco.co.uk
Tue Sep 8 11:06:26 UTC 2020



On 04/09/2020 22:02, Dave Howorth wrote:
> On Fri, 4 Sep 2020 17:59:02 +0200
> Lennart Poettering <lennart at poettering.net> wrote:
>> On Do, 27.08.20 11:33, Mark Corbin (mark at dibsco.co.uk) wrote:
>>
>>> Hello
>>>
>>> I am working on time synchronisation issues at boot for systems
>>> without an RTC (using balenaOS on a Raspberry Pi 3) and have some
>>> questions about how journald assigns timestamps to log messages.
>>>
>>> When I boot my system and look at the journal I see an initial
>>> date/time for kernel messages, e.g. '1 June 2020 10:00:00' followed
>>> by messages with the 'correct' date/time once the system time has
>>> been set from another source, e.g. build time, NTP, etc. This means
>>> that over several reboots I have lots of sets of log messages from
>>> 1 June 2020 which understandably confuses the 'journalctl
>>> --list-boots' command. I found an issue that describes the problem
>>> here https://github.com/systemd/systemd/issues/662 and had assumed
>>> that there wasn't anything I could do about this.
>>>
>>> However, when running some tests on a board with Raspberry Pi OS I
>>> found that it didn't suffer from the same problem. RPI OS uses a
>>> 'fake-hwclock' service to restore the previous boot time from a
>>> file and this time gets applied to all messages in the journal
>>> prior to this point. I added the 'fake-hwclock' service to my
>>> system which resulted in the timestamps being correct the majority
>>> of the time, but I was still seeing some boots where the initial
>>> messages were showing '1 June 2020'. I eventually tracked the
>>> intermittent behaviour down to whether the 'fake-hwclock' time
>>> setting occurred in the same system log file as the initial kernel
>>> boot messages. On my RPI OS board the runtime journal was set to
>>> 8MB, so the date/time setting always occurred in the first journal
>>> file. On my system the runtime journal was set to 4MB, so the
>>> date/time setting was sometimes happening in the second journal
>>> file leaving the messages in the first file with a date of '1 June
>>> 2020'.
>>>
>>> So my questions are...
>>>
>>> It seems that journald is using the date/time from the
>>> 'fake-hwclock' service to update the timestamps of earlier log
>>> messages within the same file. Is this correct?
>>
>> No. We do not retroactively change written out records.
>>
>> However, when comparing log entries we prefer the record sequence
>> number if two records come from the same system. And if that doesn't
>> work, then we prefer monotonic clocks if two records cme from the same
>> boot. Wallclock is only used for comparing two records if they are
>> from different boots altogether.
>>
>>> What would be the best technique for ensuring that my journal logs
>>> always display the 'best' time for log messages (either
>>> 'fake-hwclock' or NTP)? Do I always have to ensure that the journal
>>> is large enough to capture my initial time setting event in the
>>> first file?
>>
>> There's no nice answer for that. We do not patch written out journal
>> records once they are written. They are considered immutable. This
>> means, from the journal's PoV if you generated a bunch of records in
>> the initrd or early boot (i.e. before /var/log/journal is available)
>> and you have no usable wallclock time, and you power cycle 10 times
>> in a row, then we have no indication about which of the 10 series of
>> recrods came in which order before the others.
>>
>> To fix that we'd have to keep a separate log of boot ids or so
>> somewhere, which we could use as auxiliary source of truth if all we
>> have are bootids+monotonic time which came first by comparing boot
>> ids. But that would still not be perfect since we could write that out
>> only late (i.e. after /var becomes writable), so the order before that
>> could not be reconstructed either if the system doesn't get that far.
>>
>> Also see:
>>
>> https://github.com/systemd/systemd/issues/662
>>
>>> Any general details about how journald applies timestamps would
>>> also be greatly appreciated.
>>
>> (btw, systemd-timesyncd does what fake-hwclock does automatically, and
>> also does SNTP. it should be fine for most usecases, no need to resort
>> to fake-hwclock)
>>
>> Lennart
> 
> This is a serious problem for anybody running a raspberry pi (i.e.
> many, many thousands of people.) I suggest buying one for yourself and
> actually experiencing the issues to understand what the problem is and
> why people would like to see a solution.
> 
> I'll send you the dollars if that's really a problem.

Hello Dave

If you are able to do some testing on the RPI perhaps you could try my 
suggestion of always starting fake-hwclock before journald.

Change /lib/systemd/system/fake-hwclock.service to add 
'systemd-journald.service' to the end of the 'Before' line.

Let me know if this works for you.

Regards

Mark


More information about the systemd-devel mailing list