[systemd-devel] systemd-timesync and journalctl questions

Lennart Poettering lennart at poettering.net
Mon Sep 10 19:13:15 UTC 2018


On Fri, 24.08.18 14:52, David Weinehall (david.weinehall at linux.intel.com) wrote:

> We're having two time/date related issues/questions:
> 
> First of all we'd need some counterpart to ntpdate.
> 
> We have a system that lacks an RTC battery--the clock is reasonably reliable once the system
> has booted, but every time the device is restarted it loses system time. Due to the use of the
> machine we cannot allow the ntp server to run (since we need the clock to be monotonic).
> Clock skew is OK, jumps aren't.
> 
> For this purpose we'd want an equivalent to ntpdate to be able to sync the clock once on boot,
> so we can keep systemd-timesync disabled during runtime.

Such a mode is currently not supported by systemd-timesyncd. That
said, I figure we could add it. Please provide a PR... ;-)

I am not sure how big the benefit of this would be in the general
case. After all, it can take an arbitrary amount of time until
networking is available, and thus the difference between "continue
running after first sync" and "exit after first sync" might be minimal
in many cases.

> So far both manual reading and googling has failed to turn up any such mode of operation.
> Is there any? If not, would it be hard to implement?

No, this shouldn't be too hard. Just make timesyncd exit after the
first sync if some config option is set, and use
RestartPreventExitStatus= to disable automatic restart of the service
in that case, keyed on some specific exit code. The patch should
be 30 lines or so (including docs).
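
A rough sketch of that control flow, as a toy model in Python rather
than actual timesyncd code. The exit status 75 and the names
run_timesync() and service_manager_restarts() are made up for
illustration; only RestartPreventExitStatus= is a real unit setting.

```python
# Hypothetical exit status meaning "synced once, please do not restart me".
EXIT_SYNCED_ONCE = 75


def run_timesync(sync_results, exit_after_first_sync):
    """Toy main loop.

    sync_results yields True for every successful sync attempt and False
    for every failed one (e.g. network not up yet).
    """
    for ok in sync_results:
        if ok and exit_after_first_sync:
            # The assumed config option is set: stop after the first sync.
            return EXIT_SYNCED_ONCE
    # Attempts exhausted; in this toy that stands in for a normal shutdown.
    return 0


def service_manager_restarts(exit_status, restart_prevent_exit_status):
    """Model of Restart=always combined with RestartPreventExitStatus=."""
    return exit_status not in restart_prevent_exit_status


status = run_timesync([False, False, True, True], exit_after_first_sync=True)
print(status, service_manager_restarts(status, {EXIT_SYNCED_ONCE}))
```

With the option set, the loop stops at the first successful sync, and
the manager model leaves the service stopped because the exit status
is listed in the prevent set.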

> The second time-related issue pertains to journalctl.
> 
> It seems that journalctl logs (or at least displays) events in date/clock order, not in
> sequence order. While this is definitely useful when trying to correlate different logs
> against each other, it also means that events that happen after a date adjustment might
> end up before already existing entries, thus breaking the sequentialness of the log,
> as follows:
> 
> Date incorrectly set to 2023:
> 
> Log message 1
> Log message 2
> 
> Date corrected to be 2018:
> 
> Log message 3
> Log message 1
> Log message 2
> 
> Typically this is not how we want our log to behave. Is there any way to
> show the log in sequential order?

So, this is most likely caused by journalctl's journal file
interleaving logic, combined with the fact that journald will
automatically close and start a new journal file whenever a time jump
is detected.

Basically, one idea of journald is to ensure that within each journal
file, log entry ordering is strictly monotonic with respect to all
three entry ordering metrics: the CLOCK_MONOTONIC timestamp (trivial,
the kernel guarantees monotonicity within each boot, as long as we
start a journal file fresh each boot), the journald-maintained
sequence number (trivial, as journald increases it by one on each
entry, hence guaranteed monotonic), and the CLOCK_REALTIME timestamp
(for this we'll close the existing journal file on each clock change,
and open a new one).
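
To illustrate the per-file invariant, here is a minimal toy model in
Python. It is not how journald is implemented: ToyJournal and its
fields are invented for this sketch, and it only detects a clock that
steps backwards, whereas the text above describes rotating on any
clock change.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    seqnum: int
    monotonic_usec: int
    realtime_usec: int
    message: str


@dataclass
class ToyJournal:
    # Each inner list stands in for one journal file.
    files: list = field(default_factory=lambda: [[]])
    next_seqnum: int = 1

    def append(self, monotonic_usec, realtime_usec, message):
        current = self.files[-1]
        if current:
            last = current[-1]
            went_backwards = (
                realtime_usec < last.realtime_usec
                or monotonic_usec < last.monotonic_usec
            )
            if went_backwards:
                # "Close" the current file and start a fresh one, so that
                # every file stays ordered in all three metrics.
                current = []
                self.files.append(current)
        current.append(
            Entry(self.next_seqnum, monotonic_usec, realtime_usec, message)
        )
        self.next_seqnum += 1


journal = ToyJournal()
journal.append(10, 1_700_000_000_000_000, "Log message 1")  # clock says 2023
journal.append(20, 1_700_000_000_000_500, "Log message 2")
journal.append(30, 1_535_000_000_000_000, "Log message 3")  # corrected to 2018
print([[e.message for e in f] for f in journal.files])
```

The third entry lands in a second "file" because its CLOCK_REALTIME
value is lower than that of the entry before it, while the sequence
number keeps counting up across both files.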

During display journalctl will interleave individual journal files
again, trying to be "as deterministic as possible" and preferring the
more reliable ordering metrics over the less reliable ones,
i.e. sequence numbers are preferred, with a fallback to
CLOCK_MONOTONIC with a final fallback to CLOCK_REALTIME
timestamps. Now, this is not as great as it seems at first, as the
three sequences might be contradictory (i.e. because of a
CLOCK_REALTIME jump an entry that is later than another due to seqno
or CLOCK_MONOTONIC might appear earlier if you look at
CLOCK_REALTIME). Moreover, journalctl will refuse comparing entry
order by seqno and CLOCK_MONOTONIC unless the machine ID/boot ID in
the journal files match. This means stuff logged during early boot
will only be ordered by CLOCK_REALTIME usually, but if that is not
reliable you basically have very little to compare the entries for
ordering with.
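
The comparison order described above can be sketched like this. It is
a simplified model in Python following the description in this mail,
not the actual journalctl code; the Entry fields and the function
names are made up for illustration.

```python
from dataclasses import dataclass
from functools import cmp_to_key


@dataclass(frozen=True)
class Entry:
    machine_id: str
    boot_id: str
    seqnum: int
    monotonic_usec: int
    realtime_usec: int
    message: str


def _cmp(a, b):
    """Three-way comparison returning -1, 0 or 1."""
    return (a > b) - (a < b)


def compare_entries(x, y):
    # Sequence numbers and CLOCK_MONOTONIC are only trusted if both
    # entries carry the same machine ID and boot ID.
    if x.machine_id == y.machine_id and x.boot_id == y.boot_id:
        r = _cmp(x.seqnum, y.seqnum)
        if r:
            return r
        r = _cmp(x.monotonic_usec, y.monotonic_usec)
        if r:
            return r
    # Final fallback: wallclock time, which may have jumped.
    return _cmp(x.realtime_usec, y.realtime_usec)


def interleave(entries):
    return sorted(entries, key=cmp_to_key(compare_entries))


Y2023, Y2018 = 1_700_000_000_000_000, 1_535_000_000_000_000
same_boot = [
    Entry("m", "b1", 1, 10, Y2023, "Log message 1"),
    Entry("m", "b1", 2, 20, Y2023 + 500, "Log message 2"),
    Entry("m", "b1", 3, 30, Y2018, "Log message 3"),
]
# Same data, but the third entry cannot be matched by boot ID.
mixed_boot = same_boot[:2] + [Entry("m", "b2", 3, 30, Y2018, "Log message 3")]

print([e.message for e in interleave(same_boot)])
print([e.message for e in interleave(mixed_boot)])
```

With matching IDs the sequence numbers win and the output is 1, 2, 3.
Once the IDs differ, only CLOCK_REALTIME is left to compare, and the
entry stamped 2018 sorts first, which gives the 3, 1, 2 order from the
question. Note that such a mixed comparison is not guaranteed to be a
consistent total order in general, which is the contradiction
mentioned above.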

Or to say this differently: if you have no good clock it's tough to
order stuff, unless you have continuously writable storage where you
can count up. But we don't have that, as it could only live in /var,
which shows up very late.

The only way out of this is if we'd attach additional post-write
ordering information to each journal file during late boot, long after
we already stopped writing to the files themselves, as soon as we have
persistent counting in /var available. But we currently don't do that, and it
would still be unreliable, as you of course first have to get that far
to get correct ordering...

Classic syslog never had this problem, since it basically only
ran in late boot, i.e. at a point where /var had long been writable and
thus where it was easy to keep a clear order, as the full log data set
from previous boots was always already available, and you could hence
simply count up from there. Since the journal tries to cover early
boot correctly though, i.e. a time where clocks are not available yet,
we have these problems...

I hope the above makes some sense...

Lennart

-- 
Lennart Poettering, Red Hat

