[systemd-devel] timesyncd: Frequent polling when no server could be reached

Fri Aug 22 01:13:57 PDT 2014

On Thu, Aug 21, 2014 at 08:15:36PM +0200, Florian Lindner wrote:
> Aug 21 09:45:00 asaru systemd-timesyncd[317]: Timed out waiting for reply 
> from 216.239.32.15:123 (time1.google.com).
> Aug 21 09:45:00 asaru systemd-timesyncd[317]: Using NTP server 
> [2001:4860:4802:32::f]:123 (time1.google.com).
> Aug 21 09:45:00 asaru systemd-timesyncd[317]: Using NTP server 
> 216.239.34.15:123 (time2.google.com).
> Aug 21 09:45:10 asaru systemd-timesyncd[317]: Timed out waiting for reply 
> from 216.239.34.15:123 (time2.google.com).
> Aug 21 09:45:10 asaru systemd-timesyncd[317]: Using NTP server 
> [2001:4860:4802:34::f]:123 (time2.google.com).
> Aug 21 09:45:10 asaru systemd-timesyncd[317]: Using NTP server 
> 216.239.36.15:123 (time3.google.com).
> 
> This pollutes my log. Is it possible to inhibit that behavior? Maybe by
> trying a couple of times, then giving up until the next network status change.

Hm, a well-behaved client reduces its polling rate exponentially when
it doesn't receive a reply, to avoid overloading the servers and
congesting the network.

After running some tests, it seems there is an even bigger problem.
When timesyncd receives a reply saying that the server isn't
synchronized or that the client should reduce its polling rate (KOD
RATE), it selects the next server and sends a new request immediately.
When all servers are unsynchronized, this creates a burst of 10
packets several times per minute.

This really needs to be fixed. An easy solution could be to add an
exponentially increasing delay (with a maximum of 2048 seconds) once
all servers have been tried, before switching back to the first server
in the list.
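
To illustrate the idea, here is a minimal sketch of such a delay, not
a patch against timesyncd. The function name and the 32-second
starting value are hypothetical; only the 2048-second cap comes from
the suggestion above.

```c
/* Hypothetical constants: only the 2048 s cap is taken from the text. */
#define RETRY_DELAY_MIN_SEC 32u    /* assumed delay after the first failed pass */
#define RETRY_DELAY_MAX_SEC 2048u  /* upper bound on the delay */

/*
 * Delay to wait before starting another pass over the server list.
 * failed_passes counts how many complete passes over the list have
 * already ended without a usable reply.
 */
static unsigned retry_delay_sec(unsigned failed_passes) {
        unsigned delay = RETRY_DELAY_MIN_SEC;

        if (failed_passes == 0)
                return 0; /* first pass: no delay */

        /* Double the delay for each further failed pass, up to the cap. */
        while (--failed_passes > 0 && delay < RETRY_DELAY_MAX_SEC)
                delay *= 2;

        return delay > RETRY_DELAY_MAX_SEC ? RETRY_DELAY_MAX_SEC : delay;
}
```

The counter would be reset to zero as soon as any server returns a
valid reply, so a recovered network goes back to normal polling.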

Clients increasing their polling rate when not receiving a reply, or
when receiving a reply they don't like, is the biggest problem the
pool.ntp.org operators have to deal with.

BTW, I was getting segfaults with current git in sd_resolve_getaddrinfo()
in manager_connect() when doing the tests. Removing the
server_name_flush_addresses() call seems to fix it.

> Another question I have is about the NTP status output of timedatectl.
> 
> Right now (with ntpd running) it says:
> 
> NTP enabled: yes
> NTP synchronized: no
>  
> I suppose it needs some more uptime than the 11 minutes I have currently?

Possibly. ntpd needs to clear the STA_UNSYNC flag via adjtimex() to mark
the clock as synchronized.
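
For reference, the flag can be read with a read-only adjtimex() call.
This is a minimal Linux-specific sketch, not the code timedated uses;
the function name is hypothetical.

```c
#include <sys/timex.h>

/*
 * Returns 1 if the kernel considers the clock synchronized,
 * 0 if STA_UNSYNC is set, and -1 if adjtimex() fails.
 */
static int clock_is_synchronized(void) {
        struct timex tx = { .modes = 0 }; /* modes == 0: query only, change nothing */

        if (adjtimex(&tx) < 0)
                return -1;

        return (tx.status & STA_UNSYNC) ? 0 : 1;
}
```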

> When I had timesyncd active it said NTP no, though I had an NTP client
> running, albeit definitely unsynchronized.
> 
> Version is 215

With 215, you need to remove all files in /usr/lib/systemd/ntp-units.d/
except the one that lists the timesyncd service to select it for
timedated. In 216, the ntp-units.d directory is ignored and timedated
always controls timesyncd. I think it would be nice if this were
configurable, at least at compile time, and I sent a patch for that
yesterday.

-- 
Miroslav Lichvar