Bug 80582

Summary: timesyncd: clock is off
Product: systemd Reporter: Piotr Lewandowski <piotr.lewandowski+freedesktop>
Component: generalAssignee: systemd-bugs
Status: RESOLVED WORKSFORME QA Contact: systemd-bugs
Severity: normal    
Priority: medium CC: ijanos, john.gosset, kevin, radek, registo.mailling
Version: unspecified   
Hardware: ARM   
OS: Linux (All)   
Whiteboard:
i915 platform: i915 features:
Attachments: journalctl -b -u systemd-timesyncd
strace
configure log - console
configure log - config.log

Description Piotr Lewandowski 2014-06-27 07:24:32 UTC
I have enabled systemd-timesyncd on my system (Raspberry Pi, no RTC) and now the clock is off by >1 hour.

This happened both on systemd 213 and 214.
I am using Arch Linux.

The time skew is not constant:

# date -u; curl -s -I www.google.com | grep ^Date:
Fri Jun 27 05:26:05 UTC 2014
Date: Fri, 27 Jun 2014 07:21:07 GMT

And 9 days ago:

# date -u; curl -s -I www.google.com | grep ^Date:
Wed Jun 18 18:32:37 UTC 2014
Date: Wed, 18 Jun 2014 19:37:56 GMT

# timedatectl 
      Local time: Wed 2014-06-18 20:33:14 CEST
  Universal time: Wed 2014-06-18 18:33:14 UTC
        RTC time: n/a
       Time zone: Europe/Zurich (CEST, +0200)
     NTP enabled: yes
NTP synchronized: yes
 RTC in local TZ: no
      DST active: yes
 Last DST change: DST began at
                  Sun 2014-03-30 01:59:59 CET
                  Sun 2014-03-30 03:00:00 CEST
 Next DST change: DST ends (the clock jumps one hour backwards) at
                  Sun 2014-10-26 02:59:59 CEST
                  Sun 2014-10-26 02:00:00 CET

# journalctl -n24 _SYSTEMD_UNIT=systemd-timesyncd.service
-- Logs begin at Thu 1970-01-01 01:00:09 CET, end at Wed 2014-06-18 20:31:56 CEST. --
Jun 18 20:31:56 hostname systemd-timesyncd[601]: Sent NTP request to 216.239.32.15:123 (time1.google.com).
Jun 18 20:31:56 hostname systemd-timesyncd[601]: NTP response:
Jun 18 20:31:56 hostname systemd-timesyncd[601]:   leap         : 0
Jun 18 20:31:56 hostname systemd-timesyncd[601]:   version      : 4
Jun 18 20:31:56 hostname systemd-timesyncd[601]:   mode         : 4
Jun 18 20:31:56 hostname systemd-timesyncd[601]:   stratum      : 2
Jun 18 20:31:56 hostname systemd-timesyncd[601]:   precision    : 0.000001 sec (-20)
Jun 18 20:31:56 hostname systemd-timesyncd[601]:   reference    : n/a
Jun 18 20:31:56 hostname systemd-timesyncd[601]:   origin       : 1403116316.737
Jun 18 20:31:56 hostname systemd-timesyncd[601]:   receive      : 3612109034.793
Jun 18 20:31:56 hostname systemd-timesyncd[601]:   transmit     : 3612109034.793
Jun 18 20:31:56 hostname systemd-timesyncd[601]:   dest         : 1403116316.737
Jun 18 20:31:56 hostname systemd-timesyncd[601]:   offset       : +0.000 sec
Jun 18 20:31:56 hostname systemd-timesyncd[601]:   delay        : +0.000 sec
Jun 18 20:31:56 hostname systemd-timesyncd[601]:   packet count : 3
Jun 18 20:31:56 hostname systemd-timesyncd[601]:   jitter       : 0.000
Jun 18 20:31:56 hostname systemd-timesyncd[601]:   poll interval: 256
Jun 18 20:31:56 hostname systemd-timesyncd[601]:   adjust (slew): +0.000 sec
Jun 18 20:31:56 hostname systemd-timesyncd[601]:   status       : 8193 sync
Jun 18 20:31:56 hostname systemd-timesyncd[601]:   time now     : 1403116316.595
Jun 18 20:31:56 hostname systemd-timesyncd[601]:   constant     : 4
Jun 18 20:31:56 hostname systemd-timesyncd[601]:   offset       : +0.000 sec
Jun 18 20:31:56 hostname systemd-timesyncd[601]:   freq offset  : +0 (0 ppm)
Jun 18 20:31:56 hostname systemd-timesyncd[601]: interval/delta/delay/jitter/drift 256s/+0.000s/0.000s/0.000s/+0ppm

# date -d @1403116316
Wed Jun 19 20:31:56 CEST 2014

# stat /var/lib/systemd/clock 
  File: '/var/lib/systemd/clock'
  Size: 0               Blocks: 0          IO Block: 4096   regular empty file
Device: b305h/45829d    Inode: 23745       Links: 1
Access: (0644/-rw-r--r--)  Uid: (  192/systemd-timesync)   Gid: (  192/systemd-timesync)
Access: 2014-06-18 20:36:12.852513000 +0200
Modify: 2014-06-18 20:36:12.852513000 +0200
Change: 2014-06-18 20:36:12.852513000 +0200
 Birth: -
Comment 1 Kay Sievers 2014-06-27 13:29:39 UTC
What's the output of:
  $ journalctl -b -u systemd-timesyncd

A few hours after reboot.
Comment 2 Piotr Lewandowski 2014-06-27 16:54:45 UTC
Created attachment 101886 [details]
journalctl -b -u systemd-timesyncd

Output attached.
Comment 3 Kay Sievers 2014-06-27 17:00:56 UTC
Something looks wrong on your system, the calls are all:
  interval/delta/delay/jitter/drift 2048s/+0.000s/0.000s/0.000s/+0ppm
which should never happen. It needs to look like:
  interval/delta/delay/jitter/drift 1024s/+0.002s/0.046s/0.002s/+3ppm

It really does not look like a systemd issue. Does the normal ntpd work?
Comment 4 Kay Sievers 2014-06-27 17:03:03 UTC
Maybe attach strace to the running process and paste the output here, if it
reveals something?
Comment 5 Piotr Lewandowski 2014-06-27 17:04:51 UTC
# date -u 
Fri Jun 27 15:08:42 UTC 2014
# systemctl stop systemd-timesyncd
# date -u
Fri Jun 27 15:08:50 UTC 2014
# systemctl start ntpd
# date -u 
Fri Jun 27 17:04:07 UTC 2014
Comment 6 Kay Sievers 2014-06-27 17:13:09 UTC
How old is the kernel?

If you do:
  systemctl stop systemd-timesyncd.service
  strace /usr/lib/systemd/systemd-timesyncd

it should have lines like:
  clock_adjtime(CLOCK_REALTIME,
  {modes=ADJ_OFFSET|ADJ_MAXERROR|ADJ_ESTERROR|ADJ_STATUS|ADJ_TIMECONST|0x2000,
  offset=4377840, freq=2641005, maxerror=0, esterror=0, status=STA_PLL|STA_NANO,
  constant=2, precision=1, tolerance=32768000, time={1403889070, 67063019},
  tick=10000, ppsfreq=0, jitter=0, shift=0, stabil=0, jitcnt=0, calcnt=0,
  errcnt=0, stbcnt=0}) = 0 (TIME_OK)
Comment 7 Piotr Lewandowski 2014-06-27 17:22:18 UTC
Created attachment 101888 [details]
strace

strace fragment attached

Kernel is 3.12.22-1-ARCH.
Comment 8 Kay Sievers 2014-06-27 17:46:13 UTC
Hmm, looking at the 0 values here, maybe the floating point stuff does
not work as expected on that platform:

  origin       : 1403116316.737
  receive      : 3612109034.793
  transmit     : 3612109034.793
  dest         : 1403116316.737
  offset       : +0.000 sec
  delay        : +0.000 sec

I don't know much really about ARM or where to look or what to check for.
Comment 9 Mauro Santos 2014-07-14 22:39:53 UTC
I'm also seeing this, using systemd 215.

I have exactly the same setup on my raspberry and my laptop (regarding time syncing) and it does nothing on the raspberry but works on the laptop.

Might this be because the raspberry has no rtc hardware?

I have noticed something that might provide some clue, if one uses ntpdate to sync time, then for some reason systemd-timesyncd seems to start doing something. Something like this:

Jul 14 22:58:23 archrpi systemd-timesyncd[3553]: interval/delta/delay/jitter/drift 2048s/+0.000s/1.000s/0.000s/+0ppm
Jul 14 23:32:31 archrpi systemd-timesyncd[3553]: interval/delta/delay/jitter/drift 2048s/+0.000s/0.000s/0.000s/+0ppm
Jul 14 23:05:25 archrpi systemd-timesyncd[3553]: System time changed. Resyncing.
Jul 14 23:05:26 archrpi systemd-timesyncd[3553]: interval/delta/delay/jitter/drift 32s/+0.000s/0.000s/0.000s/+0ppm
Jul 14 23:05:58 archrpi systemd-timesyncd[3553]: interval/delta/delay/jitter/drift 64s/+0.000s/1.000s/0.000s/+0ppm
Jul 14 23:07:02 archrpi systemd-timesyncd[3553]: interval/delta/delay/jitter/drift 128s/+0.000s/0.000s/0.000s/+0ppm
Jul 14 23:09:10 archrpi systemd-timesyncd[3553]: interval/delta/delay/jitter/drift 256s/+0.000s/0.000s/0.000s/+0ppm
Jul 14 23:09:43 archrpi systemd[1]: Stopping Network Time Synchronization...
Jul 14 23:09:43 archrpi systemd[1]: Starting Network Time Synchronization...
Jul 14 23:09:43 archrpi systemd[1]: Started Network Time Synchronization.
Jul 14 23:09:43 archrpi systemd-timesyncd[4871]: Using NTP server 216.239.32.15:123 (time1.google.com).
Jul 14 23:09:43 archrpi systemd-timesyncd[4871]: interval/delta/delay/jitter/drift 64s/+0.000s/0.000s/0.000s/+0ppm
Jul 14 23:10:48 archrpi systemd-timesyncd[4871]: interval/delta/delay/jitter/drift 128s/+0.000s/1.000s/0.000s/+0ppm
Jul 14 23:12:56 archrpi systemd-timesyncd[4871]: interval/delta/delay/jitter/drift 256s/+0.000s/0.000s/0.000s/-24ppm
Jul 14 23:17:12 archrpi systemd-timesyncd[4871]: interval/delta/delay/jitter/drift 512s/+0.000s/0.000s/0.000s/-24ppm
Jul 14 23:25:44 archrpi systemd-timesyncd[4871]: interval/delta/delay/jitter/drift 1024s/+0.000s/0.000s/0.000s/-24ppm

I have used ntpdate at 23:05:25 to sync time. It seems I have restarted systemd-timesyncd but I don't remember doing so (lost track of all the things/restarts I've made trying to get this to work).
Comment 10 Kay Sievers 2014-07-14 22:58:24 UTC
All the 0s are still an indication that something with the floating
point arithmetic is broken on that machine, or with the systemd
compilation:
 interval/delta/delay/jitter/drift 64s/+0.000s/0.000s/0.000s/+0ppm

Running ntpdate will just wake up timesysncd and adjust the kernel's drift
(the ppm value), but not make any other difference.
Comment 11 Mauro Santos 2014-07-15 11:38:34 UTC
(In reply to comment #10)
> All the 0s are still an indication that something with the floating
> point arithmetic is broken on that machine, or with the systemd
> compilation:
>  interval/delta/delay/jitter/drift 64s/+0.000s/0.000s/0.000s/+0ppm

You were on to something, I have rebuilt the whole systemd package on the raspberry itself and reinstalled it, now it seems to be working (hope I'm not talking too soon). For good measure I've removed ntp, rebooted and checked if there were any ntp processes running that could be masking the problem.

> journalctl -b --no-pager -u systemd-timesyncd 
-- Logs begin at Qui 1970-01-01 01:00:04 CET, end at Ter 2014-07-15 12:02:16 WEST. --
Jan 01 01:00:08 archrpi systemd[1]: Starting Network Time Synchronization...
Jan 01 01:00:08 archrpi systemd-timesyncd[136]: System clock time unset or jumped backwards, restoring from recorded timestamp: Tue 2014-07-15 10:19:29 WEST
Jul 15 10:19:29 archrpi systemd[1]: Started Network Time Synchronization.
Jul 15 10:19:34 archrpi systemd-timesyncd[136]: Network configuration changed, trying to establish connection.
Jul 15 10:19:34 archrpi systemd-timesyncd[136]: Network configuration changed, trying to establish connection.
Jul 15 10:19:35 archrpi systemd-timesyncd[136]: Network configuration changed, trying to establish connection.
Jul 15 10:20:06 archrpi systemd-timesyncd[136]: Using NTP server 216.239.36.15:123 (time3.google.com).
Jul 15 10:20:22 archrpi systemd-timesyncd[136]: interval/delta/delay/jitter/drift 32s/+16.811s/0.059s/0.000s/+0ppm
Jul 15 10:20:55 archrpi systemd-timesyncd[136]: interval/delta/delay/jitter/drift 64s/-0.001s/0.058s/0.000s/+0ppm
Jul 15 10:21:59 archrpi systemd-timesyncd[136]: interval/delta/delay/jitter/drift 128s/-0.002s/0.059s/0.001s/-7ppm
Jul 15 10:24:07 archrpi systemd-timesyncd[136]: interval/delta/delay/jitter/drift 256s/-0.003s/0.058s/0.002s/-12ppm
Jul 15 10:28:23 archrpi systemd-timesyncd[136]: interval/delta/delay/jitter/drift 512s/-0.004s/0.059s/0.002s/-16ppm
Jul 15 10:36:56 archrpi systemd-timesyncd[136]: interval/delta/delay/jitter/drift 1024s/+0.007s/0.085s/0.004s/-16ppm (ignored)
Jul 15 10:54:00 archrpi systemd-timesyncd[136]: interval/delta/delay/jitter/drift 2048s/-0.000s/0.095s/0.004s/-16ppm
Jul 15 11:28:08 archrpi systemd-timesyncd[136]: interval/delta/delay/jitter/drift 2048s/-0.042s/0.059s/0.016s/-16ppm (ignored)
Jul 15 12:02:16 archrpi systemd-timesyncd[136]: interval/delta/delay/jitter/drift 2048s/-0.068s/0.057s/0.063s/-37ppm

If you agree that timesyncd is now working as it should, we have to go complain somewhere else as this isn't a problem with systemd.

Just another related question, is it intended that there is no man page for timesyncd.conf or might it be that it is just missing from arch's package?

Thanks for your quick response on this and pointing me on the right direction.
Comment 12 Kay Sievers 2014-07-15 17:13:29 UTC
(In reply to comment #11)
> Jul 15 10:20:06 archrpi systemd-timesyncd[136]: Using NTP server
> 216.239.36.15:123 (time3.google.com).
> Jul 15 10:20:22 archrpi systemd-timesyncd[136]:
> interval/delta/delay/jitter/drift 32s/+16.811s/0.059s/0.000s/+0ppm

Great, this looks fine, yes.

It might be something in the build-sys, which compiled the original binary for
the distribution, which breaks the floating point stuff. Maybe just a compile
option or a cross-compile issue, or so.
Comment 13 Kevin 2014-07-16 12:56:09 UTC
As the packager for this, I'd like to note that systemd is the only package being built in the entire distribution that has shown this kind of issue. For me, this points back to something off in the systemd code/build that is perhaps being done improperly.

All packages are built natively on ARM systems, but compiling is exported via distcc to x86 machines with cross compilers version-matched to what is used on the ARM side. There is nothing in this process that has had issues in the years we have been doing this with every package for the distribution. If you have thoughts on anything to look for to track down what exactly is happening, I'd be happy to look into it.
Comment 14 Zbigniew Jedrzejewski-Szmek 2014-07-16 13:03:06 UTC
Can you post the output of ./configure on both systems? Our autoconf+make stuff doesn't really do anything strange with the compliation flags, mostly enables a bunch of warnings and pie and lto.
Comment 15 Piotr Lewandowski 2014-07-16 23:24:22 UTC
Arch Linux ARM bug: https://github.com/archlinuxarm/PKGBUILDs/issues/900
Comment 16 Zbigniew Jedrzejewski-Szmek 2014-07-16 23:29:28 UTC
(In reply to comment #15)
> Arch Linux ARM bug: https://github.com/archlinuxarm/PKGBUILDs/issues/900

What an exemplary example of sweeping under the rug :(
Comment 17 Kevin 2014-07-16 23:38:07 UTC
(In reply to comment #16)
> What an exemplary example of sweeping under the rug :(

Excuse you? Did I not post to this bug report as soon as I had confirmed what was causing it?
Comment 18 Zbigniew Jedrzejewski-Szmek 2014-07-16 23:49:54 UTC
(In reply to comment #17)
> (In reply to comment #16)
> > What an exemplary example of sweeping under the rug :(
> 
> Excuse you? Did I not post to this bug report as soon as I had confirmed
> what was causing it?

Maybe I wasn't clear enough: I meant the closing of the upstream bug, not posting of the link here.

And no, we still have no idea what was causing it. "It went away when distcc was disabled" does not count.
Comment 19 Kevin 2014-07-16 23:59:31 UTC
I closed our bug because the package now works.

Perhaps you can take a moment to read my first comment, number 13, where I bring the conversation here to determine with systemd developers what the underlying problem is. This happened right after closing our bug. 

If what I can expect is attacks here, the rug will be used. This is ridiculous.
Comment 20 Zbigniew Jedrzejewski-Szmek 2014-07-17 00:26:25 UTC
(In reply to comment #19)
> I closed our bug because the package now works.
> 
> Perhaps you can take a moment to read my first comment, number 13, where I
> bring the conversation here to determine with systemd developers what the
> underlying problem is. This happened right after closing our bug. 
So, even if there's a bug in the timesyncd code, it is only exposed by your build system. It is pretty obvious from the comments from Kay and others, that it works for him, and for others, and is unlikely to be fixed except by people who are using the build in question. I'd try comparing the disassembly of a failed build with one that works, or maybe run it under gdb and look at the values, or something. I tried to reproduce this on an arm chromebook without success.

> If what I can expect is attacks here, the rug will be used. This is
> ridiculous.
Please don't treat this is as an attack. If your intent is to continue debugging this, then please consider my comment withdrawn.
Comment 21 Kevin 2014-07-17 01:21:00 UTC
Created attachment 102952 [details]
configure log - console
Comment 22 Kevin 2014-07-17 01:21:40 UTC
Created attachment 102953 [details]
configure log - config.log
Comment 23 Kevin 2014-07-17 01:23:24 UTC
There were no differences in configure logs between distcc and native, aside from configure correctly adding the distcc path and choosing its gcc over the system's. Both the console output and config.log from the configure invocation are attached.
Comment 24 Kevin 2014-07-17 02:32:45 UTC
This appears to be an optimization bug right now.  Initially using the original build with -O2 (with distcc), and stepping through manager_receive_response in timesyncd.c (the apparent origin of the issue), I noticed many of the variables having been <optimized out> while inspecting through lines 681-687.  Of particular interest the receive and trans variables.

Rebuilding with -O0 (with distcc) allowed variable inspection and saw everything being set appropriately, and it continued calculating everything fine:
interval/delta/delay/jitter/drift 64s/-0.012s/0.035s/0.000s/-45ppm
interval/delta/delay/jitter/drift 128s/+0.007s/0.032s/0.003s/-17ppm

Running this -O0 version outside of gdb also shows normal operation.
Comment 25 John Gosset 2014-07-18 10:06:38 UTC
The -O0 rebuild seems to have also resolved the systemd-networkd DHCP issue here:

https://bugs.freedesktop.org/show_bug.cgi?id=77850
Comment 26 Lennart Poettering 2014-08-18 21:39:13 UTC
OK, closing. 

If this turns out not to be a compiler problem after all but a systemd problem, feel free to reopen.

Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.