Summary: | timesyncd: clock is off | ||
---|---|---|---|
Product: | systemd | Reporter: | Piotr Lewandowski <piotr.lewandowski+freedesktop> |
Component: | general | Assignee: | systemd-bugs |
Status: | RESOLVED WORKSFORME | QA Contact: | systemd-bugs |
Severity: | normal | ||
Priority: | medium | CC: | ijanos, john.gosset, kevin, radek, registo.mailling |
Version: | unspecified | ||
Hardware: | ARM | ||
OS: | Linux (All) | ||
Whiteboard: | |||
i915 platform: | i915 features: | ||
Attachments: |
journalctl -b -u systemd-timesyncd
strace configure log - console configure log - config.log |
Description
Piotr Lewandowski
2014-06-27 07:24:32 UTC
What's the output of: $ journalctl -b -u systemd-timesyncd A few hours after reboot. Created attachment 101886 [details]
journalctl -b -u systemd-timesyncd
Output attached.
Something looks wrong on your system, the calls are all: interval/delta/delay/jitter/drift 2048s/+0.000s/0.000s/0.000s/+0ppm which should never happen. It needs to look like: interval/delta/delay/jitter/drift 1024s/+0.002s/0.046s/0.002s/+3ppm It really does not look like a systemd issue. Does the normal ntpd work? Maybe attach strace to the running process and paste the output here, if it reveals something? # date -u Fri Jun 27 15:08:42 UTC 2014 # systemctl stop systemd-timesyncd # date -u Fri Jun 27 15:08:50 UTC 2014 # systemctl start ntpd # date -u Fri Jun 27 17:04:07 UTC 2014 How old is the kernel? If you do: systemctl stop systemd-timesyncd.service strace /usr/lib/systemd/systemd-timesyncd it should have lines like: clock_adjtime(CLOCK_REALTIME, {modes=ADJ_OFFSET|ADJ_MAXERROR|ADJ_ESTERROR|ADJ_STATUS|ADJ_TIMECONST|0x2000, offset=4377840, freq=2641005, maxerror=0, esterror=0, status=STA_PLL|STA_NANO, constant=2, precision=1, tolerance=32768000, time={1403889070, 67063019}, tick=10000, ppsfreq=0, jitter=0, shift=0, stabil=0, jitcnt=0, calcnt=0, errcnt=0, stbcnt=0}) = 0 (TIME_OK) Created attachment 101888 [details]
strace
strace fragment attached
Kernel is 3.12.22-1-ARCH.
Hmm, looking at the 0 values here, maybe the floating point stuff does not work as expected on that platform: origin : 1403116316.737 receive : 3612109034.793 transmit : 3612109034.793 dest : 1403116316.737 offset : +0.000 sec delay : +0.000 sec I don't know much really about ARM or where to look or what to check for. I'm also seeing this, using systemd 215. I have exactly the same setup on my raspberry and my laptop (regarding time syncing) and it does nothing on the raspberry but works on the laptop. Might this be because the raspberry has no rtc hardware? I have noticed something that might provide some clue, if one uses ntpdate to sync time, then for some reason systemd-timesyncd seems to start doing something. Something like this: Jul 14 22:58:23 archrpi systemd-timesyncd[3553]: interval/delta/delay/jitter/drift 2048s/+0.000s/1.000s/0.000s/+0ppm Jul 14 23:32:31 archrpi systemd-timesyncd[3553]: interval/delta/delay/jitter/drift 2048s/+0.000s/0.000s/0.000s/+0ppm Jul 14 23:05:25 archrpi systemd-timesyncd[3553]: System time changed. Resyncing. Jul 14 23:05:26 archrpi systemd-timesyncd[3553]: interval/delta/delay/jitter/drift 32s/+0.000s/0.000s/0.000s/+0ppm Jul 14 23:05:58 archrpi systemd-timesyncd[3553]: interval/delta/delay/jitter/drift 64s/+0.000s/1.000s/0.000s/+0ppm Jul 14 23:07:02 archrpi systemd-timesyncd[3553]: interval/delta/delay/jitter/drift 128s/+0.000s/0.000s/0.000s/+0ppm Jul 14 23:09:10 archrpi systemd-timesyncd[3553]: interval/delta/delay/jitter/drift 256s/+0.000s/0.000s/0.000s/+0ppm Jul 14 23:09:43 archrpi systemd[1]: Stopping Network Time Synchronization... Jul 14 23:09:43 archrpi systemd[1]: Starting Network Time Synchronization... Jul 14 23:09:43 archrpi systemd[1]: Started Network Time Synchronization. Jul 14 23:09:43 archrpi systemd-timesyncd[4871]: Using NTP server 216.239.32.15:123 (time1.google.com). Jul 14 23:09:43 archrpi systemd-timesyncd[4871]: interval/delta/delay/jitter/drift 64s/+0.000s/0.000s/0.000s/+0ppm Jul 14 23:10:48 archrpi systemd-timesyncd[4871]: interval/delta/delay/jitter/drift 128s/+0.000s/1.000s/0.000s/+0ppm Jul 14 23:12:56 archrpi systemd-timesyncd[4871]: interval/delta/delay/jitter/drift 256s/+0.000s/0.000s/0.000s/-24ppm Jul 14 23:17:12 archrpi systemd-timesyncd[4871]: interval/delta/delay/jitter/drift 512s/+0.000s/0.000s/0.000s/-24ppm Jul 14 23:25:44 archrpi systemd-timesyncd[4871]: interval/delta/delay/jitter/drift 1024s/+0.000s/0.000s/0.000s/-24ppm I have used ntpdate at 23:05:25 to sync time. It seems I have restarted systemd-timesyncd but I don't remember doing so (lost track of all the things/restarts I've made trying to get this to work). All the 0s are still an indication that something with the floating point arithmetic is broken on that machine, or with the systemd compilation: interval/delta/delay/jitter/drift 64s/+0.000s/0.000s/0.000s/+0ppm Running ntpdate will just wake up timesysncd and adjust the kernel's drift (the ppm value), but not make any other difference. (In reply to comment #10) > All the 0s are still an indication that something with the floating > point arithmetic is broken on that machine, or with the systemd > compilation: > interval/delta/delay/jitter/drift 64s/+0.000s/0.000s/0.000s/+0ppm You were on to something, I have rebuilt the whole systemd package on the raspberry itself and reinstalled it, now it seems to be working (hope I'm not talking too soon). For good measure I've removed ntp, rebooted and checked if there were any ntp processes running that could be masking the problem. > journalctl -b --no-pager -u systemd-timesyncd -- Logs begin at Qui 1970-01-01 01:00:04 CET, end at Ter 2014-07-15 12:02:16 WEST. -- Jan 01 01:00:08 archrpi systemd[1]: Starting Network Time Synchronization... Jan 01 01:00:08 archrpi systemd-timesyncd[136]: System clock time unset or jumped backwards, restoring from recorded timestamp: Tue 2014-07-15 10:19:29 WEST Jul 15 10:19:29 archrpi systemd[1]: Started Network Time Synchronization. Jul 15 10:19:34 archrpi systemd-timesyncd[136]: Network configuration changed, trying to establish connection. Jul 15 10:19:34 archrpi systemd-timesyncd[136]: Network configuration changed, trying to establish connection. Jul 15 10:19:35 archrpi systemd-timesyncd[136]: Network configuration changed, trying to establish connection. Jul 15 10:20:06 archrpi systemd-timesyncd[136]: Using NTP server 216.239.36.15:123 (time3.google.com). Jul 15 10:20:22 archrpi systemd-timesyncd[136]: interval/delta/delay/jitter/drift 32s/+16.811s/0.059s/0.000s/+0ppm Jul 15 10:20:55 archrpi systemd-timesyncd[136]: interval/delta/delay/jitter/drift 64s/-0.001s/0.058s/0.000s/+0ppm Jul 15 10:21:59 archrpi systemd-timesyncd[136]: interval/delta/delay/jitter/drift 128s/-0.002s/0.059s/0.001s/-7ppm Jul 15 10:24:07 archrpi systemd-timesyncd[136]: interval/delta/delay/jitter/drift 256s/-0.003s/0.058s/0.002s/-12ppm Jul 15 10:28:23 archrpi systemd-timesyncd[136]: interval/delta/delay/jitter/drift 512s/-0.004s/0.059s/0.002s/-16ppm Jul 15 10:36:56 archrpi systemd-timesyncd[136]: interval/delta/delay/jitter/drift 1024s/+0.007s/0.085s/0.004s/-16ppm (ignored) Jul 15 10:54:00 archrpi systemd-timesyncd[136]: interval/delta/delay/jitter/drift 2048s/-0.000s/0.095s/0.004s/-16ppm Jul 15 11:28:08 archrpi systemd-timesyncd[136]: interval/delta/delay/jitter/drift 2048s/-0.042s/0.059s/0.016s/-16ppm (ignored) Jul 15 12:02:16 archrpi systemd-timesyncd[136]: interval/delta/delay/jitter/drift 2048s/-0.068s/0.057s/0.063s/-37ppm If you agree that timesyncd is now working as it should, we have to go complain somewhere else as this isn't a problem with systemd. Just another related question, is it intended that there is no man page for timesyncd.conf or might it be that it is just missing from arch's package? Thanks for your quick response on this and pointing me on the right direction. (In reply to comment #11) > Jul 15 10:20:06 archrpi systemd-timesyncd[136]: Using NTP server > 216.239.36.15:123 (time3.google.com). > Jul 15 10:20:22 archrpi systemd-timesyncd[136]: > interval/delta/delay/jitter/drift 32s/+16.811s/0.059s/0.000s/+0ppm Great, this looks fine, yes. It might be something in the build-sys, which compiled the original binary for the distribution, which breaks the floating point stuff. Maybe just a compile option or a cross-compile issue, or so. As the packager for this, I'd like to note that systemd is the only package being built in the entire distribution that has shown this kind of issue. For me, this points back to something off in the systemd code/build that is perhaps being done improperly. All packages are built natively on ARM systems, but compiling is exported via distcc to x86 machines with cross compilers version-matched to what is used on the ARM side. There is nothing in this process that has had issues in the years we have been doing this with every package for the distribution. If you have thoughts on anything to look for to track down what exactly is happening, I'd be happy to look into it. Can you post the output of ./configure on both systems? Our autoconf+make stuff doesn't really do anything strange with the compliation flags, mostly enables a bunch of warnings and pie and lto. Arch Linux ARM bug: https://github.com/archlinuxarm/PKGBUILDs/issues/900 (In reply to comment #15) > Arch Linux ARM bug: https://github.com/archlinuxarm/PKGBUILDs/issues/900 What an exemplary example of sweeping under the rug :( (In reply to comment #16) > What an exemplary example of sweeping under the rug :( Excuse you? Did I not post to this bug report as soon as I had confirmed what was causing it? (In reply to comment #17) > (In reply to comment #16) > > What an exemplary example of sweeping under the rug :( > > Excuse you? Did I not post to this bug report as soon as I had confirmed > what was causing it? Maybe I wasn't clear enough: I meant the closing of the upstream bug, not posting of the link here. And no, we still have no idea what was causing it. "It went away when distcc was disabled" does not count. I closed our bug because the package now works. Perhaps you can take a moment to read my first comment, number 13, where I bring the conversation here to determine with systemd developers what the underlying problem is. This happened right after closing our bug. If what I can expect is attacks here, the rug will be used. This is ridiculous. (In reply to comment #19) > I closed our bug because the package now works. > > Perhaps you can take a moment to read my first comment, number 13, where I > bring the conversation here to determine with systemd developers what the > underlying problem is. This happened right after closing our bug. So, even if there's a bug in the timesyncd code, it is only exposed by your build system. It is pretty obvious from the comments from Kay and others, that it works for him, and for others, and is unlikely to be fixed except by people who are using the build in question. I'd try comparing the disassembly of a failed build with one that works, or maybe run it under gdb and look at the values, or something. I tried to reproduce this on an arm chromebook without success. > If what I can expect is attacks here, the rug will be used. This is > ridiculous. Please don't treat this is as an attack. If your intent is to continue debugging this, then please consider my comment withdrawn. Created attachment 102952 [details]
configure log - console
Created attachment 102953 [details]
configure log - config.log
There were no differences in configure logs between distcc and native, aside from configure correctly adding the distcc path and choosing its gcc over the system's. Both the console output and config.log from the configure invocation are attached. This appears to be an optimization bug right now. Initially using the original build with -O2 (with distcc), and stepping through manager_receive_response in timesyncd.c (the apparent origin of the issue), I noticed many of the variables having been <optimized out> while inspecting through lines 681-687. Of particular interest the receive and trans variables. Rebuilding with -O0 (with distcc) allowed variable inspection and saw everything being set appropriately, and it continued calculating everything fine: interval/delta/delay/jitter/drift 64s/-0.012s/0.035s/0.000s/-45ppm interval/delta/delay/jitter/drift 128s/+0.007s/0.032s/0.003s/-17ppm Running this -O0 version outside of gdb also shows normal operation. The -O0 rebuild seems to have also resolved the systemd-networkd DHCP issue here: https://bugs.freedesktop.org/show_bug.cgi?id=77850 OK, closing. If this turns out not to be a compiler problem after all but a systemd problem, feel free to reopen. |
Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.