Created attachment 105271 [details] backtrace

Commit 874ff7bf4d6fe693542209f127d23cd89adc499b ("timesyncd: beef up NTP server selection logic, and acquire NTP servers from DHCP") broke systemd-timesyncd when the network is down:

# ip addr show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: enp0s25: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast state DOWN group default qlen 1000
    link/ether 00:21:cc:6b:39:02 brd ff:ff:ff:ff:ff:ff
3: wlp3s0: <BROADCAST,MULTICAST> mtu 1500 qdisc mq state DOWN group default qlen 1000
    link/ether 08:11:96:0b:b3:28 brd ff:ff:ff:ff:ff:ff

# ./systemd-timesyncd
Using NTP server 193.204.114.105:123 (193.204.114.105).
ASAN:SIGSEGV
=================================================================
==8866==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000010 (pc 0x7fa31ef1ab3d sp 0x7fff51686980 bp 0x7fff51686b90 T0)
    #0 0x7fa31ef1ab3c in manager_connect src/timesync/timesyncd-manager.c:903
    #1 0x7fa31ef315ed in manager_send_request.lto_priv.92 src/timesync/timesyncd-manager.c:203
    #2 0x7fa31ef25e88 in manager_begin.lto_priv.90 src/timesync/timesyncd-manager.c:734
    #3 0x7fa31ef1c32f in manager_resolve_handler src/timesync/timesyncd-manager.c:818
    #4 0x7fa31ef1da6b in res_query_done src/libsystemd/sd-resolve/sd-resolve.c:1265
    #5 0x7fa31ef1da6b in complete_query src/libsystemd/sd-resolve/sd-resolve.c:768
    #6 0x7fa31ef1e75a in handle_response src/libsystemd/sd-resolve/sd-resolve.c:956
    #7 0x7fa31ef1e75a in sd_resolve_process src/libsystemd/sd-resolve/sd-resolve.c:990
    #8 0x7fa31ef20950 in io_callback.lto_priv.96 src/libsystemd/sd-resolve/sd-resolve.c:1380
    #9 0x7fa31ef154f9 in source_dispatch.lto_priv.60 src/libsystemd/sd-event/sd-event.c:2035
    #10 0x7fa31ef2f44f in sd_event_run.constprop.32 src/libsystemd/sd-event/sd-event.c:2333
    #11 0x7fa31ef0ef74 in sd_event_loop src/libsystemd/sd-event/sd-event.c:2352
    #12 0x7fa31ef0ef74 in main src/timesync/timesyncd.c:143
    #13 0x7fa31d132fff in __libc_start_main (/usr/lib/libc.so.6+0x1ffff)
    #14 0x7fa31ef10ceb (/home/dcoppa/Arch/hacking/systemd/src/systemd-216/systemd-timesyncd+0x13ceb)

AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV src/timesync/timesyncd-manager.c:903 manager_connect
==8866==ABORTING

I've also attached gdb's backtrace.
Created attachment 105272 [details] gdb backtrace (right file) Here's the right file. Sorry for the confusion! David
I've reproduced this on a local machine even when the network is up. It breaks because current_server_name is NULL. I haven't dug into the code deeply enough to figure out why it's NULL, but my DHCP server is supplying an NTP server IP address...
Looking a little bit further. It runs through manager_connect without any issues once. About ten seconds later it calls manager_connect again, and this time the null deref occurs. On the second call, server_name_flush_addresses gets called, which ends up here:

(gdb) bt
#0  manager_set_server_address (m=0x555555577480, a=a@entry=0x0) at src/timesync/timesyncd-manager.c:758
#1  0x000055555555b4f3 in server_address_free (a=0x555555578bb0) at src/timesync/timesyncd-server.c:62
#2  0x000055555555b810 in server_name_flush_addresses (n=0x5555555788c0) at src/timesync/timesyncd-server.c:150
#3  0x00005555555595fe in manager_connect (m=m@entry=0x555555577480) at src/timesync/timesyncd-manager.c:899
#4  0x000055555555ab0c in manager_timeout (source=<optimized out>, usec=<optimized out>, userdata=0x555555577480) at src/timesync/timesyncd-manager.c:158
#5  0x000055555556125d in source_dispatch (s=0x555555579290) at src/libsystemd/sd-event/sd-event.c:2043
#6  0x0000555555561ad8 in sd_event_run (e=e@entry=0x555555577790, timeout=<optimized out>, timeout@entry=18446744073709551615) at src/libsystemd/sd-event/sd-event.c:2333
#7  0x0000555555561bc1 in sd_event_loop (e=0x555555577790) at src/libsystemd/sd-event/sd-event.c:2352
#8  0x0000555555558639 in main (argc=<optimized out>, argv=<optimized out>) at src/timesync/timesyncd.c:143

The manager_set_server_address sets m->current_server_name to NULL.
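For anyone following along, here is a toy model of the call chain in the gdb backtrace. It is only a sketch and not the timesyncd source or the attached patch: the function names are borrowed from the backtrace, but the struct layouts, the names_next field, the one-address-per-name simplification, and the next_name_after_flush and run_demo helpers are all assumptions made for illustration. It shows how flushing the addresses of the current server name can clear m->current_server_name as a side effect, and one defensive ordering (read what you need from the current name before flushing) that avoids dereferencing the cleared pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, heavily simplified stand-ins for the timesyncd types. */
typedef struct Manager Manager;
typedef struct ServerName ServerName;
typedef struct ServerAddress ServerAddress;

struct ServerAddress {
    ServerName *name;            /* the name this address was resolved from */
};

struct ServerName {
    Manager *manager;
    ServerName *names_next;      /* next configured server, or NULL (assumed field) */
    ServerAddress *address;      /* toy model: at most one resolved address */
};

struct Manager {
    ServerName *current_server_name;
    ServerAddress *current_server_address;
};

/* Matches what the backtrace shows: a == NULL also clears current_server_name. */
static void manager_set_server_address(Manager *m, ServerAddress *a) {
    assert(m);
    m->current_server_address = a;
    m->current_server_name = a ? a->name : NULL;
}

/* Freeing the address that is currently in use resets the manager's pointers. */
static void server_address_free(ServerAddress *a) {
    if (!a)
        return;
    Manager *m = a->name->manager;
    if (m->current_server_address == a)
        manager_set_server_address(m, NULL);
    a->name->address = NULL;
    free(a);
}

static void server_name_flush_addresses(ServerName *n) {
    assert(n);
    server_address_free(n->address);
}

/* Defensive ordering (an assumption, not the real fix): take a local copy of
 * the current name and decide on the next one BEFORE flushing, because the
 * flush may set m->current_server_name to NULL behind our back. */
static ServerName *next_name_after_flush(Manager *m, ServerName *first) {
    assert(m);
    ServerName *cur = m->current_server_name;
    ServerName *next = (cur && cur->names_next) ? cur->names_next : first;
    if (cur)
        server_name_flush_addresses(cur);
    return next;
}

/* Builds two server names, makes the first one current, then moves on.
 * Returns 1 if the second name was selected and the side effect was observed. */
int run_demo(void) {
    Manager m = {0};
    ServerName b = { &m, NULL, NULL };
    ServerName a = { &m, &b, NULL };

    a.address = malloc(sizeof(ServerAddress));
    if (!a.address)
        return -1;
    a.address->name = &a;
    manager_set_server_address(&m, a.address);

    ServerName *next = next_name_after_flush(&m, &a);
    return next == &b && m.current_server_name == NULL && a.address == NULL;
}
```

Calling run_demo() from a small main() exercises the whole chain. Reading m->current_server_name->names_next after the flush instead of before it would be the same kind of NULL dereference as the one ASAN reports in manager_connect.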
Created attachment 105464 [details] [review] possible fix It appears this happens because the NTP server specified is failing to reply. But this shouldn't cause systemd-timesyncd to crash, of course. Possible fix attached. I've tested it with the simple configuration I have on my LAN (broken ntpd, and now fixed ntpd). Seems to work.
Patch looks correct. I added a comment to the code though, to make it more obvious what is going on. http://cgit.freedesktop.org/systemd/systemd/commit/?id=20f8d3cf1b