Bug 90260 - networkd: DHCP lease file gets deleted after carrier is lost then gained again
Status: RESOLVED FIXED
Alias: None
Product: systemd
Classification: Unclassified
Component: general
Version: unspecified
Hardware: All All
Importance: medium normal
Assignee: Tom Gundersen
QA Contact: systemd-bugs
Reported: 2015-04-30 20:59 UTC by Jérémie Detrey
Modified: 2019-02-28 10:37 UTC

Description Jérémie Detrey 2015-04-30 20:59:54 UTC
Dear all,

I've encountered this bug on systemd 219, but it still seems to be present in the current Git version. On a machine with a wired network connection managed by systemd-networkd using DHCP, unplugging the network cable and plugging it back in makes the corresponding DHCP lease file in /run/systemd/netif/leases disappear.

The main steps to reproduce are the following:

1. Start systemd-networkd while the machine is connected to the network and wait for the DHCP lease. The networkd log output (with SYSTEMD_LOG_LEVEL=debug) for the corresponding interface (eth0 here) should read something like:

  eth0 : flags change: +MULTICAST +BROADCAST
  eth0 : link 2 added
  eth0 : udev initialized link
  eth0 : saved original MTU: 1500
  eth0 : link state is up-to-date
  eth0 : found matching network '/etc/systemd/network/eth0.network'
  eth0 : bringing link up
  eth0 : flags change: +UP
  eth0 : flags change: +LOWER_UP +RUNNING
  eth0 : gained carrier
  eth0 : acquiring DHCPv4 lease
  eth0 : Adding address: fe80::225:22ff:fe21:c546/64 (valid for ever)
  eth0 : DHCPv4 address 192.168.10.10/24 via 192.168.10.1
  eth0 : Setting transient hostname: 'stout'
  eth0 : Adding address: 192.168.10.10/24 (valid for 12h)
  eth0 : link configured

Check the link file (/run/systemd/netif/links/2):

  # This is private data. Do not parse.
  ADMIN_STATE=configured
  OPER_STATE=routable
  NETWORK_FILE=/etc/systemd/network/eth0.network
  DNS=192.168.10.1
  NTP=
  DOMAINS=
  WILDCARD_DOMAIN=no
  LLMNR=yes
  DHCP_LEASE=/run/systemd/netif/leases/2

And the lease file (/run/systemd/netif/leases/2):

  # This is private data. Do not parse.
  ADDRESS=192.168.10.10
  NETMASK=255.255.255.0
  ROUTER=192.168.10.1
  SERVER_ADDRESS=192.168.10.1
  NEXT_SERVER=192.168.10.1
  DNS=192.168.10.1
  NTP=
  DOMAINNAME=xxxxx.lan
  HOSTNAME=stout
  CLIENTID=ff897524c100020000ab1156dbe94ec3ad23d8

2. Unplug the network cable. The log reads:

  eth0 : flags change: -LOWER_UP -RUNNING
  eth0 : lost carrier
  eth0 : DHCP lease lost
  eth0 : Setting transient hostname: ''

The lease file hasn't changed, and the link file now has `OPER_STATE=no-carrier' (but the rest of the file is identical).

3. Plug the network cable back in. The log reads:

  eth0 : flags change: +LOWER_UP +RUNNING
  eth0 : gained carrier
  eth0 : acquiring DHCPv4 lease
  eth0 : DHCPv4 address 192.168.10.10/24 via 192.168.10.1
  eth0 : Setting transient hostname: 'stout'
  eth0 : Updating address: 192.168.10.10/24 (valid for 12h)

However, the lease file was removed:

  # ls /run/systemd/netif/leases/2
  ls: cannot access /run/systemd/netif/leases/2: No such file or directory

And even though the link file still mentions the link as configured, it has lost its DNS configuration:

  # This is private data. Do not parse.
  ADMIN_STATE=configured
  OPER_STATE=routable
  NETWORK_FILE=/etc/systemd/network/eth0.network
  DNS=
  NTP=
  DOMAINS=
  WILDCARD_DOMAIN=no
  LLMNR=yes

From a quick look through the source code, I think I've identified a possible reason for this.

In fact, the link doesn't lose its CONFIGURED state when the carrier is lost. Then, when the cable is plugged back in, the function `link_update_flags' (in src/network/networkd-link.c) first gets called, which in turn calls `link_save'. At this point, the DHCP client was not restarted yet, and `link_save' thus deletes the former lease file, as per l.2313:

        if (link->dhcp_lease) {
                [...]
        } else
                unlink(link->lease_file);

Then the DHCP client is started and eventually obtains a new lease, at which point the function `link_client_handler' gets called. However, this function (l.492-493) reads:

        if (link->state != LINK_STATE_CONFIGURED)
                link_enter_configured(link);

Therefore, `link_enter_configured' (which is the one responsible for calling `link_save' after lease acquisition) never gets called, and the new lease file never gets created.
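
The sequence described above can be sketched as a small stand-alone model. This is only an illustration and not the networkd code: every type and function name below is hypothetical, the lease file is reduced to a boolean, and the ordering on carrier loss (the link is saved while the old lease is still held, then the lease is dropped) is an assumption chosen to match the observation in step 2 that the lease file is unchanged.

```c
#include <stdbool.h>

/* Toy model of the reported sequence. NOT networkd code: all names here are
 * hypothetical stand-ins for the functions quoted in the report. */

typedef enum { MODEL_STATE_OTHER, MODEL_STATE_CONFIGURED } ModelState;

typedef struct {
    ModelState state;
    bool has_lease;          /* stands in for link->dhcp_lease != NULL */
    bool lease_file_exists;  /* stands in for the file in /run/systemd/netif/leases */
} ModelLink;

/* Mirrors the quoted branch of link_save: write the lease file when a lease
 * is held, unlink it otherwise. */
void model_link_save(ModelLink *l) {
    l->lease_file_exists = l->has_lease;
}

void model_enter_configured(ModelLink *l) {
    l->state = MODEL_STATE_CONFIGURED;
    model_link_save(l);
}

/* Mirrors the quoted check in link_client_handler: the save only happens
 * when the link is not yet in the configured state. */
void model_lease_acquired(ModelLink *l) {
    l->has_lease = true;
    if (l->state != MODEL_STATE_CONFIGURED)
        model_enter_configured(l);
}

/* Assumed ordering: save first (lease still held), then drop the lease.
 * The state is left as CONFIGURED, as described in the report. */
void model_carrier_lost(ModelLink *l) {
    model_link_save(l);
    l->has_lease = false;
}

/* The flags update saves the link before the DHCP client has restarted,
 * so no lease is held and the lease file is removed. */
void model_carrier_gained(ModelLink *l) {
    model_link_save(l);
}

/* Replays steps 1 to 3 and reports whether the lease file exists at the end. */
bool model_lease_file_after_replug(void) {
    ModelLink l = { MODEL_STATE_OTHER, false, false };
    model_lease_acquired(&l);  /* step 1: lease obtained, file written */
    model_carrier_lost(&l);    /* step 2: file untouched */
    model_carrier_gained(&l);  /* step 3: file removed */
    model_lease_acquired(&l);  /* new lease, but no save is triggered */
    return l.lease_file_exists;
}
```

In this model, `model_lease_file_after_replug()` returns false: the second lease is held in memory, but nothing writes it back to disk.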

I don't know which fix should be applied:
- either mark the link as UNMANAGED upon carrier loss, and clean up the obsolete lease file,
- or, alternatively, allow `link_enter_configured' to be called even if the link is already in the CONFIGURED state.

The latter requires just a quick and dirty patch in the code, but the former sounds much more like the behaviour one might expect from networkd.
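
To compare the two options, here is a rough sketch on a toy model of the link. It is not a patch against networkd: all names are hypothetical, the lease file is reduced to a boolean, and the non-configured state is a generic placeholder rather than the actual UNMANAGED state. It only shows that, under these assumptions, either change leaves a lease file in place after a replug.

```c
#include <stdbool.h>

/* Toy model, NOT networkd code: every name here is hypothetical. */

typedef enum {
    TOY_FIX_NONE,                 /* behaviour as reported */
    TOY_FIX_RESET_ON_CARRIER_LOSS,/* first option: leave the configured state */
    TOY_FIX_SAVE_WHEN_CONFIGURED  /* second option: save even if already configured */
} ToyFix;

typedef enum { TOY_STATE_OTHER, TOY_STATE_CONFIGURED } ToyState;

typedef struct {
    ToyState state;
    bool has_lease;          /* a DHCP lease is currently held */
    bool lease_file_exists;  /* the lease file is present on disk */
} ToyLink;

/* Write the lease file when a lease is held, remove it otherwise. */
void toy_save(ToyLink *l) {
    l->lease_file_exists = l->has_lease;
}

void toy_carrier_lost(ToyLink *l, ToyFix fix) {
    toy_save(l);            /* assumed: saved while the old lease is still held */
    l->has_lease = false;
    if (fix == TOY_FIX_RESET_ON_CARRIER_LOSS) {
        l->state = TOY_STATE_OTHER;
        toy_save(l);        /* cleans up the obsolete lease file */
    }
}

void toy_carrier_gained(ToyLink *l) {
    toy_save(l);            /* runs before the DHCP client has a new lease */
}

void toy_lease_acquired(ToyLink *l, ToyFix fix) {
    l->has_lease = true;
    if (l->state != TOY_STATE_CONFIGURED) {
        l->state = TOY_STATE_CONFIGURED;
        toy_save(l);
    } else if (fix == TOY_FIX_SAVE_WHEN_CONFIGURED) {
        toy_save(l);
    }
}

/* Replays lease, unplug, replug, lease; reports whether the file exists. */
bool toy_lease_file_after_replug(ToyFix fix) {
    ToyLink l = { TOY_STATE_OTHER, false, false };
    toy_lease_acquired(&l, fix);
    toy_carrier_lost(&l, fix);
    toy_carrier_gained(&l);
    toy_lease_acquired(&l, fix);
    return l.lease_file_exists;
}
```

With `TOY_FIX_NONE` the function returns false, and with either of the two other values it returns true. The first option additionally removes the stale lease file while the cable is unplugged, which is the difference in behaviour discussed above.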

Kind regards,
Jérémie.

Comment 1 Lennart Poettering 2019-02-28 10:37:15 UTC
DHCP lease lifecycles are nowadays configurable in a much more fine-grained way. I am pretty sure this issue is fixed now. If it is still reproducible on current systemd versions, please file a new issue on GitHub.

