Bug 56929 - udev makes network interfaces available before it has renamed them
Status: RESOLVED INVALID
Product: systemd
Classification: Unclassified
Component: general
Version: unspecified
Hardware: Other All
Importance: medium normal
Assigned To: systemd-bugs
 
Reported: 2012-11-09 17:35 UTC by Daniel Drake
Modified: 2013-05-16 16:25 UTC (History)



Attachments
full journald log (77.21 KB, text/plain)
2012-11-09 17:35 UTC, Daniel Drake
journalctl output (26.48 KB, text/plain)
2012-11-11 14:16 UTC, Marcos Mello

Description Daniel Drake 2012-11-09 17:35:59 UTC
Created attachment 69830
full journald log

Fedora 18 (systemd-195-2.fc18) running on OLPC XO laptops.

We ship a udev rule that renames eth* network interfaces when they are found.

   KERNEL=="eth*", PROGRAM="olpc_eth_namer", NAME="%c"

This rule no longer works; the renames fail with this error:

   systemd-udevd[258]: error changing net interface name eth0 to eth1: Device or resource busy

The journalctl logs (attached) show that NetworkManager is activating the device before udev rules have run. Here is the trimmed sequence of events:

First the device appears and NM starts doing stuff with it, including bringing the interface up:

Nov 09 16:38:37 xo-93-20-8d.localdomain kernel: asix 1-1.2:1.0: eth0: register 'asix' at usb-d4208000.usb-1.2, ASIX AX88772 USB 2.0 Ethernet, 00:1c:49:01:05:e9
Nov 09 16:38:38 xo-93-20-8d.localdomain NetworkManager[371]: <info> (eth0): carrier is OFF
Nov 09 16:38:38 xo-93-20-8d.localdomain NetworkManager[371]: <error> [1352479118.338790] [nm-device-ethernet.c:454] update_permanent_hw_address(): (eth0): unable to read permanent MAC address (error 0)
Nov 09 16:38:38 xo-93-20-8d.localdomain NetworkManager[371]: <info> (eth0): new Ethernet device (driver: 'asix' ifindex: 2)
Nov 09 16:38:39 xo-93-20-8d.localdomain kernel: IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
Nov 09 16:38:39 xo-93-20-8d.localdomain NetworkManager[371]: <info> (eth0): preparing device.

Then my udev rule's PROGRAM runs, which outputs these lines to /dev/kmsg:

Nov 09 16:38:39 xo-93-20-8d.localdomain : eth namer
Nov 09 16:38:39 xo-93-20-8d.localdomain : eth namer eth1

It's trying to rename the new device to eth1. But that then fails, because the network interface is already up.

Nov 09 16:38:40 xo-93-20-8d.localdomain systemd-udevd[258]: error changing net interface name eth0 to eth1: Device or resource busy


I checked the NM code: it uses libgudev to learn about new network devices and about ones that are already present at startup. So this seems like a udev bug: it should not be advertising these devices to libgudev clients before udev itself has finished applying the rule-driven configuration.
Comment 1 Marcos Mello 2012-11-11 14:13:03 UTC
Similar problem here (Fedora 18, systemd-195-2.fc18.x86_64). I have an old machine with two network adapters and a 70-persistent-net.rules as follows:

ACTION=="add", SUBSYSTEM=="net", ATTR{address}=="XX:XX:XX:XX:XX:XX", NAME="eth0"
ACTION=="add", SUBSYSTEM=="net", ATTR{address}=="YY:YY:YY:YY:YY:YY", NAME="eth1"

These rules do not work anymore. The rename, when it is needed, fails.

systemd-udevd[282]: error changing net interface name eth0 to eth1: File exists
systemd-udevd[284]: error changing net interface name eth1 to eth0: File exists
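
The "File exists" errors are what a direct swap would produce: each rename targets a name that the other interface still holds. A small self-contained model of that behaviour, assuming a rename onto a name that is in use is simply refused with EEXIST (the table type and `model_rename` are hypothetical, for illustration only):

```c
#include <errno.h>
#include <string.h>

#define MODEL_MAX_IFS  4
#define MODEL_NAME_LEN 16   /* same size as the kernel's IFNAMSIZ */

/* Hypothetical stand-in for the set of interface names in use. */
struct model_iface_table {
    char names[MODEL_MAX_IFS][MODEL_NAME_LEN];
    int count;
};

/* Rename "from" to "to"; refuse if "to" is already taken. */
static int model_rename(struct model_iface_table *t,
                        const char *from, const char *to)
{
    int src = -1;

    if (strcmp(from, to) == 0)
        return 0;                 /* treated as a no-op in this model */
    for (int i = 0; i < t->count; i++) {
        if (strcmp(t->names[i], to) == 0)
            return -EEXIST;       /* reported as "File exists" */
        if (strcmp(t->names[i], from) == 0)
            src = i;
    }
    if (src < 0)
        return -ENODEV;           /* no interface called "from" */
    strncpy(t->names[src], to, MODEL_NAME_LEN - 1);
    t->names[src][MODEL_NAME_LEN - 1] = '\0';
    return 0;
}
```

In this model, with eth0 and eth1 both present, eth0 to eth1 and eth1 to eth0 both fail, while moving one of them to a name the kernel does not hand out (say lan0, an arbitrary choice) succeeds immediately.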

Lennart commented on fedora-devel that we should use biosdevname instead. I would be happy using it, but unfortunately this BIOS does not support SMBIOS 2.6, so I need the rules working.

journalctl output attached.
Comment 2 Marcos Mello 2012-11-11 14:16:27 UTC
Created attachment 69901
journalctl output
Comment 3 Kay Sievers 2012-11-11 14:38:54 UTC
Short version:
The rule will not work any more, and cannot be made to work with systemd.
The rules now need to use names that do not look like kernel names such as ethX.

Biosdevname *should* work fine, even without "BIOS support" in that
sense. It should be able to calculate a predictable name based
on the physical location of the hardware, at least if PCI/USB hardware
is used.

Long version:
We no longer support renaming network interfaces within the kernel
namespace. Interfaces must be given custom names that
can never clash with the kernel-created ones.

We do not support swapping names; we cannot win any race against
the kernel creating new interfaces at the same time.

We no longer support the creation of udev rules from inside the
hotplug path.

It was pretty naive to ever try this in the first place. It is a
problem that cannot be solved properly, and trying creates many more
new problems than it solves. The entire udev-based scheme for automatic
persistent network names is a long history of failures: it pretended to
be able to solve something, but it couldn't deliver. We have completely
stopped pretending now, and need to move on to something that
can work in a reliable and predictable manner.

Predictable network interface names require a tool like biosdevname,
or manually configured names, which do not use the kernel names.
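
As a rough illustration of "names which do not use the kernel names", a rule author could sanity-check a candidate name before using it. This is only a sketch covering the ethX pattern discussed in this bug; the function name is hypothetical, and other kernel prefixes (wlan and so on) would need the same treatment:

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* Returns true if "name" is "eth" followed by one or more digits,
 * i.e. a name the kernel itself might assign to a new interface. */
static bool looks_like_kernel_eth_name(const char *name)
{
    if (strncmp(name, "eth", 3) != 0 || name[3] == '\0')
        return false;
    for (const char *p = name + 3; *p != '\0'; p++) {
        if (!isdigit((unsigned char)*p))
            return false;
    }
    return true;
}
```

A target such as eth1 is flagged by this check, while something like lan0 or foo1 is not.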
Comment 4 Marcos Mello 2012-11-11 18:09:21 UTC
Well, it does not work here:

# biosdevname -d
BIOS device:
Kernel name: eth1
Permanent MAC: 00:16:76:8C:E9:04
Assigned MAC : 00:16:76:8C:E9:04
ifIndex: 3
Driver: 8139too
Driver version: 0.9.28
Firmware version:
Bus Info: 0000:02:02.0
PCI name      : 0000:02:02.0
PCI Slot      : Unknown
Index in slot: 0

BIOS device:
Kernel name: eth0
Permanent MAC: 1C:7E:E5:26:32:3A
Assigned MAC : 1C:7E:E5:26:32:3A
ifIndex: 2
Driver: r8169
Driver version: 2.3LK-NAPI
Firmware version:
Bus Info: 0000:02:03.0
PCI name      : 0000:02:03.0
PCI Slot      : Unknown
Index in slot: 0

biosdevname bug?
Comment 5 Marcos Mello 2012-11-11 18:57:58 UTC
So probably I have a crappy system like this:
https://bugzilla.redhat.com/show_bug.cgi?id=673267

For now I will use names that do not conflict with the ones used by the kernel.

Thanks for the tip, Kay.
Comment 6 Daniel Drake 2012-11-12 14:19:25 UTC
I imagine you have probably resolved Marcos's case above. But the original issue reported when I opened the bug still stands, and is not related to the target name being used. (I updated the rules to rename the device to foo1 to remove any doubt - still doesn't work)

In my case, the issue is that udev presents the device to NetworkManager before it has applied the relevant udev rules. NetworkManager immediately brings the device up (the equivalent of ifconfig eth0 up), so when udev tries to rename it shortly afterwards, the rename fails, because you can't rename an interface that is up. This is reproducible on every boot.
Comment 7 Daniel Drake 2012-11-26 17:33:12 UTC
I've investigated further:

On startup, NetworkManager starts listening for uevents via libgudev (to be informed of new network devices that get added later), and then enumerates all existing network devices:

	devices = g_udev_client_query_by_subsystem (priv->client, "net");
	for (iter = devices; iter; iter = g_list_next (iter)) {
		net_add (self, G_UDEV_DEVICE (iter->data));
	}

This is hitting a race. libudev's enumeration works directly with /sys, without consulting udevd. In this case, udevd has not finished reading/applying all the relevant rules to the device, but libudev finds it anyway, and hands it over to NetworkManager.
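
To illustrate why an enumeration that only looks at /sys can see a device too early, here is a small sketch that lists /sys/class/net directly. It is not how libudev is implemented, just a demonstration that the kernel exposes the interface in sysfs on its own, with nothing there saying whether udevd has finished with it (the function name is made up for this example):

```c
#include <dirent.h>
#include <stdio.h>

/* Prints every entry under /sys/class/net and returns how many
 * were found, or -1 if the directory cannot be opened. */
static int list_sysfs_net_entries(void)
{
    DIR *dir = opendir("/sys/class/net");
    if (dir == NULL)
        return -1;

    int count = 0;
    struct dirent *entry;
    while ((entry = readdir(dir)) != NULL) {
        if (entry->d_name[0] == '.')
            continue;             /* skip "." and ".." */
        printf("%s\n", entry->d_name);
        count++;
    }
    closedir(dir);
    return count;
}
```

An interface shows up in this listing as soon as the kernel registers it, regardless of where udevd is in its rule processing.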

I added some debug messages in NM and udevd and can confirm that the following is happening:

1. system boots
2. Network device is detected
3. NetworkManager starts, queries available network devices, finds the device
4. NetworkManager brings the network device up
5. udevd starts processing of the network device
6. udevd tries to rename the network device, which fails because it is in use
7. udevd announces presence of the network device to libudev listeners

I can't imagine this is the only race caused by the fact that libudev's enumeration doesn't seem to synchronise with udevd before presenting devices. What options do we have to solve this?
Comment 9 Daniel Drake 2012-11-26 18:58:42 UTC
Ah, that looks promising.

So, the NM enumeration part would start checking udev_device_get_is_initialized() before processing devices. If that returns 0, it would skip the device, on the basis that udev is still setting it up, and we should expect it to arrive via a uevent later.

Does that logic sound sensible?
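
The skip logic described above can be sketched without any udev dependency. Everything here is a stand-in: `struct model_device` and its `is_initialized` field are hypothetical and only mimic what udev_device_get_is_initialized() would report for a real device:

```c
#include <stddef.h>

/* Hypothetical stand-in for a udev device and its "initialized" state. */
struct model_device {
    const char *name;
    int is_initialized;   /* 0 while udevd is still setting it up */
};

/* Copies the names of initialized devices into "out" and returns how
 * many were kept. Uninitialized devices are skipped on the assumption
 * that they will be announced later via a uevent. */
static size_t collect_initialized(const struct model_device *devs,
                                  size_t n, const char **out)
{
    size_t kept = 0;
    for (size_t i = 0; i < n; i++) {
        if (!devs[i].is_initialized)
            continue;
        out[kept++] = devs[i].name;
    }
    return kept;
}
```

In this model, a device that udevd is still processing is left out of the startup pass, so nothing brings it up before the rename happens.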
Comment 10 Kay Sievers 2012-11-26 20:04:26 UTC
It could be that this is enough:
  g_udev_enumerator_add_match_is_initialized()
Comment 11 Daniel Drake 2012-11-26 20:10:46 UTC
Yes, and also the udev_enumerate_add_match_is_initialized() documentation agrees with the logic above.

https://mail.gnome.org/archives/networkmanager-list/2012-November/msg00175.html

Seems to have solved the problem. Glad that there wasn't a gaping hole after all. Thanks!