CoreOS-related bug filing:
Using cloud-init to set up network and netdev units for bonded interfaces does not work as expected, because networkd does not bring down interfaces that it did not set up itself.
This means that when a physical system with two NICs is netbooted with a plain CoreOS image, those interfaces are set up automatically during boot, by udev(?). When cloud-init then sets up a bond and restarts networkd, networkd does not bring down eno1 and eno2, so they cannot be enslaved to the bond and things fail to come up properly.
networkd could likely continue to ignore devices it knows nothing about, but assume that it is safe to bring down interfaces that it now has configuration for, even if it did not initially bring them up.
A bit more documentation:
It also appears that if the bond is set up to use DHCP, networkd does not wait long enough for link to be established after the interface is brought up before requesting a lease, and therefore the bond has no IP address. A second networkd restart works fine, since the interfaces are already up. (I am not completely sure whether this is a networkd issue or a Linux distribution issue.)
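To make the timing problem concrete, here is a minimal Python sketch of waiting for link before anything asks for a lease. It is not something networkd does, and it is not taken from the report. It assumes the kernel exposes link state in `/sys/class/net/<iface>/carrier`; the function names `read_carrier` and `wait_for_carrier`, the interface name `bond0`, and the timeout values are all hypothetical.

```python
import time
from pathlib import Path
from typing import Callable


def read_carrier(iface: str) -> bool:
    """Return True if sysfs reports carrier (link) on the interface."""
    try:
        text = Path(f"/sys/class/net/{iface}/carrier").read_text()
    except OSError:
        # The read fails if the interface is missing or administratively down.
        return False
    return text.strip() == "1"


def wait_for_carrier(
    iface: str,
    timeout: float = 10.0,
    interval: float = 0.5,
    probe: Callable[[str], bool] = read_carrier,
) -> bool:
    """Poll until the interface has link or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        if probe(iface):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A provisioning script could call `wait_for_carrier("bond0")` and only restart networkd a second time once it returns `True`, instead of restarting blindly.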
Hm, we probably should try to be more clever about this. However, could you explain how the interfaces come to be up in the first place? This sounds strange.
Sorry for the delay, I never got a notification of your reply.
I am pretty sure that the initial setup is done via udev.
I'm also impacted by this bug.
I don't think the reasons why the interfaces are up matter here. Take the case where someone configures an interface manually and then decides to use networkd; I think that case should be handled.
I'm not sure whether networkd should bring down the interfaces for which it has a new config when stopping, or whether it should ensure when starting that all the interfaces it will be configuring are down. But doing nothing about these interfaces and failing doesn't seem like a good solution.
To add a little more color to this issue, which also affects me: it is very common for the interfaces to be up when attempting to apply a new configuration, and many times it is the only way.
For example, a common server provisioning workflow might be:
DHCP on a single interface
TFTP and PXE boot into the OS
cloud-init writes the .network and .netdev files to the appropriate locations
Restart networking to come up with bonding
With this current issue the network restart doesn't create the bond as desired.
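For reference, the unit files written in the third step might look roughly like what the following Python sketch generates. This is an illustration, not the reporter's actual configuration: the bond name `bond0`, the `active-backup` mode, and the helper `bond_units` are assumptions, while `eno1` and `eno2` are the interface names from the original report.

```python
from typing import Dict, List


def bond_units(bond: str, members: List[str], mode: str = "active-backup") -> Dict[str, str]:
    """Return {filename: contents} for a bond and its member interfaces."""
    units = {
        # The .netdev file creates the virtual bond device.
        f"{bond}.netdev": (
            "[NetDev]\n"
            f"Name={bond}\n"
            "Kind=bond\n"
            "\n"
            "[Bond]\n"
            f"Mode={mode}\n"
        ),
        # The bond itself gets its address via DHCP.
        f"{bond}.network": (
            "[Match]\n"
            f"Name={bond}\n"
            "\n"
            "[Network]\n"
            "DHCP=yes\n"
        ),
    }
    # Each physical NIC is attached to the bond.
    for nic in members:
        units[f"{nic}.network"] = (
            "[Match]\n"
            f"Name={nic}\n"
            "\n"
            "[Network]\n"
            f"Bond={bond}\n"
        )
    return units


if __name__ == "__main__":
    for name, body in bond_units("bond0", ["eno1", "eno2"]).items():
        print(f"# /etc/systemd/network/{name}")
        print(body)
```

The member `.network` files are the ones that fail to take effect here, because eno1 and eno2 are already up when networkd is restarted.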
For now, as a workaround, I'm adding a tiny Bash script that brings the interfaces down during the provisioning sequence, but I would love to see a fix for this in the future. Thanks.
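The script itself is not shown in the thread. A guess at what such a workaround could look like, written here in Python rather than Bash, is to bring the member interfaces down with `ip link` and then restart networkd. The function names and the dry-run default are hypothetical; `eno1` and `eno2` are the interfaces named in the original report.

```python
import subprocess
from typing import List


def workaround_commands(members: List[str]) -> List[List[str]]:
    """Build the commands: down each member NIC, then restart networkd."""
    cmds = [["ip", "link", "set", "dev", nic, "down"] for nic in members]
    cmds.append(["systemctl", "restart", "systemd-networkd"])
    return cmds


def apply(members: List[str], dry_run: bool = True) -> None:
    """Print each command and, unless dry_run is set, execute it."""
    for cmd in workaround_commands(members):
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops at the first command that fails.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    apply(["eno1", "eno2"])  # dry run: only prints the commands
```

Running this over the only working network link will drop the connection, so it only makes sense from the provisioning sequence itself or from a console.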