[systemd-devel] Upgraded multiple systems to systemd 249.3 and all had eth1 not started / configured

Amish anon.amish at gmail.com
Wed Aug 18 06:12:55 UTC 2021


Hello,

Thank you for your reply.

I understand that there can be a race.

*But when I check the logs, no race is happening.*

*Let us see and analyze the logs.*

Stage 1:
The system boots, and the kernel assigns eth0, eth1 and eth2 as interface names.

Aug 18 09:17:13 kk kernel: e1000e 0000:00:1f.6 eth0: (PCI Express:2.5GT/s:Width x1) e0:d5:5e:8d:7f:2f
Aug 18 09:17:13 kk kernel: e1000e 0000:00:1f.6 eth0: Intel(R) PRO/1000 Network Connection
Aug 18 09:17:13 kk kernel: e1000e 0000:00:1f.6 eth0: MAC: 13, PHY: 12, PBA No: FFFFFF-0FF
Aug 18 09:17:13 kk kernel: 8139too 0000:04:00.0 eth1: RealTek RTL8139 at 0x000000000e8fc9bb, 00:e0:4d:05:ee:a2, IRQ 19
Aug 18 09:17:13 kk kernel: r8169 0000:02:00.0 eth2: RTL8168e/8111e, 50:3e:aa:05:2b:ca, XID 2c2, IRQ 129
Aug 18 09:17:13 kk kernel: r8169 0000:02:00.0 eth2: jumbo features [frames: 9194 bytes, tx checksumming: ko]

Stage 2:
Now the udev rules are triggered and the interfaces eth0, eth1 and eth2 
are renamed to tmpeth0, tmpeth2 and tmpeth1 respectively.

Aug 18 09:17:13 kk kernel: 8139too 0000:04:00.0 tmpeth2: renamed from eth1
Aug 18 09:17:13 kk kernel: e1000e 0000:00:1f.6 tmpeth0: renamed from eth0
Aug 18 09:17:13 kk kernel: r8169 0000:02:00.0 tmpeth1: renamed from eth2

Stage 3:
Now my script is called and it renames tmpeth0, tmpeth2 and tmpeth1 to 
eth0, eth2 and eth1 respectively.

Aug 18 09:17:13 kk kernel: e1000e 0000:00:1f.6 eth0: renamed from tmpeth0
Aug 18 09:17:14 kk kernel: r8169 0000:02:00.0 eth1: renamed from tmpeth1
Aug 18 09:17:14 kk kernel: 8139too 0000:04:00.0 eth2: renamed from tmpeth2

Effectively, the original eth1 and eth2 are swapped, while eth0 
remains eth0.

All of this happened before systemd-networkd started; interface 
renaming was over by 09:17:14.
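
For reference, the Stage 3 rename step can be sketched roughly as below. This is a simplified variant that uses a shell glob instead of find; the function names final_name and rename_tmp_ifaces are illustrative only, and the script has to run as root.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of any tool.

# Map a temporary name such as "tmpeth1" to its final name "eth1"
# by stripping the leading "tmp".
final_name() {
    printf '%s\n' "${1#tmp}"
}

# Rename every tmpethN interface found under /sys/class/net.
rename_tmp_ifaces() {
    for path in /sys/class/net/tmpeth[0-9]; do
        # If the glob matched nothing, the literal pattern is left over; skip it.
        [ -e "$path" ] || continue
        tmpiface=${path##*/}
        ip link set dev "$tmpiface" name "$(final_name "$tmpiface")"
    done
}
```

Calling rename_tmp_ifaces once, ordered before network-pre.target, corresponds to the renames seen in the kernel log above.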

Stage 4:
Now systemd-networkd starts, 2 seconds after all interfaces have been 
assigned their final names.

Aug 18 09:17:16 kk systemd[1]: Starting Network Configuration...
Aug 18 09:17:17 kk systemd-networkd[426]: lo: Link UP
Aug 18 09:17:17 kk systemd-networkd[426]: lo: Gained carrier
Aug 18 09:17:17 kk systemd-networkd[426]: Enumeration completed
Aug 18 09:17:17 kk systemd[1]: Started Network Configuration.
Aug 18 09:17:17 kk systemd-networkd[426]: eth2: Interface name change detected, renamed to eth1.
Aug 18 09:17:17 kk systemd-networkd[426]: Could not process link message: File exists
Aug 18 09:17:17 kk systemd-networkd[426]: eth1: Failed
Aug 18 09:17:17 kk systemd-networkd[426]: eth1: Interface name change detected, renamed to eth2.
Aug 18 09:17:17 kk systemd-networkd[426]: eth1: Interface name change detected, renamed to tmpeth2.
Aug 18 09:17:17 kk systemd-networkd[426]: eth0: Interface name change detected, renamed to tmpeth0.
Aug 18 09:17:17 kk systemd-networkd[426]: eth2: Interface name change detected, renamed to tmpeth1.
Aug 18 09:17:17 kk systemd-networkd[426]: tmpeth0: Interface name change detected, renamed to eth0.
Aug 18 09:17:17 kk systemd-networkd[426]: tmpeth1: Interface name change detected, renamed to eth1.
Aug 18 09:17:17 kk systemd-networkd[426]: tmpeth2: Interface name change detected, renamed to eth2.
Aug 18 09:17:17 kk systemd-networkd[426]: eth1: Link UP
Aug 18 09:17:17 kk systemd-networkd[426]: eth0: Link UP
Aug 18 09:17:20 kk systemd-networkd[426]: eth0: Gained carrier

At this point eth0 and eth1 are up and configured by systemd-networkd, 
but eth2 is down and not configured.
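
One way to confirm this state from a shell is to filter the output of networkctl. The sketch below assumes the usual five-column layout of `networkctl list --no-legend` (IDX LINK TYPE OPERATIONAL SETUP); the helper name unconfigured_links is made up for illustration.

```shell
#!/bin/sh
# Sketch only: assumes the columns IDX LINK TYPE OPERATIONAL SETUP.

# Read `networkctl list --no-legend` output on stdin and print the names
# of Ethernet links whose SETUP state is anything other than "configured".
unconfigured_links() {
    awk '$3 == "ether" && $5 != "configured" { print $2 }'
}

# Usage on a live system:
#   networkctl list --no-legend | unconfigured_links
```

For a single link, `networkctl status eth2` additionally shows which .network file, if any, was matched.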

*None of the .network configuration files match by interface name. They 
all match by MAC address only.*

# Sample .network file

[Match]
MACAddress=e0:d5:5e:8d:7f:2f
Type=ether

[Network]
IgnoreCarrierLoss=yes
LinkLocalAddressing=no
IPv6AcceptRA=no
ConfigureWithoutCarrier=true
Address=192.168.25.2/24

The error message "eth1: Failed" above did not appear with earlier 
versions of systemd.

So the recent version of systemd-networkd is doing something different, 
and this is where things go wrong.

Stage 5: (my workaround for this issue)
I wrote a new service file which restarts systemd-networkd after waiting 
for 10 seconds.
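
Roughly, the service does the equivalent of the following sketch. The function name delayed_restart is illustrative, and the unit file that invokes it is not reproduced here.

```shell
#!/bin/sh
# Sketch only: delayed_restart is a hypothetical helper name.

# Wait for the given number of seconds (default 10), then restart
# systemd-networkd so that it enumerates the links afresh.
delayed_restart() {
    sleep "${1:-10}"
    systemctl restart systemd-networkd.service
}

# Usage, e.g. from the ExecStart= script of a oneshot service:
#   delayed_restart 10
```

The delay is arbitrary; it only needs to be long enough for the renaming to have finished before systemd-networkd is restarted.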

Aug 18 09:17:27 kk systemd[1]: Stopping Network Configuration...
Aug 18 09:17:27 kk systemd[1]: systemd-networkd.service: Deactivated successfully.
Aug 18 09:17:27 kk systemd[1]: Stopped Network Configuration.
Aug 18 09:17:27 kk systemd[1]: Starting Network Configuration...
Aug 18 09:17:27 kk systemd-networkd[579]: eth1: Link UP
Aug 18 09:17:27 kk systemd-networkd[579]: eth0: Link UP
Aug 18 09:17:27 kk systemd-networkd[579]: eth0: Gained carrier
Aug 18 09:17:27 kk systemd-networkd[579]: lo: Link UP
Aug 18 09:17:27 kk systemd-networkd[579]: lo: Gained carrier
Aug 18 09:17:27 kk systemd-networkd[579]: Enumeration completed
Aug 18 09:17:27 kk systemd[1]: Started Network Configuration.
Aug 18 09:17:27 kk systemd-networkd[579]: eth2: Link UP
Aug 18 09:17:27 kk systemd-networkd[579]: eth2: Gained carrier

All interfaces are now up and running as expected.

Please check. I do not believe that a race is causing this issue; to me 
it looks like a logic change in systemd-networkd is the cause.

Thank you and regards,

Amish


On 17/08/21 3:18 pm, Colin Guthrie wrote:
> Hiya,
>
> As has been said, this is racy. "Sufficiently early" is just a hope, 
> rather than a guarantee. Perhaps something in the kernel made things 
> more or less efficient (try booting with the old kernel to see if it 
> helps, but as this is a race, it may only work some of the time.). Or 
> perhaps some unit ordering changed so make this better? Perhaps udev 
> settle units have now been dropped and thus the boot is faster and 
> things happen in a more hotplug oriented way? Lots of possibilities 
> for why this no longer works (and even before it definitely wasn't a 
> guaranteed or recommended approach).
>
> As has been said, you're best to pick a different "namespace" lan0 
> wan0 wan1 etc. if you can but if you can't change this due to some 
> legacy scripts, at least pick sufficiently high ethN numbers to stay 
> out of the way of the kernel, e.g. if you have three eth cards, then 
> pick your names starting from e.g. 5: eth5, eth6, eth7 and thus you 
> can avoid this dance with temporary names (although I'd still 
> recommend using different names altogether if you can).
>
> Hope that helps.
>
> Col
>
> Amish wrote on 16/08/2021 13:38:
>>
>> On 16/08/21 5:39 pm, Lennart Poettering wrote:
>>> On Mo, 16.08.21 17:31, Amish (anon.amish at gmail.com) wrote:
>>>
>>>> On 16/08/21 5:25 pm, Lennart Poettering wrote:
>>>>> On Mo, 16.08.21 16:09, Amish (anon.amish at gmail.com) wrote:
>>>>>
>>>>>> Some old scripts that we have expect interface names starting
>>>>>> with eth. But those names are not predictable.
>>>>>>
>>>>>> So to get predictable names starting with eth*, I first
>>>>>> temporarily rename all interfaces to tmpeth*. This is done via
>>>>>> udev rules.
>>>>>>
>>>>>> SUBSYSTEM=="net", ACTION=="add", ATTR{address}=="XX:XX:XX:XX:XX:XX", NAME="tmpeth0"
>>>>>> SUBSYSTEM=="net", ACTION=="add", ATTR{address}=="XX:XX:XX:XX:XX:YY", NAME="tmpeth1"
>>>>>> SUBSYSTEM=="net", ACTION=="add", ATTR{address}=="XX:XX:XX:XX:XX:ZZ", NAME="tmpeth2"
>>>>>>
>>>>>> Then I have a small service (script) which runs before
>>>>>> network-pre.target to convert these names back to eth*
>>>>>>
>>>>>> # Search for network interfaces with names starting with "tmpeth"
>>>>>> # and rename them to "eth".
>>>>>> /usr/bin/find /sys/class/net -maxdepth 1 -name "tmpeth[0-9]" -type l -printf "%f\n" |
>>>>>> while read tmpiface; do
>>>>>>     /usr/bin/ip link set dev "$tmpiface" name "$(echo $tmpiface | sed s/tmpeth/eth/)"
>>>>>> done
>>>>>>
>>>>>> This ensures that I have predictable names starting with eth*.
>>>>>> It has been working fine for 2-3 years. Even with the current
>>>>>> issue, name assignment is working fine.
>>>>> This cannot work and is necessarily racy. Stay out of the ethXYZ
>>>>> namespace, that's the kernel's namespace. Pick any other names,
>>>>> i.e. "foobar0", "foobar1", but otherwise you just have a racy racy
>>>>> mess, because the kernel might take the name whenever it pleases.
>>>> No, I don't think this is a race, because my script runs after udev
>>>> has finished assigning the interface names.
>>> device probing can take any time it wants. there isn't a point in time
>>> where everything is probed.
>>
>> These are internal PCI LAN cards. I believe these get probed (and 
>> named) sufficiently early.
>>
>> And then we can expect names assigned by Udev to remain same.
>>
>> And I can see in the logs that names are not changed after my script 
>> runs.
>>
>> Also this has been working successfully for me for 2 or more years.
>>
>> But after today's update, something is breaking all the systems.
>>
>> Additionally just now on other system I see eth2 (instead of eth1) 
>> being renamed to eth0.
>>
>> I just want to know what changed and where? (Kernel or Systemd?).
>>
>> *Another point: I have set ConfigureWithoutCarrier=yes in the 
>> network files and all are static IPs, so systemd-networkd should have 
>> configured the devices even if the links are not up. But it is not 
>> doing that anymore either after today's update.*
>>
>> Regards
>>
>> Amish.
>>
>>> Lennart
>>>
>>> -- 
>>> Lennart Poettering, Berlin
>
>