[systemd-devel] Upgraded multiple systems to systemd 249.3 and all had eth1 not started / configured
Amish
anon.amish at gmail.com
Wed Aug 18 06:54:13 UTC 2021
Hello
Further to my previous email:
I see that there is already an *extremely similar issue* reported on
July 12, 2021 and it has been fixed.
https://github.com/systemd/systemd/issues/20203
But I do not know whether this fix is included in systemd v249.3 (Arch Linux).
If it is, then the fix itself is what is breaking my system.
If it is not, then I can expect it to resolve my issue.
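
A first step is to confirm exactly which systemd build is installed. The
following is only a minimal sketch: it assumes the first line of
`systemctl --version` looks like `systemd 249 (249.3-1-arch)` (the package
suffix is an assumption), and `parse_systemd_version` is a hypothetical
helper name.

```shell
# Hypothetical helper: extract the full version string (for example
# "249.3-1-arch") from `systemctl --version` output passed as $1.
# Only the first line of the input is inspected.
parse_systemd_version() {
    printf '%s\n' "$1" | sed -n '1s/^systemd [0-9][0-9]* (\(.*\))$/\1/p'
}

# Usage on a live system:
#   parse_systemd_version "$(systemctl --version)"
```

Whether a given fix is part of that release would still have to be checked
against the release's git history; the commit for the fix is not named in
this thread.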
Regards,
Amish.
On 18/08/21 11:42 am, Amish wrote:
>
> Hello,
>
> Thank you for your reply.
>
> I understand that there can be a race.
>
> *But when I check logs, there is no race happening*.
>
> *Let us see and analyze the logs.*
>
> Stage 1:
> System boots, and kernel assigns eth0, eth1 and eth2 as interface names.
>
> Aug 18 09:17:13 kk kernel: e1000e 0000:00:1f.6 eth0: (PCI
> Express:2.5GT/s:Width x1) e0:d5:5e:8d:7f:2f
> Aug 18 09:17:13 kk kernel: e1000e 0000:00:1f.6 eth0: Intel(R) PRO/1000
> Network Connection
> Aug 18 09:17:13 kk kernel: e1000e 0000:00:1f.6 eth0: MAC: 13, PHY: 12,
> PBA No: FFFFFF-0FF
> Aug 18 09:17:13 kk kernel: 8139too 0000:04:00.0 eth1: RealTek RTL8139
> at 0x000000000e8fc9bb, 00:e0:4d:05:ee:a2, IRQ 19
> Aug 18 09:17:13 kk kernel: r8169 0000:02:00.0 eth2: RTL8168e/8111e,
> 50:3e:aa:05:2b:ca, XID 2c2, IRQ 129
> Aug 18 09:17:13 kk kernel: r8169 0000:02:00.0 eth2: jumbo features
> [frames: 9194 bytes, tx checksumming: ko]
>
> Stage 2:
> Now udev rules are triggered and the interfaces are renamed to
> tmpeth0, tmpeth2 and tmpeth1.
>
> Aug 18 09:17:13 kk kernel: 8139too 0000:04:00.0 tmpeth2: renamed from eth1
> Aug 18 09:17:13 kk kernel: e1000e 0000:00:1f.6 tmpeth0: renamed from eth0
> Aug 18 09:17:13 kk kernel: r8169 0000:02:00.0 tmpeth1: renamed from eth2
>
> Stage 3:
> Now my script is called and it renames interfaces to eth0, eth2 and eth1.
>
> Aug 18 09:17:13 kk kernel: e1000e 0000:00:1f.6 eth0: renamed from tmpeth0
> Aug 18 09:17:14 kk kernel: r8169 0000:02:00.0 eth1: renamed from tmpeth1
> Aug 18 09:17:14 kk kernel: 8139too 0000:04:00.0 eth2: renamed from tmpeth2
>
> Effectively, the original interfaces eth1 and eth2 are swapped, while
> eth0 remains eth0.
>
> All of this happened before systemd-networkd started; interface
> renaming was over by 09:17:14.
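
The rename timeline in Stages 2 and 3 can be pulled out of the kernel log
mechanically. This is a minimal sketch, assuming kernel messages of the
form "<driver> <pci-address> <new>: renamed from <old>" as quoted above;
`rename_pairs` is a hypothetical helper name.

```shell
# Hypothetical helper: read kernel log lines on stdin and print every
# interface rename as "old -> new". Lines that are not rename messages
# are ignored.
rename_pairs() {
    sed -n 's/.* \([^ :]*\): renamed from \([^ ]*\)$/\2 -> \1/p'
}

# Usage on a live system (kernel messages from the current boot):
#   journalctl -b -k | rename_pairs
```

Fed the Stage 2 and Stage 3 lines, this shows each interface going from
ethN to tmpethM and then to its final ethM name.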
>
> Stage 4:
> Now systemd-networkd starts, 2 seconds after all interfaces have been
> assigned their final names.
>
> Aug 18 09:17:16 kk systemd[1]: Starting Network Configuration...
> Aug 18 09:17:17 kk systemd-networkd[426]: lo: Link UP
> Aug 18 09:17:17 kk systemd-networkd[426]: lo: Gained carrier
> Aug 18 09:17:17 kk systemd-networkd[426]: Enumeration completed
> Aug 18 09:17:17 kk systemd[1]: Started Network Configuration.
> Aug 18 09:17:17 kk systemd-networkd[426]: eth2: Interface name change
> detected, renamed to eth1.
> Aug 18 09:17:17 kk systemd-networkd[426]: Could not process link
> message: File exists
> Aug 18 09:17:17 kk systemd-networkd[426]: eth1: Failed
> Aug 18 09:17:17 kk systemd-networkd[426]: eth1: Interface name change
> detected, renamed to eth2.
> Aug 18 09:17:17 kk systemd-networkd[426]: eth1: Interface name change
> detected, renamed to tmpeth2.
> Aug 18 09:17:17 kk systemd-networkd[426]: eth0: Interface name change
> detected, renamed to tmpeth0.
> Aug 18 09:17:17 kk systemd-networkd[426]: eth2: Interface name change
> detected, renamed to tmpeth1.
> Aug 18 09:17:17 kk systemd-networkd[426]: tmpeth0: Interface name
> change detected, renamed to eth0.
> Aug 18 09:17:17 kk systemd-networkd[426]: tmpeth1: Interface name
> change detected, renamed to eth1.
> Aug 18 09:17:17 kk systemd-networkd[426]: tmpeth2: Interface name
> change detected, renamed to eth2.
> Aug 18 09:17:17 kk systemd-networkd[426]: eth1: Link UP
> Aug 18 09:17:17 kk systemd-networkd[426]: eth0: Link UP
> Aug 18 09:17:20 kk systemd-networkd[426]: eth0: Gained carrier
>
> This is when eth0 and eth1 interfaces are up and configured by
> systemd-networkd but eth2 is down and not configured.
>
> *None of the .network configuration files match by interface names.
> They all match just by MAC address.*
>
> # sample .network file.
>
> [Match]
> MACAddress=e0:d5:5e:8d:7f:2f
> Type=ether
>
> [Network]
> IgnoreCarrierLoss=yes
> LinkLocalAddressing=no
> IPv6AcceptRA=no
> ConfigureWithoutCarrier=true
> Address=192.168.25.2/24
>
> The above error message, "eth1: Failed", was not shown by earlier
> versions of systemd.
>
> So the recent version of systemd-networkd is doing something different,
> and this is where something is going wrong.
>
> Stage 5: (my workaround for this issue)
> I wrote a new service file which restarts systemd-networkd after
> waiting for 10 seconds.
>
> Aug 18 09:17:27 kk systemd[1]: Stopping Network Configuration...
> Aug 18 09:17:27 kk systemd[1]: systemd-networkd.service: Deactivated
> successfully.
> Aug 18 09:17:27 kk systemd[1]: Stopped Network Configuration.
> Aug 18 09:17:27 kk systemd[1]: Starting Network Configuration...
> Aug 18 09:17:27 kk systemd-networkd[579]: eth1: Link UP
> Aug 18 09:17:27 kk systemd-networkd[579]: eth0: Link UP
> Aug 18 09:17:27 kk systemd-networkd[579]: eth0: Gained carrier
> Aug 18 09:17:27 kk systemd-networkd[579]: lo: Link UP
> Aug 18 09:17:27 kk systemd-networkd[579]: lo: Gained carrier
> Aug 18 09:17:27 kk systemd-networkd[579]: Enumeration completed
> Aug 18 09:17:27 kk systemd[1]: Started Network Configuration.
> Aug 18 09:17:27 kk systemd-networkd[579]: eth2: Link UP
> Aug 18 09:17:27 kk systemd-networkd[579]: eth2: Gained carrier
>
> All interfaces are now up and running as expected.
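
For reference, the Stage 5 workaround could look roughly like the unit
below. This is a sketch, not the unit actually used: the Description, the
ordering, the install target and the use of --no-block are all assumptions.

```ini
# Hypothetical unit, e.g. /etc/systemd/system/restart-networkd.service
[Unit]
Description=Restart systemd-networkd after a delay (workaround)
After=systemd-networkd.service

[Service]
Type=oneshot
# Wait 10 seconds, as described in Stage 5, then queue the restart.
ExecStartPre=/usr/bin/sleep 10
ExecStart=/usr/bin/systemctl --no-block restart systemd-networkd.service

[Install]
WantedBy=multi-user.target
```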
>
> Please check, as I do not believe that this issue is caused by a race.
> To me it looks like some logic change in systemd-networkd is causing
> the issue.
>
> Thank you and regards,
>
> Amish
>
>
> On 17/08/21 3:18 pm, Colin Guthrie wrote:
>> Hiya,
>>
>> As has been said, this is racy. "Sufficiently early" is just a hope,
>> rather than a guarantee. Perhaps something in the kernel made things
>> more or less efficient (try booting with the old kernel to see if it
>> helps, but as this is a race, it may only work some of the time). Or
>> perhaps some unit ordering changed to make this better? Perhaps udev
>> settle units have now been dropped and thus the boot is faster and
>> things happen in a more hotplug-oriented way? Lots of possibilities
>> for why this no longer works (and even before it definitely wasn't a
>> guaranteed or recommended approach).
>>
>> As has been said, you're best to pick a different "namespace" lan0
>> wan0 wan1 etc. if you can but if you can't change this due to some
>> legacy scripts, at least pick sufficiently high ethN numbers to stay
>> out of the way of the kernel, e.g. if you have three eth cards, then
>> pick your names starting from e.g. 5: eth5, eth6, eth7 and thus you
>> can avoid this dance with temporary names (although I'd still
>> recommend using different names altogether if you can).
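
The suggestion above, naming the interfaces outside the kernel's ethN
namespace directly from udev, might look like the following. It is a
sketch that reuses the rule form quoted further down in this thread; the
names lan0, wan0 and wan1 are taken from the suggestion, the MAC addresses
are placeholders, and which card gets which name is an assumption.

```udev
# One rule per line; each rule matches a card by its MAC address.
SUBSYSTEM=="net", ACTION=="add", ATTR{address}=="XX:XX:XX:XX:XX:XX", NAME="lan0"
SUBSYSTEM=="net", ACTION=="add", ATTR{address}=="XX:XX:XX:XX:XX:YY", NAME="wan0"
SUBSYSTEM=="net", ACTION=="add", ATTR{address}=="XX:XX:XX:XX:XX:ZZ", NAME="wan1"
```

With names like these, the temporary tmpeth* step and the later rename
script are no longer needed.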
>>
>> Hope that helps.
>>
>> Col
>>
>> Amish wrote on 16/08/2021 13:38:
>>>
>>> On 16/08/21 5:39 pm, Lennart Poettering wrote:
>>>> On Mo, 16.08.21 17:31, Amish (anon.amish at gmail.com) wrote:
>>>>
>>>>> On 16/08/21 5:25 pm, Lennart Poettering wrote:
>>>>>> On Mo, 16.08.21 16:09, Amish (anon.amish at gmail.com) wrote:
>>>>>>
>>>>>>> Some old scripts that we have expect interface names starting
>>>>>>> with eth. But
>>>>>>> those names are not predictable.
>>>>>>>
>>>>>>> So to get predictable names starting with eth*, first I
>>>>>>> temporarily rename
>>>>>>> all interface with tmpeth*. This is done via udev rules.
>>>>>>>
>>>>>>> SUBSYSTEM=="net", ACTION=="add", ATTR{address}=="XX:XX:XX:XX:XX:XX", NAME="tmpeth0"
>>>>>>> SUBSYSTEM=="net", ACTION=="add", ATTR{address}=="XX:XX:XX:XX:XX:YY", NAME="tmpeth1"
>>>>>>> SUBSYSTEM=="net", ACTION=="add", ATTR{address}=="XX:XX:XX:XX:XX:ZZ", NAME="tmpeth2"
>>>>>>>
>>>>>>> Then I have a small service (script) which runs before
>>>>>>> network-pre.target to
>>>>>>> convert these names back to eth*
>>>>>>>
>>>>>>> # Search for network interfaces with names starting with "tmpeth"
>>>>>>> # and rename them to "eth".
>>>>>>> /usr/bin/find /sys/class/net -maxdepth 1 -name "tmpeth[0-9]" \
>>>>>>>   -type l -printf "%f\n" | while read -r tmpiface; do
>>>>>>>     /usr/bin/ip link set dev "$tmpiface" \
>>>>>>>       name "$(echo "$tmpiface" | sed s/tmpeth/eth/)"
>>>>>>> done
>>>>>>>
>>>>>>> This ensures that I have predictable names starting with eth*.
>>>>>>> And it has been working fine for 2-3 years. Even with the current
>>>>>>> issue, name assignment is working fine.
>>>>>> This cannot work and is necessarily racy. Stay out of the ethXYZ
>>>>>> namespace, that's the kernel's namespace. Pick any other names,
>>>>>> i.e. "foobar0", "foobar1", but otherwise you just have a racy racy
>>>>>> mess, because the kernel might take the name whenever it pleases.
>>>>> No, I don't think this is a race, because my script runs after udev
>>>>> has finished assigning the interface names.
>>>> device probing can take any time it wants. there isn't a point in time
>>>> where everything is probed.
>>>
>>> These are internal PCI LAN cards. I believe these get probed (and
>>> named) sufficiently early.
>>>
>>> And then we can expect names assigned by Udev to remain same.
>>>
>>> And I can see in the logs that names are not changed after my script
>>> runs.
>>>
>>> Also, this has been working successfully for me for 2 or more years.
>>>
>>> But after today's update, something is breaking all the systems.
>>>
>>> Additionally, just now on another system I see eth2 (instead of eth1)
>>> being renamed to eth0.
>>>
>>> I just want to know what changed and where? (Kernel or Systemd?).
>>>
>>> *Another point: I have set ConfigureWithoutCarrier=yes in the
>>> .network files and all addresses are static IPs, so systemd-networkd
>>> should have configured the devices even if the links are not up. But
>>> it is not doing that anymore either after today's update.*
>>>
>>> Regards
>>>
>>> Amish.
>>>
>>>> Lennart
>>>>
>>>> --
>>>> Lennart Poettering, Berlin
>>
>>