<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body>
<p>Hello</p>
<p>Further to my previous email:</p>
<p>I see that there is already an <b>extremely similar issue</b>
reported on July 12, 2021 and it has been fixed.<br>
<br>
<a class="moz-txt-link-freetext" href="https://github.com/systemd/systemd/issues/20203">https://github.com/systemd/systemd/issues/20203</a><br>
<br>
But I do not know whether this fix is included in systemd v249.3 (Arch
Linux).<br>
<br>
If it is included, that means the fix itself is breaking my system.<br>
And if it is not included, I can expect it to fix my
issue.<br>
</p>
<p>Regards,</p>
<p>Amish.<br>
</p>
<div class="moz-cite-prefix">On 18/08/21 11:42 am, Amish wrote:<br>
</div>
<blockquote type="cite"
cite="mid:1043b8c7-a453-ad5e-579a-8721a7fbcfd5@gmail.com">
<p>Hello,</p>
<p>Thank you for your reply.<br>
</p>
<p>I can understand that there can be a race.</p>
<p><b>But when I check logs, there is no race happening</b>.<br>
</p>
<p><b>Let us see and analyze the logs.</b></p>
<p>Stage 1:<br>
System boots, and kernel assigns eth0, eth1 and eth2 as
interface names.<br>
</p>
<p>Aug 18 09:17:13 kk kernel: e1000e 0000:00:1f.6 eth0: (PCI
Express:2.5GT/s:Width x1) e0:d5:5e:8d:7f:2f<br>
Aug 18 09:17:13 kk kernel: e1000e 0000:00:1f.6 eth0: Intel(R)
PRO/1000 Network Connection<br>
Aug 18 09:17:13 kk kernel: e1000e 0000:00:1f.6 eth0: MAC: 13,
PHY: 12, PBA No: FFFFFF-0FF<br>
Aug 18 09:17:13 kk kernel: 8139too 0000:04:00.0 eth1: RealTek
RTL8139 at 0x000000000e8fc9bb, 00:e0:4d:05:ee:a2, IRQ 19<br>
Aug 18 09:17:13 kk kernel: r8169 0000:02:00.0 eth2:
RTL8168e/8111e, 50:3e:aa:05:2b:ca, XID 2c2, IRQ 129<br>
Aug 18 09:17:13 kk kernel: r8169 0000:02:00.0 eth2: jumbo
features [frames: 9194 bytes, tx checksumming: ko]</p>
<p>Stage 2:<br>
Now udev rules are triggered and the interfaces are renamed to
tmpeth0, tmpeth2 and tmpeth1.<br>
</p>
<p>Aug 18 09:17:13 kk kernel: 8139too 0000:04:00.0 tmpeth2:
renamed from eth1<br>
Aug 18 09:17:13 kk kernel: e1000e 0000:00:1f.6 tmpeth0: renamed
from eth0<br>
Aug 18 09:17:13 kk kernel: r8169 0000:02:00.0 tmpeth1: renamed
from eth2</p>
<p>Stage 3:<br>
Now my script is called and it renames interfaces to eth0, eth2
and eth1.<br>
</p>
<p>Aug 18 09:17:13 kk kernel: e1000e 0000:00:1f.6 eth0: renamed
from tmpeth0<br>
Aug 18 09:17:14 kk kernel: r8169 0000:02:00.0 eth1: renamed from
tmpeth1<br>
Aug 18 09:17:14 kk kernel: 8139too 0000:04:00.0 eth2: renamed
from tmpeth2</p>
<p>Effectively, the original interfaces eth1 and eth2 are swapped, while
eth0 remains eth0.<br>
</p>
<p>All of this happened before systemd-networkd started; interface
renaming was over by 09:17:14.<br>
</p>
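<p>A minimal sketch of how the rename events can be pulled out of the
kernel log to double-check this ordering. The function name
rename_events is hypothetical (my own, not a systemd tool), and the two
sample lines are copied from the log in this mail.</p>
<pre>
```shell
#!/bin/sh
# Hypothetical helper: read kernel log lines on stdin and print
# "old-name new-name" for every "renamed from" event.
rename_events() {
    sed -n 's/.* \([^ ]*\): renamed from \([^ ]*\)$/\2 \1/p'
}

# Demo with two lines copied from the boot log quoted in this mail.
printf '%s\n' \
    'Aug 18 09:17:13 kk kernel: 8139too 0000:04:00.0 tmpeth2: renamed from eth1' \
    'Aug 18 09:17:13 kk kernel: e1000e 0000:00:1f.6 eth0: renamed from tmpeth0' |
    rename_events
```
</pre>
<p>On a live system the function could presumably be fed from
"journalctl -k -b" instead of the sample lines; lines that are not
rename events are silently dropped.</p>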
<p>Stage 4:<br>
Now systemd-networkd starts, 2 seconds after all interfaces have
been assigned their final names.</p>
<p>Aug 18 09:17:16 kk systemd[1]: Starting Network
Configuration...<br>
Aug 18 09:17:17 kk systemd-networkd[426]: lo: Link UP<br>
Aug 18 09:17:17 kk systemd-networkd[426]: lo: Gained carrier<br>
Aug 18 09:17:17 kk systemd-networkd[426]: Enumeration completed<br>
Aug 18 09:17:17 kk systemd[1]: Started Network Configuration.<br>
Aug 18 09:17:17 kk systemd-networkd[426]: eth2: Interface name
change detected, renamed to eth1.<br>
Aug 18 09:17:17 kk systemd-networkd[426]: Could not process link
message: File exists<br>
Aug 18 09:17:17 kk systemd-networkd[426]: eth1: Failed<br>
Aug 18 09:17:17 kk systemd-networkd[426]: eth1: Interface name
change detected, renamed to eth2.<br>
Aug 18 09:17:17 kk systemd-networkd[426]: eth1: Interface name
change detected, renamed to tmpeth2.<br>
Aug 18 09:17:17 kk systemd-networkd[426]: eth0: Interface name
change detected, renamed to tmpeth0.<br>
Aug 18 09:17:17 kk systemd-networkd[426]: eth2: Interface name
change detected, renamed to tmpeth1.<br>
Aug 18 09:17:17 kk systemd-networkd[426]: tmpeth0: Interface
name change detected, renamed to eth0.<br>
Aug 18 09:17:17 kk systemd-networkd[426]: tmpeth1: Interface
name change detected, renamed to eth1.<br>
Aug 18 09:17:17 kk systemd-networkd[426]: tmpeth2: Interface
name change detected, renamed to eth2.<br>
Aug 18 09:17:17 kk systemd-networkd[426]: eth1: Link UP<br>
Aug 18 09:17:17 kk systemd-networkd[426]: eth0: Link UP<br>
Aug 18 09:17:20 kk systemd-networkd[426]: eth0: Gained carrier<br>
</p>
<p>At this point the eth0 and eth1 interfaces are up and configured by
systemd-networkd, but eth2 is down and not configured.</p>
<p><b>None of the .network configuration files match by interface
names. They all match just by MAC address.<br>
<br>
</b># sample .network file.<br>
</p>
<p>[Match]<br>
MACAddress=e0:d5:5e:8d:7f:2f<br>
Type=ether<br>
<br>
[Network]<br>
IgnoreCarrierLoss=yes<br>
LinkLocalAddressing=no<br>
IPv6AcceptRA=no<br>
ConfigureWithoutCarrier=true<br>
Address=192.168.25.2/24<br>
<b><br>
</b></p>
<p>The above error message, "eth1: Failed", did not appear with earlier
versions of systemd.</p>
<p>So the recent version of systemd-networkd is doing something
different, and this is where something is going wrong.<br>
<br>
</p>
<p>Stage 5: (my workaround for this issue)<br>
I wrote a new service file which restarts systemd-networkd after
waiting for 10 seconds.</p>
<p>Aug 18 09:17:27 kk systemd[1]: Stopping Network
Configuration...<br>
Aug 18 09:17:27 kk systemd[1]: systemd-networkd.service:
Deactivated successfully.<br>
Aug 18 09:17:27 kk systemd[1]: Stopped Network Configuration.<br>
Aug 18 09:17:27 kk systemd[1]: Starting Network Configuration...<br>
Aug 18 09:17:27 kk systemd-networkd[579]: eth1: Link UP<br>
Aug 18 09:17:27 kk systemd-networkd[579]: eth0: Link UP<br>
Aug 18 09:17:27 kk systemd-networkd[579]: eth0: Gained carrier<br>
Aug 18 09:17:27 kk systemd-networkd[579]: lo: Link UP<br>
Aug 18 09:17:27 kk systemd-networkd[579]: lo: Gained carrier<br>
Aug 18 09:17:27 kk systemd-networkd[579]: Enumeration completed<br>
Aug 18 09:17:27 kk systemd[1]: Started Network Configuration.<br>
Aug 18 09:17:27 kk systemd-networkd[579]: eth2: Link UP<br>
Aug 18 09:17:27 kk systemd-networkd[579]: eth2: Gained carrier<br>
<br>
All interfaces are now up and running as expected.<br>
</p>
<p>Please check. I do not believe that this issue is caused by a
race; to me it looks like some logic change in
systemd-networkd is causing the issue.<br>
</p>
<p>Thank you and regards,<br>
</p>
<p>Amish</p>
<br>
<div class="moz-cite-prefix">On 17/08/21 3:18 pm, Colin Guthrie
wrote:<br>
</div>
<blockquote type="cite"
cite="mid:5b42e102-4fc0-bc8f-4f11-fb4edcdba26b@colin.guthr.ie">Hiya,
<br>
<br>
As has been said, this is racy. "Sufficiently early" is just a
hope, rather than a guarantee. Perhaps something in the kernel
made things more or less efficient (try booting with the old
kernel to see if it helps, but as this is a race, it may only
work some of the time.). Or perhaps some unit ordering changed
so make this better? Perhaps udev settle units have now been
dropped and thus the boot is faster and things happen in a more
hotplug oriented way? Lots of possibilities for why this no
longer works (and even before it definitely wasn't a guaranteed
or recommended approach). <br>
<br>
As has been said, you're best to pick a different "namespace"
(lan0, wan0, wan1, etc.) if you can, but if you can't change this due
to some legacy scripts, at least pick sufficiently high ethN
numbers to stay out of the way of the kernel, e.g. if you have
three eth cards, then pick your names starting from e.g. 5:
eth5, eth6, eth7 and thus you can avoid this dance with
temporary names (although I'd still recommend using different
names altogether if you can). <br>
<br>
Hope that helps. <br>
<br>
Col <br>
<br>
Amish wrote on 16/08/2021 13:38: <br>
<blockquote type="cite"> <br>
On 16/08/21 5:39 pm, Lennart Poettering wrote: <br>
<blockquote type="cite">On Mo, 16.08.21 17:31, Amish (<a
class="moz-txt-link-abbreviated"
href="mailto:anon.amish@gmail.com" moz-do-not-send="true">anon.amish@gmail.com</a>)
wrote: <br>
<br>
<blockquote type="cite">On 16/08/21 5:25 pm, Lennart
Poettering wrote: <br>
<blockquote type="cite">On Mo, 16.08.21 16:09, Amish (<a
class="moz-txt-link-abbreviated"
href="mailto:anon.amish@gmail.com"
moz-do-not-send="true">anon.amish@gmail.com</a>)
wrote: <br>
<br>
<blockquote type="cite">Some old scripts that we have
expect interface names starting with eth. But <br>
those names are not predictable. <br>
<br>
So to get predictable names starting with eth*, I first
temporarily rename <br>
all interfaces to tmpeth*. This is done via udev
rules. <br>
<br>
SUBSYSTEM=="net", ACTION=="add", ATTR{address}=="XX:XX:XX:XX:XX:XX", NAME="tmpeth0" <br>
SUBSYSTEM=="net", ACTION=="add", ATTR{address}=="XX:XX:XX:XX:XX:YY", NAME="tmpeth1" <br>
SUBSYSTEM=="net", ACTION=="add", ATTR{address}=="XX:XX:XX:XX:XX:ZZ", NAME="tmpeth2" <br>
<br>
Then I have a small service (script) which runs before
network-pre.target to <br>
convert these names back to eth* <br>
<br>
# Search for network interfaces whose names start with "tmpeth" and rename them to "eth". <br>
/usr/bin/find /sys/class/net -maxdepth 1 -name "tmpeth[0-9]" -type l -printf "%f\n" | <br>
while read -r tmpiface; do <br>
/usr/bin/ip link set dev "$tmpiface" name "$(echo "$tmpiface" | sed s/tmpeth/eth/)" <br>
done <br>
<br>
This ensures that I have predictable names starting
with eth*, and it has been <br>
working fine for 2-3 years. Even with the current issue,
name assignment is <br>
working fine. <br>
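<br>
A rough dry-run sketch of the same renaming step, in case it helps to
see the name mapping in isolation. The helper names final_name and
plan_renames are hypothetical and not part of my actual service; the
sketch only prints the ip commands and changes nothing. <br>
<pre>
```shell
#!/bin/sh
# Hypothetical helper: map a temporary name (tmpethN) to its final name (ethN).
final_name() {
    printf '%s\n' "eth${1#tmpeth}"
}

# Dry run: print the rename command for each tmpethN name given as an
# argument; any other interface name is ignored.
plan_renames() {
    for iface in "$@"; do
        case "$iface" in
            tmpeth[0-9]) echo "ip link set dev $iface name $(final_name "$iface")" ;;
        esac
    done
}

plan_renames tmpeth0 tmpeth1 tmpeth2 lo
```
</pre>
The mapping uses shell parameter expansion instead of sed, so no extra
process is needed per interface. <br>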
</blockquote>
This cannot work and is necessarily racy. Stay out of the
ethXYZ <br>
namespace, that's the kernel's namespace. Pick any other
names, <br>
i.e. "foobar0", "foobar1", but otherwise you just have a
racy racy <br>
mess, because the kernel might take the name whenever it
pleases. <br>
</blockquote>
No, I don't think this is a race. Because my script runs after
Udev has finished <br>
assigning the interfaces names. <br>
</blockquote>
device probing can take any time it wants. there isn't a
point in time <br>
where everything is probed. <br>
</blockquote>
<br>
These are internal PCI LAN cards. I believe these get probed
(and named) sufficiently early. <br>
<br>
And then we can expect names assigned by Udev to remain the same.
<br>
<br>
And I can see in the logs that names are not changed after my
script runs. <br>
<br>
Also, this has been working successfully for me for 2 or more
years. <br>
<br>
But after today's update, something is breaking all the
systems. <br>
<br>
Additionally, just now on another system I see eth2 (instead of
eth1) being renamed to eth0. <br>
<br>
I just want to know what changed and where? (Kernel or
Systemd?). <br>
<br>
*Also another point is, I have set ConfigureWithoutCarrier=yes
in network files and all are static IPs, so systemd-networkd
should have configured the devices even if links are not up.
But it's not doing that anymore either after today's update.* <br>
<br>
Regards <br>
<br>
Amish. <br>
<br>
<blockquote type="cite">Lennart <br>
<br>
-- <br>
Lennart Poettering, Berlin <br>
</blockquote>
</blockquote>
<br>
<br>
</blockquote>
</blockquote>
</body>
</html>