[systemd-devel] Networkd doesn't create route for IP in different but connected net with GatewayOnLink=
Andrei Borzenkov
arvidjaar at gmail.com
Wed Jul 12 08:29:10 UTC 2023
On Wed, Jul 12, 2023 at 10:44 AM LunarLambda <lunarlambda at gmail.com> wrote:
>
> Hello,
>
> I was recently tasked with moving existing network configuration for a machine and some nspawn containers from ifupdown to networkd.
>
> The situation looks as follows:
>
> A single VPS has 3 IPs. One 37.x.x.x/22, and two 91.x.x.x/32. The 37-ip is to be routed to the main server, whereas the two 91-ips should be routed directly to nspawn containers running on the server. The server uses systemd 247 and the container uses systemd 252, both Debian.
>
> I created a bridge netdev like so:
>
> [NetDev]
> Name=br0
> Type=bridge
> # Matches physical network card
> MACAddress=AA:BB:CC:DD:EE:FF
>
> Bound the physical ethernet to it like so:
>
> [Match]
> Name=ens3
>
> [Network]
> Bridge=br0
>
> And set up the main IP for the bridge like so:
>
> [Match]
> Name=br0
>
> [Network]
> DNS=...
> DNS=...
> Address=37.x.x.x/22
> Gateway=37.x.x.1
>
> The nspawn containers are added to the bridge via
>
> [Network]
> Bridge=br0
>
> Up until this point everything works. However, configuring networking between the host and containers proved quite difficult and I'm unsure whether I'm doing something wrong or networkd is.
>
> What I tried was the following, inside the container:
>
> [Match]
> Virtualization=container
> Name=host0
>
> [Address]
> Address=91.x.x.x/32
>
> [Route]
> Gateway=37.x.x.x
> GatewayOnLink=true
>
> However, this did not create any usable routes to the host, nor did it throw any errors in the journal. Pinging the host does not work.
>
> Manually creating the routes with ip route did work:
>
> ip r add 37.x.x.x dev host0 onlink
> ip r add default dev host0 via 37.x.x.x
>
> I tried a variety of different combinations of options in the .network file, Scope, Type, etc...
>
> The only thing that successfully created any routes was the following:
>
> [Match]
> Virtualization=container
> Name=host0
>
> [Address]
> Address=91.x.x.x/32
> Peer=37.x.x.x/32
>
> [Network]
> Gateway=37.x.x.x
>
> This strikes me as odd because nowhere in the documentation, nor in any online searching could I find this described as necessary (beyond the manpage mentioning that Peer= exists)
>
How is your Linux container supposed to know that, to reach host
37.x.x.x, it needs to send a packet via the interface with address
91.x.x.x? That is not how Linux routing normally works. You must have
a routing entry that tells the kernel how to forward the packet, and
assigning address 91.x.x.x to your interface does not magically create
any route entry to the network 37.x.x.x. Adding a peer address is one
way to create such an entry. Another is creating the necessary routes
manually, as you did.
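
To make the point concrete, here is a rough Python sketch of a
longest-prefix route lookup. It is only a toy model, not how the kernel
is implemented. The addresses are placeholders from the documentation
ranges standing in for the masked 37.x.x.x and 91.x.x.x, and lookup()
is a made-up helper.

```python
import ipaddress

def lookup(routes, dst):
    """Return the most specific route matching dst, or None (unreachable)."""
    addr = ipaddress.ip_address(dst)
    matches = [r for r in routes if addr in ipaddress.ip_network(r[0])]
    return max(matches,
               key=lambda r: ipaddress.ip_network(r[0]).prefixlen,
               default=None)

# Placeholder addresses (assumptions, not the real ones):
HOST = "203.0.113.10"       # plays the role of 37.x.x.x
CONTAINER = "198.51.100.7"  # plays the role of 91.x.x.x

# Assigning CONTAINER/32 to host0 adds no route to any other address.
# (The kernel's local-table entry for the address itself is not modelled.)
only_address = []

# Roughly what the two manual "ip r add" commands install:
with_routes = [
    (HOST + "/32", "host0", None),  # the host is directly reachable on host0
    ("0.0.0.0/0", "host0", HOST),   # everything else goes via the host
]

print(lookup(only_address, HOST))       # None
print(lookup(with_routes, HOST))        # ('203.0.113.10/32', 'host0', None)
print(lookup(with_routes, "192.0.2.1")) # ('0.0.0.0/0', 'host0', '203.0.113.10')
```

With only the /32 address the lookup finds nothing, which corresponds
to "network unreachable". Once the host route and the default route
exist, both the host and everything behind it resolve to host0.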
> On the host side, I thought the bridge device, acting on Layer 2, would automatically figure out routes to the containers (via ARP),
A bridge (physical or virtual) has nothing to do with routing; it only
uses MAC addresses. The kernel uses ARP to find the L2 address for a
destination L3 address that is on the broadcast network, and that
happens well after the routing decision has been made. So the kernel
needs to know that network 37.x.x.x is directly reachable on the
broadcast segment to which the interface is connected before it even
attempts ARP. That is exactly what your "ip r add 37.x.x.x dev host0
onlink" does. An alternative is to specify a peer address, which
implicitly creates a similar routing entry (and the peer can be a
whole network).
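
The ordering (route lookup first, ARP second) can be sketched in
Python as follows. This is a simplified model under assumptions:
arp_target() is a hypothetical helper, and the addresses are
documentation-range placeholders for the masked ones.

```python
import ipaddress

def arp_target(routes, dst):
    """Return the address to resolve via ARP, or None when no route matches."""
    addr = ipaddress.ip_address(dst)
    best = None
    for prefix, gateway in routes:
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, gateway)
    if best is None:
        return None        # no route: ARP is never attempted
    return best[1] or dst  # gateway route: resolve the gateway; direct: resolve dst

HOST = "203.0.113.10"  # placeholder for 37.x.x.x (assumption)

no_routes = []
direct = [(HOST + "/32", None)]              # like "ip r add HOST dev host0"
with_default = direct + [("0.0.0.0/0", HOST)]  # plus a default route via HOST

print(arp_target(no_routes, HOST))            # None
print(arp_target(direct, HOST))               # 203.0.113.10
print(arp_target(with_default, "192.0.2.1"))  # 203.0.113.10
```

In the first case nothing is ever sent on the wire, so the bridge never
sees an ARP request it could forward.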
> or that nspawn and networkd would interact in some way to add routes. However, this didn't seem to happen, so I also had to add the following to the bridge's .network file:
>
> [Route]
> Source=37.x.x.x
> Destination=91.x.x.A
>
> [Route]
> Source=37.x.x.x
> Destination=91.x.x.B
>
Same as above. The host must know how to forward packets to the
addresses 91.x.x.x, and without routing entries nothing tells the host
how to do it. Routing is bidirectional; a container knowing how to
forward traffic to the host does not automatically imply that the host
knows how to forward traffic to the container.
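
A small Python sketch of the host side, under assumptions: the host's
table is guessed from the Address= and Gateway= lines of the bridge's
.network file, a /24 documentation prefix stands in for the real /22,
and lookup() is a made-up helper rather than anything the kernel or
networkd exposes.

```python
import ipaddress

def lookup(routes, dst):
    """Return the most specific route matching dst, or None."""
    addr = ipaddress.ip_address(dst)
    matches = [r for r in routes if addr in ipaddress.ip_network(r["dst"])]
    return max(matches,
               key=lambda r: ipaddress.ip_network(r["dst"]).prefixlen,
               default=None)

CONTAINER_A = "198.51.100.7"  # placeholder for 91.x.x.A (assumption)

# Assumed host table: connected network plus default route via the upstream gateway.
host_routes = [
    {"dst": "203.0.113.0/24", "dev": "br0", "via": None},
    {"dst": "0.0.0.0/0", "dev": "br0", "via": "203.0.113.1"},
]

before = lookup(host_routes, CONTAINER_A)

# Roughly the effect of a [Route] section with Destination=91.x.x.A on br0.
host_routes.append({"dst": CONTAINER_A + "/32", "dev": "br0", "via": None})
after = lookup(host_routes, CONTAINER_A)

print(before["via"])  # 203.0.113.1
print(after["via"])   # None
```

Before the extra entry, the only match for the container address is
the default route, so the host would hand the packet to the upstream
gateway. After it, the more specific /32 entry wins and the container
is treated as directly reachable on br0.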
> With all of this, everything works fine now. However, since the routes, both host-to-container and container-to-host, differ somewhat from the old (also working) setup,
Your working setup must have created the same routing entries because
otherwise it would not work. Care to show your old configuration?
> and some of the steps necessary I could not find described anywhere, I am left wondering if I fundamentally misunderstand something about how Linux networking works, or if perhaps networkd is behaving oddly because of the IP addresses being considered in different networks.
You misunderstand how IP networking works. Nothing in your description
is Linux-specific.
>
> I would love to find a conclusive answer to this, especially because I want to make sure I understood the fundamental concepts involved correctly.