[systemd-devel] systemd networking : problems with bridges

Charles Devereaux systemd at guylhem.net
Sun Jan 11 15:18:42 PST 2015


(oops, accidentally replied off-list. Reposting)

Hello

On Sat, Jan 3, 2015 at 6:58 AM, Andrei Borzenkov <arvidjaar at gmail.com>
wrote:

> So if I understand it correctly you want
>
> - configure one interface
> - start some program that establishes what you call "uplink". I presume
>   it results in one more interface appearing?
> - configure new "uplink" interface
>
> Am I right? Could you explain in some more details your setup?


You are right. The interface wlan0 is configured, and provides internet
connectivity to a machine having a br0 bridge.

br0 is bridging ap0 (a local access point) and vpn0 (uplink) so that
clients connected on this access point access the vpn transparently.

br0 should be configured automatically, along with the others. It just
should not prevent wait-online from reporting the system as online.

I saw an interesting discussion of the problem on
http://www.spinics.net/lists/netdev/msg172204.html

I agree with the OP's conclusion:
http://www.spinics.net/lists/netdev/msg174826.html
"The situation of a bridge is quite different from a physical ethernet
interface. The physical interface is usually connected to a switch and is
thus immediately up, even if no systems other than the switch are online.
In the case of a bridge on a virtualization host used to connect virtual
machines, the bridge only goes up after the first VM was started. In the
IPv6 SLAAC case, the host tries start its radvd on the bridge well before
the first VM comes up, the radvd barfs since the interface is not up, does
not come up, and the VMs are without network. In this case, it is needed to
force the bridge into an UP state earlier so that radvd can start."

My situation is indeed similar to setting up a bridge for virtual machines
only. Like lo, such bridges should be ignored by
systemd-networkd-wait-online under some conditions.

What about adding an option in networkd-wait-online-link.c that takes an
"ignore" parameter with a list of interface names, so that link_relevant
does something like:

        if (l->flags & IFF_LOOPBACK)
                return false;

        if (strv_contains(arg_ignore, l->ifname))
                return false;

At the moment, my dirty workaround is to have br0 hardcoded in
manager_all_configured from src/network/networkd-wait-online-manager.c:

if (!link_relevant(l) || STR_IN_SET(l->ifname, "br0")) {


> Yes, systemd-networkd-online will wait for all known links to become
> ready.
>

It should have an option to ignore some links, as not all of them are
relevant. It would be the opposite of -i, which specifies which links are
important (e.g. here it can be either wlan0 or eth0 depending on the
situation, while br0 is always to be ignored).

$ ip address show
(...)
8: br0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default

I have to find a way to make br0 appear "ready".


> This sounds like broken approach. If I follow your configuration
> correctly, br0 is *the* interface that provides your network
> connectivity so wait-online should wait until it is up and running.
>

Not exactly - wlan0 or eth0 provides network connectivity to the machine.
br0 (ap0 + vpn0) is for other computers.


> What is missing here is the ability to express dependency on individual
> interfaces. Alternatively support for callouts would help (start
> external helper to configure uplink as soon as physical interface is
> ready).


Indeed

> I think it was discussed and for now it is intentional - you need to
> restart networkd to pick configuration changes but networkd will not
> wipe clean existing configuration.
>

Too bad. That would also be an interesting option. At the moment there is
no way to automatically clean what networkd has done.

> It is up to client to renew address. May be networkd could optionally
> set very short lease time to force renewal.
>

That would be a great addition. It would fix the problem.

Alternatively, some way to stop the DHCP server on this interface would
help.

In case the suggested patch to systemd-networkd-wait-online isn't accepted,
I investigated an alternative: adding a dummy0 interface to br0, as
suggested in the thread linked above.

It is indeed better: br0 then has a carrier.
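
For reference, the dummy interface can be created and attached to the
bridge by networkd itself. This is only a minimal sketch of such a setup:
the two file names are assumptions, and the fragments below go into two
separate files.

```
# /etc/systemd/network/dummy0.netdev (file name is an assumption)
[NetDev]
Name=dummy0
Kind=dummy

# /etc/systemd/network/dummy0.network (file name is an assumption)
[Match]
Name=dummy0

[Network]
Bridge=br0
```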

However in networkctl it is still considered unconfigured:

$ networkctl
IDX LINK             TYPE               OPERATIONAL SETUP
(...)
  4 br0              ether              routable    configuring

I tested that further by adding a br1 with a similar setup, only without
any DHCP:

[Match]
Name=br1

[Network]
Address=192.168.200.224/28

$ networkctl
IDX LINK             TYPE               OPERATIONAL SETUP
(...)
  5 br1              ether              no-carrier  configured

This seems very similar to the bug reported before on
https://www.mail-archive.com/systemd-devel@lists.freedesktop.org/msg22414.html

Basically, the DHCP settings prevent the bridge from being considered
online. This would be logical if there was only a DHCP section, but if a
fixed IP is also configured, that should be enough to avoid this wrong
behaviour.

I traced the problem to link_client_handler in src/network/networkd-link.c:

        if (link_dhcp4_enabled(link) && !link->dhcp4_configured)
                return;

This can't just be commented out, because it's necessary for interfaces
using only DHCP.
Maybe it could become conditional on the interface not having an IP
address? That could be cleaner than adding the opposite of the -i option.
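
As a rough illustration of that conditional check, here is a standalone
sketch. It does not use the real networkd structures: struct
demo_link_state, its fields and demo_link_may_enter_configured are
hypothetical names, and "static_configured" is assumed to mean that at
least one fixed address has already been applied to the link.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, reduced view of the per-link state relevant here. */
struct demo_link_state {
        bool dhcp4_enabled;     /* DHCPv4 was requested for this link */
        bool dhcp4_configured;  /* a lease has been acquired and applied */
        bool static_configured; /* at least one fixed address was applied */
};

/* Returns true if the link may be reported as "configured". */
static bool demo_link_may_enter_configured(const struct demo_link_state *s) {
        assert(s);

        /* Wait for DHCP only when nothing else gave the link an address. */
        if (s->dhcp4_enabled && !s->dhcp4_configured && !s->static_configured)
                return false;

        return true;
}
```

A DHCP-only interface without a lease would still be held back, while an
interface with both DHCP and a fixed address (like br0 here) would be
reported as configured as soon as the fixed address is set.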

Also, for interfaces added to a bridge but without an IP address (e.g. ap0
from hostap), listing the link as "degraded" seems wrong.

$ networkctl
IDX LINK             TYPE               OPERATIONAL SETUP
(...)
  5 ap0              wlan               degraded    configured

Being part of the bridge is their normal setup. They can't have an IP
address.

PS: I've noticed this message in my logs, without much more details:
systemd-resolved : Assertion 'n > 0' failed at
src/resolve/resolved-dns-answer.c:28, function dns_answer_new(). Aborting

It does not seem to be a known problem, and I don't see why it would happen.
It looks like there were many DNS replies on the same link.

Charles