[systemd-devel] One of my fundamental problems with systemd...

Kok, Auke-jan H auke-jan.h.kok at intel.com
Fri Oct 26 16:51:28 PDT 2012


On Fri, Oct 26, 2012 at 3:39 PM, Michael H. Warfield <mhw at wittsend.com> wrote:
> My most fundamental problem with systemd is its insistence in hiding and
> obfuscating errors in ways that makes debugging almost impossible.
> Almost every upgrade problem I've had in Fedora has been related to
> systemd's failure to provide comprehendable error messages to things
> like errors in fstab (#1 fsck up).
>
> The most recent problem has been an issue trying to get LXC containers
> to work.  The networking is not coming up in the container at boot.
> It's not starting.  What do I get?  I finally dug it out of the console
> barf.  A message that says this:
>
>  [FAILED] Failed to start LSB: Bring up/down networking.
>          See 'systemctl status network.service' for details.
>
> Ok fine...  So I get logged in and I run this...
>
> [root at alcove mhw]# systemctl status network.service
> network.service - LSB: Bring up/down networking
>           Loaded: loaded (/etc/rc.d/init.d/network)
>           Active: failed (Result: exit-code) since Wed, 24 Oct 2012 18:23:07 +0400; 1min 57s ago
>          Process: 91 ExecStart=/etc/rc.d/init.d/network start (code=exited, status=209/STDOUT)
>           CGroup: name=systemd:/system/lxc/Alcove/system/network.service
>
> Tells me nothing.  Does not tell me where the problem is...

On the contrary, it does quite clearly indicate where the problem is.

The information may be represented differently than you expect, but it
is clear on the status of the service - including the return value of
the execution.

> If I then try to manually start the network I get this...
>
> [root at alcove mhw]# service network start
> Starting network (via systemctl):  network[275]: Bringing up loopback interface:  ./network-functions: line 237: cd: /var/run/netreport: No such file or directory
> network[275]: [  OK  ]
> network[275]: Bringing up interface eth0:  ./network-functions: line 237: cd: /var/run/netreport: No such file or directory
> network[275]: [  OK  ]
> network[275]: touch: cannot touch `/var/lock/subsys/network': No such file or directory
>                                                            [  OK  ]
> OK...  This I can understand.  There are missing directories in /var/run
> and in /var/lock.  Don't tell me how that script should have done this
> or that or the other.  That's NOT the problem.  The problem is that
> systemd did not communicate back WHAT THE REAL PROBLEM WAS.  Why is it
> so difficult for systemd to respond with intelligent error message???
> The message said to run "systemctl status network.service" but that
> result was worthless.
>
> I'll now edit that startup script to fix this nonsense but it's pointing
> to a fundamental failure in systemd in communicating errors back to
> administrators in an actionable manner.

Try and be reasonable here. Here's what I read in your message:

>  [FAILED] Failed to start LSB: Bring up/down networking.
>          See 'systemctl status network.service' for details.

Ahh, so let's read that output:

> [root at alcove mhw]# systemctl status network.service
> network.service - LSB: Bring up/down networking

ok, just describes the service

>           Loaded: loaded (/etc/rc.d/init.d/network)

ok, it's loaded. it's a sysV init script. Maybe the script is old and
wasn't written for systemd?

>           Active: failed (Result: exit-code) since Wed, 24 Oct 2012 18:23:07 +0400; 1min 57s ago

hmm, EXIT CODE failure. It exited with a non-zero status.

>          Process: 91 ExecStart=/etc/rc.d/init.d/network start (code=exited, status=209/STDOUT)

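ok so, the process exited with an unusual status, 209. As far as I
know, the "STDOUT" label means that setting up standard output for the
process failed.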

>           CGroup: name=systemd:/system/lxc/Alcove/system/network.service

shrug, doesn't seem to matter.

...

Now, from just this I can conclude that your `/etc/rc.d/init.d/network
start` produced an error.

How is that useless information? It is exactly what the status of the
network service is - failed, with an error code.

If I remember correctly, newer versions of systemd also print a short
'tail' of the service's log below the status, which helps when the
service has failed. It looks like this:

# systemctl status connman.service
connman.service - Connection service
	  Loaded: loaded (/etc/systemd/system/connman.service; disabled)
	  Active: inactive (dead)
	  CGroup: name=systemd:/system/connman.service

Oct 19 10:44:01 htpc connmand[3168]: connmand[3168]: Remove interface wlan0 [ wifi ]
Oct 19 10:44:01 htpc connmand[3168]: connmand[3168]: Remove interface eth1 [ ethernet ]
Oct 19 10:44:01 htpc connmand[3168]: connmand[3168]: eth0 {remove} index 2
Oct 19 10:44:01 htpc connmand[3168]: connmand[3168]: wlan0 {remove} index 3
Oct 19 10:44:01 htpc connmand[3168]: connmand[3168]: eth1 {remove} index 4
Oct 19 10:44:01 htpc connmand[3168]: connmand[3168]: Exit
Oct 19 10:44:01 htpc connmand[3168]: eth0 {remove} index 2
Oct 19 10:44:01 htpc connmand[3168]: wlan0 {remove} index 3
Oct 19 10:44:01 htpc connmand[3168]: eth1 {remove} index 4
Oct 19 10:44:01 htpc connmand[3168]: Exit

That tail should help. journalctl will also help you a lot, since it
can show the full log for the service.
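
As a rough sketch of the journalctl route, the helper below assembles a
journalctl command line for one unit. The function names are invented
for illustration. The -u, -n, -b and --no-pager options are the ones
current journalctl documents; older versions may not have all of them,
so treat the exact flags as an assumption to check against your man
page.

```python
import shutil
import subprocess


def journalctl_args(unit, lines=20, this_boot=True):
    """Build an argv list that shows the last `lines` log lines of `unit`."""
    args = ["journalctl", "--no-pager", "-u", unit, "-n", str(lines)]
    if this_boot:
        args.append("-b")  # restrict output to the current boot
    return args


def show_unit_log(unit):
    """Run journalctl for `unit`; return its output, or None if not installed."""
    if shutil.which("journalctl") is None:
        return None
    result = subprocess.run(journalctl_args(unit), capture_output=True, text=True)
    return result.stdout


# Print the command that would be run for the unit from this thread.
print(" ".join(journalctl_args("network.service")))
```

Calling show_unit_log("network.service") on the affected machine would
return whatever the journal holds for that unit.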

Cheers,

Auke

