[systemd-devel] One of my fundamental problems with systemd...
Michael H. Warfield
mhw at WittsEnd.com
Fri Oct 26 17:06:18 PDT 2012
On Fri, 2012-10-26 at 16:51 -0700, Kok, Auke-jan H wrote:
> On Fri, Oct 26, 2012 at 3:39 PM, Michael H. Warfield <mhw at wittsend.com> wrote:
> > My most fundamental problem with systemd is its insistence on hiding and
> > obfuscating errors in ways that make debugging almost impossible.
> > Almost every upgrade problem I've had in Fedora has been related to
> > systemd's failure to provide comprehensible error messages for things
> > like errors in fstab (#1 fsck up).
> >
> > The most recent problem has been an issue trying to get LXC containers
> > to work. The networking is not coming up in the container at boot.
> > It's not starting. What do I get? I finally dug it out of the console
> > barf. A message that says this:
> >
> > [FAILED] Failed to start LSB: Bring up/down networking.
> > See 'systemctl status network.service' for details.
> >
> > Ok fine... So I get logged in and I run this...
> >
> > [root@alcove mhw]# systemctl status network.service
> > network.service - LSB: Bring up/down networking
> > Loaded: loaded (/etc/rc.d/init.d/network)
> > Active: failed (Result: exit-code) since Wed, 24 Oct 2012 18:23:07 +0400; 1min 57s ago
> > Process: 91 ExecStart=/etc/rc.d/init.d/network start (code=exited, status=209/STDOUT)
> > CGroup: name=systemd:/system/lxc/Alcove/system/network.service
> >
> > Tells me nothing. Does not tell me where the problem is...
>
> On the contrary, it does quite clearly indicate where the problem is.
>
> The information may be represented differently than you expect, but it
> is clear on the status of the service - including the return value of
> the execution.
>
> > If I then try to manually start the network I get this...
> >
> > [root@alcove mhw]# service network start
> > Starting network (via systemctl): network[275]: Bringing up loopback interface: ./network-functions: line 237: cd: /var/run/netreport: No such file or directory
> > network[275]: [ OK ]
> > network[275]: Bringing up interface eth0: ./network-functions: line 237: cd: /var/run/netreport: No such file or directory
> > network[275]: [ OK ]
> > network[275]: touch: cannot touch `/var/lock/subsys/network': No such file or directory
> > [ OK ]
> >
> > OK... This I can understand. There are missing directories in /var/run
> > and in /var/lock. Don't tell me how that script should have done this
> > or that or the other. That's NOT the problem. The problem is that
> > systemd did not communicate back WHAT THE REAL PROBLEM WAS. Why is it
> > so difficult for systemd to respond with an intelligent error message???
> > The message said to run "systemctl status network.service" but that
> > result was worthless.
> >
> > I'll now edit that startup script to fix this nonsense but it's pointing
> > to a fundamental failure in systemd in communicating errors back to
> > administrators in an actionable manner.
>
> Try and be reasonable here. Here's what I read in your message:
>
> > [FAILED] Failed to start LSB: Bring up/down networking.
> > See 'systemctl status network.service' for details.
>
> Ahh, so let's read that output:
>
> > [root@alcove mhw]# systemctl status network.service
> > network.service - LSB: Bring up/down networking
>
> ok, just describes the service
>
> > Loaded: loaded (/etc/rc.d/init.d/network)
>
> ok, it's loaded. it's a sysV init script. Maybe the script is old and
> wasn't written for systemd?
>
> > Active: failed (Result: exit-code) since Wed, 24 Oct 2012 18:23:07 +0400; 1min 57s ago
>
> hmm, EXIT CODE failure. It exited with a non-zero status.
>
> > Process: 91 ExecStart=/etc/rc.d/init.d/network start (code=exited, status=209/STDOUT)
>
> ok so, the script returned a strange error code.
>
> > CGroup: name=systemd:/system/lxc/Alcove/system/network.service
>
> shrug, doesn't seem to matter.
>
> ...
>
> Now, from just this I can conclude that your `/etc/rc.d/init.d/network
> start` produced an error.
>
> How is that useless information? It is exactly what the status of the
> network service is - failed, with an error code.
>
> Now, from what I remember, newer versions of systemd produce a short
> 'tail' of the service's error log in case it fails, looking like this:
>
> # systemctl status connman.service
> connman.service - Connection service
> Loaded: loaded (/etc/systemd/system/connman.service; disabled)
> Active: inactive (dead)
> CGroup: name=systemd:/system/connman.service
>
> Oct 19 10:44:01 htpc connmand[3168]: connmand[3168]: Remove interface wlan0 [ wifi ]
> Oct 19 10:44:01 htpc connmand[3168]: connmand[3168]: Remove interface eth1 [ ethernet ]
> Oct 19 10:44:01 htpc connmand[3168]: connmand[3168]: eth0 {remove} index 2
> Oct 19 10:44:01 htpc connmand[3168]: connmand[3168]: wlan0 {remove} index 3
> Oct 19 10:44:01 htpc connmand[3168]: connmand[3168]: eth1 {remove} index 4
> Oct 19 10:44:01 htpc connmand[3168]: connmand[3168]: Exit
> Oct 19 10:44:01 htpc connmand[3168]: eth0 {remove} index 2
> Oct 19 10:44:01 htpc connmand[3168]: wlan0 {remove} index 3
> Oct 19 10:44:01 htpc connmand[3168]: eth1 {remove} index 4
> Oct 19 10:44:01 htpc connmand[3168]: Exit
>
> This should help. Obviously, journalctl should help you a lot as well.
The point is that you used to not have to be a diamond miner in a coal
mine to find this crap. When systemd drops you in emergency mode and
gives you no bloody clue why you ended up there, it's a massive
frustration that I have seen far too often at this point.

Of course the network script was not written for systemd; it was written
for sysvinit. NetworkMangler is a plague on servers that continues to
be a major PITA (No bridging? How the hell are we supposed to set this
stuff up?). You can't just whip this stuff up and then complain, well
it wasn't written to our standard. It's the support for those scripts
that's lacking, not the scripts themselves.
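
For reference, the only concrete clue in the status output quoted above is the "(code=exited, status=209/STDOUT)" part of the Process: line. Below is a minimal Python sketch that pulls those fields out and attaches a hint. The function name, the regular expression, and the hint table are all hypothetical and written for illustration; the hint texts are an assumption based on my recollection that systemd reserves statuses from 200 upward for failures in its own pre-exec setup, so check them against the documentation for your version. It only handles numeric statuses.

```python
import re

# Assumption: partial, from-memory table of systemd-specific exit statuses.
# Verify against the systemd documentation before relying on these hints.
SYSTEMD_EXIT_HINTS = {
    203: "EXEC: the command itself could not be executed",
    208: "STDIN: setting up standard input failed",
    209: "STDOUT: setting up standard output failed",
}

# Matches e.g. "(code=exited, status=209/STDOUT)"; the "/NAME" part is optional.
PROCESS_RE = re.compile(
    r"\(code=(?P<code>\w+), status=(?P<status>\d+)(?:/(?P<name>[\w-]+))?\)"
)


def explain_process_line(line):
    """Return (code, status, name, hint) for a 'Process:' line, or None."""
    match = PROCESS_RE.search(line)
    if match is None:
        return None
    status = int(match.group("status"))
    # hint is None when the status is not in the (partial) table above
    hint = SYSTEMD_EXIT_HINTS.get(status)
    return match.group("code"), status, match.group("name"), hint


if __name__ == "__main__":
    line = ("Process: 91 ExecStart=/etc/rc.d/init.d/network start "
            "(code=exited, status=209/STDOUT)")
    print(explain_process_line(line))
```

If that reading of 209 is right, the failure happened before the init script ran at all, which the status output alone does not spell out.
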
> Cheers,
> Auke
Regards,
Mike
--
Michael H. Warfield (AI4NB) | (770) 985-6132 | mhw at WittsEnd.com
/\/\|=mhw=|\/\/ | (678) 463-0932 | http://www.wittsend.com/mhw/
NIC whois: MHW9 | An optimist believes we live in the best of all
PGP Key: 0x674627FF | possible worlds. A pessimist is sure of it!