[systemd-devel] Error handling problems with systemd-machined

Lennart Poettering lennart at poettering.net
Wed Jul 24 08:59:48 PDT 2013


On Wed, 24.07.13 14:13, Daniel P. Berrange (berrange at redhat.com) wrote:

> I'm working on integrating libvirt with systemd-machined for cgroups
> setup and hitting a number of problems
> 
> The first was that v205 ignores all parameters passed through as scope
> properties in the DBus CreateMachine call. So I upgraded to v206 which
> seems to have fixed that.

Yes, this is fixed in 206, please only use that, the array was indeed
completely ignored in 205.

BTW, I added this a few days ago:

http://www.freedesktop.org/wiki/Software/systemd/machined/

and this:

http://www.freedesktop.org/wiki/Software/systemd/writing-vm-managers/

Still missing is documentation for the bus API to create
scopes/transient units and change their parameters.
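
Until that documentation exists, the StartTransientUnit call visible in
the strace output quoted further down hints at the argument layout
(signature "ssa(sv)": unit name, mode, list of properties). A minimal
sketch in plain Python, with no D-Bus library involved; the helper name
build_transient_scope_args and the tuple representation of variants are
illustrative assumptions, not systemd API:

```python
def build_transient_scope_args(unit, slice_name, description, pids, mode="fail"):
    """Assemble (name, mode, properties) mirroring the 'ssa(sv)' signature.

    Each property is (key, (dbus_type, value)), a stand-in for an (sv) struct.
    This is a hypothetical helper for illustration only.
    """
    # Unit names cannot contain '/', which is what "/machine.slice" ran into.
    for name in (unit, slice_name):
        if "/" in name:
            raise ValueError("Unit name %s is not valid." % name)
    properties = [
        ("Slice", ("s", slice_name)),
        ("Description", ("s", description)),
        ("PIDs", ("au", list(pids))),
    ]
    return (unit, mode, properties)
```

Checking the slice name on the client side like this would catch the
leading '/' before the call ever reaches systemd.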

> 
> When something goes wrong with the CreateMachine DBus call though all I
> ever seem to get back is  "Input/output error".
> 
> After strace'ing systemd-machined I find the real error
> 
> recvmsg(5, {msg_name(0)=NULL, msg_iov(1)=[{"l\1\0\1\334\0\0\0\2\0\0\0\277\0\0\0\1\1o\0\31\0\0\0/org/freedesktop/machine1\0\0\0\0\0\0\0\6\1s\0\30\0\0\0org.freedesktop.machine1\0\0\0\0\0\0\0\0\2\1s\0 \0\0\0org.freedesktop.machine1.Manager\0\0\0\0\0\0\0\0\3\1s\0\r\0\0\0CreateMachine\0\0\0\10\1g\0\fsayssusa(sv)\0\0\0\0\0\0\0\7\1s\0\6\0\0\0:1.130\0\0\t\0\0\0lxc-busy2\0\0\0\20\0\0\0\335\247\271G\10F\27Y(s\0177]\367\327\353\v\0\0\0libvirt-lxc\0\t\0\0\0container\0\0\0\210:\0\0\0\0\0\0\0\0\0\0\204\0\0\0\0\0\0\0\5\0\0\0Slice\0\1s\0\0\0\0\16\0\0\0/machine.slice\0\0\0\0\0\0\r\0\0\0CPUAccounting\0\1b\0\0\0\0\1\0\0\0\0\0\0\0\21\0\0\0BlockIOAccounting\0\1b\0\0\0\0\1\0\0\0\20\0\0\0MemoryAccounting\0\1b\0\1\0\0\0", 2048}], msg_controllen=0, msg_flags=MSG_CMSG_CLOEXEC}, MSG_CMSG_CLOEXEC) = 428
> sendmsg(5, {msg_name(0)=NULL, msg_iov(2)=[{"l\1\0\1D\1\0\0\n\0\0\0\255\0\0\0\1\1o\0\31\0\0\0/org/freedesktop/systemd1\0\0\0\0\0\0\0\6\1s\0\30\0\0\0org.freedesktop.systemd1\0\0\0\0\0\0\0\0\2\1s\0 \0\0\0org.freedesktop.systemd1.Manager\0\0\0\0\0\0\0\0\3\1s\0\22\0\0\0StartTransientUnit\0\0\0\0\0\0\10\1g\0\7ssa(sv)\0\0\0\0", 192}, {"\32\0\0\0machine-lxc\\x2dbusy2.scope\0\0\4\0\0\0fail\0\0\0\0\24\1\0\0\5\0\0\0Slice\0\1s\0\0\0\0\r\0\0\0machine.slice\0\0\0\0\0\0\0\v\0\0\0Description\0\1s\0\0\23\0\0\0Container lxc-busy2\0\0\0\0\0\17\0\0\0TimeoutStopUSec\0\1t\0\0 \241\7\0\0\0\0\0\4\0\0\0PIDs\0\2au\0\0\0\0\4\0\0\0\210:\0\0\5\0\0\0Slice\0\1s\0\0\0\0\16\0\0\0/machine.slice\0\0\0\0\0\0\r\0\0\0CPUAccounting\0\1b\0\0\0\0\1\0\0\0\0\0\0\0\21\0\0\0BlockIOAccounting\0\1b\0\0\0\0\1\0\0\0\20\0\0\0MemoryAccounting\0\1b\0\1\0\0\0", 324}], msg_controllen=0, msg_flags=0}, MSG_NOSIGNAL) = 516
> recvmsg(5, {msg_name(0)=NULL, msg_iov(1)=[{"l\3\1\1+\0\0\0\265\1\0\0]\0\0\0\6\1s\0\6\0\0\0:1.126\0\0\4\1s\0$\0\0\0org.freedesktop.systemd1.InvalidName\0\0\0\0\5\1u\0\n\0\0\0\10\1g\0\1s\0\0\7\1s\0\4\0\0\0:1.1\0\0\0\0&\0\0\0Unit name /machine.slice is not valid.\0", 2048}], msg_controllen=0, msg_flags=MSG_CMSG_CLOEXEC}, MSG_CMSG_CLOEXEC) = 155
> sendmsg(3, {msg_name(0)=NULL, msg_iov(4)=[{"PRIORITY=3\nSYSLOG_FACILITY=4\nCODE_FILE=src/machine/machine.c\nCODE_LINE=246\nCODE_FUNCTION=machine_start_scope\nSYSLOG_IDENTIFIER=systemd-machined\n", 144}, {"MESSAGE=", 8}, {"Failed to start machine scope: Unit name /machine.slice is not valid.", 69}, {"\n", 1}], msg_controllen=0, msg_flags=0}, MSG_NOSIGNAL) = 222
> sendmsg(5, {msg_name(0)=NULL, msg_iov(2)=[{"l\3\1\1\27\0\0\0\v\0\0\0O\0\0\0\6\1s\0\6\0\0\0:1.130\0\0\4\1s\0\"\0\0\0org.freedesktop.DBus.Error.IOError\0\0\0\0\0\0\5\1u\0\2\0\0\0\10\1g\0\1s\0\0", 96}, {"\22\0\0\0Input/output error\0", 23}], msg_controllen=0, msg_flags=0}, MSG_NOSIGNAL) = 119
> 
> 
> So machined is getting a useful error back from systemd
> 
>   "Unit name /machine.slice is not valid."
> 
> and syslog'ing that error, and then sending back the dbus client a useless
> "Input/output error" message :-(

Yeah, we really suck at handing out good errors. But usually you
should have gotten an (equally useless) EINVAL in most cases.
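
To make the difference concrete, here is a minimal sketch of the two
reply strategies, using the error names and messages from the strace
output above. BusError, reply_collapsed and reply_forwarded are
hypothetical names for illustration; this is not how machined is
actually structured:

```python
class BusError(Exception):
    """Hypothetical stand-in for a D-Bus error reply: error name plus message."""

    def __init__(self, name, message):
        super().__init__(message)
        self.name = name
        self.message = message


def reply_collapsed(err):
    # What the client sees in the trace: the detailed error is only logged,
    # and a generic I/O error travels back over the bus.
    return ("org.freedesktop.DBus.Error.IOError", "Input/output error")


def reply_forwarded(err):
    # Alternative: hand the original error name and message to the caller.
    return (err.name, err.message)


err = BusError("org.freedesktop.systemd1.InvalidName",
               "Unit name /machine.slice is not valid.")
print(reply_collapsed(err))
print(reply_forwarded(err))
```

With the forwarding variant the client could have spotted the bad slice
name without strace'ing the daemon.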

> Once I fixed the unit name by removing the leading '/', I hit a second
> error
> 
> recvmsg(5, {msg_name(0)=NULL, msg_iov(1)=[{"l\3\1\0014\0\0\0\301\1\0\0]\0\0\0\6\1s\0\6\0\0\0:1.126\0\0\4\1s\0#\0\0\0org.freedesktop.systemd1.UnitExists\0\0\0\0\0\5\1u\0\f\0\0\0\10\1g\0\1s\0\0\7\1s\0\4\0\0\0:1.1\0\0\0\0/\0\0\0Unit machine-lxc\\x2dbusy2.scope already exists.\0", 2048}], msg_controllen=0, msg_flags=MSG_CMSG_CLOEXEC}, MSG_CMSG_CLOEXEC) = 164
> 
>   "Unit machine-lxc\\x2dbusy2.scope already exists"
> 
> But neither "machinectl list" nor "systemctl --full" show any such machine
> or unit existing. It seems like when it reported the bogus slice name,
> it did not fully clean up the transient scope unit it created. This is
> then blocking further attempts to create the same transient scope.

Hmm, that's interesting. What does "systemctl status" say for the unit
in question when this happens? Could you paste?
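
Note that the unit name to query is the escaped one from the trace, not
the bare machine name. A rough sketch of how the scope name comes about,
assuming the usual unit name escaping ('/' becomes '-', anything outside
[a-zA-Z0-9:_.] becomes \xXX, as does a leading '.') and the "machine-"
prefix seen in the trace; the function names are illustrative only:

```python
ALLOWED = set("abcdefghijklmnopqrstuvwxyz"
              "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
              "0123456789:_.")


def unit_name_escape(text):
    """Escape a string for use inside a unit name (sketch, see assumptions)."""
    out = []
    for i, byte in enumerate(text.encode("utf-8")):
        ch = chr(byte)
        if ch == "/":
            out.append("-")
        elif ch in ALLOWED and not (i == 0 and ch == "."):
            out.append(ch)
        else:
            # Everything else, including '-', is hex-escaped.
            out.append("\\x%02x" % byte)
    return "".join(out)


def machine_scope_name(machine):
    # The "machine-" prefix and ".scope" suffix are taken from the trace.
    return "machine-" + unit_name_escape(machine) + ".scope"


print(machine_scope_name("lxc-busy2"))
```

When passing the result to systemctl from a shell, the name needs
quoting so that the backslash survives.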

Kay had some issues where the kernel's release_agent wouldn't be called
on recent kernels, but I never had issues with that...

Lennart

-- 
Lennart Poettering - Red Hat, Inc.

