[systemd-devel] Unable to run systemd in an LXC / cgroup container.

Michael H. Warfield mhw at WittsEnd.com
Sun Oct 21 14:25:05 PDT 2012


Hello,

This is being directed to the systemd-devel community but I'm cc'ing the
lxc-users community and the Fedora community on this for their input as
well.  I know it's not always good to cross post between multiple lists
but this is of interest to all three communities who may have valuable
input.

I'm new to this particular list, just having joined after tracking a
problem down to some systemd internals...

Several people over the last year or two on the lxc-users list have been
discussions trying to run certain distros (notably Fedora 16 and above,
recent Arch Linux and possibly others) in LXC containers, virualizing
entire servers this way.  This is very similar to Virtuoso / OpenVZ only
it's using the native Linux cgroups for the containers (primary reason I
dumped OpenVZ was to avoid their custom patched kernels).  These recent
distros have switched to systemd for the main init process and this has
proven to be disastrous for those of us using LXC and trying to install
or update our containers.

To put it bluntly, it doesn't work and causes all sorts of problems on
the host.

To summarize the problem...  The LXC startup binary sets up various
things for /dev and /dev/pts for the container to run properly and this
works perfectly fine for SystemV start-up scripts and/or Upstart.
Unfortunately, systemd has mounts of devtmpfs on /dev and devpts
on /dev/pts which then break things horribly.  This is because the
kernel currently lacks namespaces for devices and won't for some time to
come (in design).  When devtmpfs gets mounted over top of /dev in the
container, it then hijacks the hosts console tty and several other
devices which had been set up through bind mounts by LXC and should have
been LEFT ALONE.

Yes!  I recognize that this problem with devtmpfs and lack of namespaces
is a potential security problem anyways that could (and does) cause
serious container-to-host problems.  We're just not going to get that
fixed right away in the linux cgroups and namespaces.

How do we work around this problem in systemd where it has hard coded
mounts in the binary that we can't override or configure?  Or is it
there and I'm just missing it trying to examine the sources?  That's how
I found where the problem lay.

Regards,
Mike
-- 
Michael H. Warfield (AI4NB) | (770) 985-6132 |  mhw at WittsEnd.com
   /\/\|=mhw=|\/\/          | (678) 463-0932 |  http://www.wittsend.com/mhw/
   NIC whois: MHW9          | An optimist believes we live in the best of all
 PGP Key: 0x674627FF        | possible worlds.  A pessimist is sure of it!
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 482 bytes
Desc: This is a digitally signed message part
URL: <http://lists.freedesktop.org/archives/systemd-devel/attachments/20121021/a5600a29/attachment.pgp>


More information about the systemd-devel mailing list