[systemd-devel] Howto run systemd within a linux container

Richard Weinberger richard.weinberger at gmail.com
Wed Feb 5 17:08:05 PST 2014


On Thu, Feb 6, 2014 at 1:08 AM, Kay Sievers <kay at vrfy.org> wrote:
> On Thu, Feb 6, 2014 at 12:56 AM, Lennart Poettering
> <lennart at poettering.net> wrote:
>> On Wed, 05.02.14 23:44, Richard Weinberger (richard.weinberger at gmail.com) wrote:
>
>>> We're heavily using Linux containers in our production environment.
>>> As modern Linux distributions move to systemd, we have to make sure that
>>> systemd works within our containers.
>>>
>>> Sadly we're facing issues with cgroups.
>>> Our testbed consists of openSUSE 13.1 with Linux 3.13.1 and libvirt 1.2.1.
>>>
>>> In a plain setup systemd stops immediately because it is unable to
>>> create the cgroup hierarchy.
>>> Mostly because the container uid 0 is in a user namespace and has no
>>> rights to do that.
>>
>> Make sure to either make the name=systemd cgroups hierarchy available in
>> the container, or to grant it CAP_SYS_ADMIN so that it can do it on its
>> own.
>>
>> Make sure that your container manager sets things up as described here:
>>
>> http://www.freedesktop.org/wiki/Software/systemd/ContainerInterface/
>>
>>> Next try: triggering the "Ingo Molnar" branch by mounting a tmpfs on
>>> /sys/fs/cgroup/ makes systemd segfault.
>>> Bug filed to https://bugs.freedesktop.org/show_bug.cgi?id=74589
>>
>> Yeah, this is never tested, and likely to break all the time. We
>> probably should remove this feature, since we cannot guarantee it works,
>> and apparently nobody has noticed that it has been broken for a while.
>
> Yeah, we should remove it now. We will never really be able to support
> that, init=/bin/sh is probably the better option than a systemd going
> crazy or crashing.
>
>>>          Starting Create dynamic rule for /dev/root link...
>>
>>    This is so bogus that it hurts ^^^^^^^
>
> Seems some distros cannot let bad ideas die. :)
>
>>> But is this tmpfs hack the correct way to run systemd in a container?
>>> I really don't think so.
>>
>> Nope. Please mount a tmpfs on /sys/fs/cgroup, and then the
>> name=systemd cgroup hierarchy on /sys/fs/cgroup/systemd, see above.
>
> User namespaces are involved and uid 0 is mapped to an ordinary user.
> Never tried, but the subtree in the container might need to be
> chown()ed to the mapped user.

As discussed on IRC, I'll try that tomorrow. :-)
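
For reference, the setup suggested above (a tmpfs on /sys/fs/cgroup, the
name=systemd hierarchy on /sys/fs/cgroup/systemd, then a chown() of the
container's subtree to the mapped user) could be sketched roughly as below.
This is only a sketch under assumptions: the function name, the uid/gid
100000 and the cgroup path "machine/mycontainer" are hypothetical, the chown
step is untested as noted above, and which step runs on the host side and
which inside the container's mount namespace depends on the container
manager. The script only prints the commands; it does not run them.

```python
import shlex

CGROUP_ROOT = "/sys/fs/cgroup"


def cgroup_setup_commands(mapped_uid, mapped_gid, container_cgroup,
                          root=CGROUP_ROOT):
    """Return the commands (as argv lists) for the setup discussed above.

    mapped_uid/mapped_gid: host-side ids that container uid/gid 0 map to
                           (hypothetical values, taken from the userns map).
    container_cgroup:      the container's cgroup path below the
                           name=systemd hierarchy (hypothetical layout).
    """
    systemd_dir = f"{root}/systemd"
    subtree = f"{systemd_dir}/{container_cgroup.strip('/')}"
    return [
        # 1. A tmpfs on /sys/fs/cgroup to hold the per-hierarchy mount points.
        ["mount", "-t", "tmpfs", "-o", "mode=755", "tmpfs", root],
        # 2. Mount point for the named hierarchy.
        ["mkdir", "-p", systemd_dir],
        # 3. The name=systemd hierarchy itself (no controllers attached).
        ["mount", "-t", "cgroup", "-o", "none,name=systemd", "cgroup",
         systemd_dir],
        # 4. Untested assumption: hand the container's subtree to the user
        #    that container uid 0 is mapped to.
        ["chown", "-R", f"{mapped_uid}:{mapped_gid}", subtree],
    ]


if __name__ == "__main__":
    for argv in cgroup_setup_commands(100000, 100000, "machine/mycontainer"):
        print(shlex.join(argv))
```

Printing instead of executing keeps the sketch safe to run as an ordinary
user; the real mounts need CAP_SYS_ADMIN.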

-- 
Thanks,
//richard

