[systemd-devel] systemd as Docker process manager (was: Docker, Supervisor and systemd)

Paul Menzel paulepanter at users.sourceforge.net
Sun Jul 19 13:38:06 PDT 2015


Dear Lennart,


a late thank you for your reply!


On Friday, 21.02.2014 at 19:39 +0100, Lennart Poettering wrote:
> On Thu, 20.02.14 23:25, Paul Menzel wrote:
> > 
> > Docker, “an open-source project to easily create lightweight, 
> > portable, self-sufficient containers from any application”, [1] 
> > mostly recommends to use Supervisor [2] to control the processes to 
> > be run in the container, like starting and restarting them and 
> > logging the output.
> > Actually all things systemd also does to my knowledge. Supervisor 
> > also needs a configuration file for each process, which it should 
> > start.
> > 
> > Has somebody experiences to use systemd for that? Or is there a 
> > reason why systemd should not be used for that?
> 
> systemd should work fine for that. I figure systemd is not yet
> everywhere hence they suggest an option you can install everywhere...
> 
> I had a look at the configuration file language of supervisord. There
> appears to be nothing interesting we couldn't do already. I mean, 
> there are certain differences, for example they have an XMLRPC API, 
> while ours is via D-Bus, but other than that I don't see much... They 
> have some fcgi hookup, but I don't grok that, and I figure we already 
> can do kinda the same with socket activation, but dunno...

I finally looked into this topic again, aiming to use systemd as the
process manager inside a Docker container. I found Dan Walsh’s great post
*Running systemd within a docker container* [1][2], written two months
after my message to the systemd-devel list. It discusses several bugs and
other issues, most of which are solved as of Docker 1.6, so systemd can
now be used as the process manager without many problems.

To my knowledge, two issues are left. I list them below and add two more
general questions.

1. It seems(?) that the capability `SYS_ADMIN` has to be granted to the
Docker container.

    --cap-add SYS_ADMIN

Otherwise, tools like `systemctl` do not work and fail with the message
that D-Bus is not available.

The manual page *capabilities* (`man 7 capabilities` [3]) shows that this
capability covers quite an extensive list of operations.

       CAP_SYS_ADMIN
              * Perform a range of system administration operations
                including: quotactl(2), mount(2), umount(2), swapon(2),
                setdomainname(2);
              * perform privileged syslog(2) operations (since Linux 2.6.37,
                CAP_SYSLOG should be used to permit such operations);
              * perform VM86_REQUEST_IRQ vm86(2) command;
              * perform IPC_SET and IPC_RMID operations on arbitrary System
                V IPC objects;
              * override RLIMIT_NPROC resource limit;
              * perform operations on trusted and security Extended
                Attributes (see xattr(7));
              * use lookup_dcookie(2);
              * use ioprio_set(2) to assign IOPRIO_CLASS_RT and (before
                Linux 2.6.25) IOPRIO_CLASS_IDLE I/O scheduling classes;
              * forge PID when passing socket credentials via UNIX domain
                sockets;
              * exceed /proc/sys/fs/file-max, the system-wide limit on the
                number of open files, in system calls that open files (e.g.,
                accept(2), execve(2), open(2), pipe(2));
              * employ CLONE_* flags that create new namespaces with
                clone(2) and unshare(2) (but, since Linux 3.8, creating user
                namespaces does not require any capability);
              * call perf_event_open(2);
              * access privileged perf event information;
              * call setns(2) (requires CAP_SYS_ADMIN in the target
                namespace);
              * call fanotify_init(2);
              * perform KEYCTL_CHOWN and KEYCTL_SETPERM keyctl(2)
                operations;
              * perform madvise(2) MADV_HWPOISON operation;
              * employ the TIOCSTI ioctl(2) to insert characters into the
                input queue of a terminal other than the caller's
                controlling terminal;
              * employ the obsolete nfsservctl(2) system call;
              * employ the obsolete bdflush(2) system call;
              * perform various privileged block-device ioctl(2) operations;
              * perform various privileged filesystem ioctl(2) operations;
              * perform administrative operations on many device drivers.

So a container with that capability won’t be that contained anymore.

Do you know of a way to run systemd within the container without adding
the capability SYS_ADMIN?
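
To check from inside a container whether the capability was actually
granted, the effective capability mask can be read from /proc. The
following is only a minimal Python sketch: the function names are made up
for illustration, and it assumes a Linux /proc file system, where
CAP_SYS_ADMIN is capability number 21.

```python
CAP_SYS_ADMIN = 21  # bit number of CAP_SYS_ADMIN in <linux/capability.h>


def has_capability(mask_hex, cap_bit=CAP_SYS_ADMIN):
    """Return True if bit `cap_bit` is set in a hexadecimal capability mask."""
    return bool((int(mask_hex, 16) >> cap_bit) & 1)


def effective_caps(status_path="/proc/self/status"):
    """Return the CapEff value (effective capability set) as a hex string.

    Hypothetical helper for this sketch; it only works where the kernel
    exposes a CapEff line in /proc/<pid>/status.
    """
    with open(status_path) as status:
        for line in status:
            if line.startswith("CapEff:"):
                return line.split()[1]
    raise RuntimeError("no CapEff line found in " + status_path)
```

Calling `has_capability(effective_caps())` inside the container should then
tell whether `--cap-add SYS_ADMIN` took effect for the calling process.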

2. systemd-docker [4][7]

Quoting the systemd-docker README.md [7]:

> Why I wrote this?
>
> The full context is in Docker Issue #6791 [5] and this mailing list
> thread [6]. The short of it is that systemd does not actually
> supervise the Docker container but instead the Docker client. This
> makes systemd incapable of reliably managing Docker containers
> without hitting a bunch of really odd situations.

Are there plans to solve this in systemd somehow, or is it a Docker issue
from systemd’s standpoint?

3. There must also have been some “container improvements” between
systemd 215 and 221. At least it seems that systemd 215, running in a
container with Debian Jessie/stable, starts unnecessary(?) processes
like systemd-udev, while that doesn’t happen with Debian
Stretch/testing and systemd 221 [8]. There, `systemctl status` and
`ps aux` show just the journal and the configured programs like Cron
and Rsyslog.

Is that just the equivalent of deleting the “wants targets” that Dan
talked about in his blog post [1][2], or did it take bigger code changes?
Would that be easy to backport to systemd 215?

4. Current developments

Is there an overview of the latest developments in this regard, like
logging and other things? So much is being created and developed that I
might have missed something.

For example, is there a parser for Docker Compose [9] configuration files
that creates systemd unit files from them?
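
To illustrate what I mean, here is a rough Python sketch of such a
generator. It is not an existing tool: the function name
`unit_from_service` and the restriction to the `image`, `ports`, `volumes`
and `environment` keys are my assumptions, and the input is an already
parsed Compose service given as a dictionary. Note that the generated unit
only supervises the `docker run` client, so it has exactly the limitation
described in item 2.

```python
import shlex


def unit_from_service(name, service):
    """Render a systemd unit that runs one Compose-style service via docker run.

    `service` is an already parsed service entry (a plain dict); only the
    keys image, ports, volumes and environment are handled in this sketch.
    """
    cmd = ["/usr/bin/docker", "run", "--rm", "--name", name]
    for port in service.get("ports", []):
        cmd += ["-p", str(port)]
    for volume in service.get("volumes", []):
        cmd += ["-v", str(volume)]
    for key, value in sorted(service.get("environment", {}).items()):
        cmd += ["-e", f"{key}={value}"]
    cmd.append(service["image"])

    # Shell-style quoting is only an approximation of systemd's own rules;
    # values containing "%" or "$" would need additional escaping.
    exec_start = " ".join(shlex.quote(arg) for arg in cmd)

    return "\n".join([
        "[Unit]",
        f"Description=Docker container {name}",
        "After=docker.service",
        "Requires=docker.service",
        "",
        "[Service]",
        f"ExecStart={exec_start}",
        f"ExecStop=/usr/bin/docker stop {name}",
        "Restart=always",
        "",
        "[Install]",
        "WantedBy=multi-user.target",
        "",
    ])
```

For instance, `unit_from_service("web", {"image": "nginx", "ports": ["8080:80"]})`
returns the text of a unit that could be saved as a `.service` file.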

Or should Docker be avoided anyway, because systemd-nspawn or a similar
tool achieves the same in a much easier fashion?


Thanks,

Paul


[1] https://developerblog.redhat.com/2014/05/05/running-systemd-within-docker-container
[2] https://rhatdan.wordpress.com/2014/04/30/running-systemd-within-a-docker-container
[3] http://man7.org/linux/man-pages/man7/capabilities.7.html
[4] http://container-solutions.com/running-docker-containers-with-systemd/
[5] https://github.com/docker/docker/issues/6791
[6] https://groups.google.com/d/topic/coreos-dev/wf7G6rA7Bf4/discussion
[7] https://github.com/ibuildthecloud/systemd-docker
[8] https://packages.debian.org/search?keywords=systemd
[9] https://docs.docker.com/compose/