[systemd-devel] Running systemd unprivileged in Docker container

Paul Menzel pmenzel+systemd-devel at molgen.mpg.de
Sat Jun 12 06:16:28 UTC 2021


Dear Johannes,


On 12.06.21 at 01:55, Johannes Ernst wrote:
> I can run a full Arch system (with systemd as PID 1) in a Docker container in Docker privileged mode:
>      sudo docker run -i -t --privileged archlinux /usr/lib/systemd/systemd
> but privileged mode is, well, a bit privileged. I believe I used to be able to tone this down with something like:
> 
>      sudo docker run -i -t --cap-add=ALL -v /sys/fs/cgroup:/sys/fs/cgroup:ro archlinux /usr/lib/systemd/systemd
> or even fewer capabilities than "all". But now I'm getting:
> 
>      systemd 248.3-2-arch running in system mode. (+PAM +AUDIT -SELINUX -APPARMOR -IMA +SMACK +SECCOMP +GCRYPT +GNUTLS +OPENSSL +ACL +BLKID +CURL +ELFUTILS +FIDO2 +IDN2 -IDN +IPTC +KMOD +LIBCRYPTSETUP +LIBFDISK +PCRE2 -PWQUALITY +P11KIT -QRENCODE +BZIP2 +LZ4 +XZ +ZLIB +ZSTD +XKBCOMMON +UTMP -SYSVINIT default-hierarchy=unified)
>      Detected virtualization docker.
>      Detected architecture x86-64.
>      Detected first boot.
> 
>      Welcome to Arch Linux!
> 
>      Initializing machine ID from random generator.
>      Failed to create /init.scope control group: Read-only file system
>      Failed to allocate manager object: Read-only file system
>      [!!!!!!] Failed to allocate manager object.
>      Exiting PID 1...
> I don't understand what that means. (Somebody likes exclamation marks.) What's the "manager object", and who is trying to allocate it?
> 
> Assuming that the "Read-only file system" in question is /sys/fs/cgroup, when I bind it into the container read-write I get this instead:
> 
>      Failed to create /init.scope control group: No such file or directory
>      Failed to allocate manager object: No such file or directory
> This long Serverfault thread <https://serverfault.com/questions/1053187/systemd-fails-to-run-in-a-docker-container-when-using-cgroupv2-cgroupns-priva> may be related? Are they saying it's broken? Can it be done?
> 
> Posted this earlier <https://bbs.archlinux.org/viewforum.php?id=23> in the Arch forum, lots of views, no answers.

There are some issues in the systemd issue tracker like *systemd 248 
broke read-only /sys/fs/cgroup mount in docker #19245* [1]. Do they 
apply to your problem?
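
To help answer that, here is a minimal sketch that could be run inside 
the container to see how the cgroup mount looks to PID 1. The helper 
name `check_cgroup_dir` and its output strings are made up for 
illustration; they are not part of systemd or Docker. It assumes that 
a `cgroup.controllers` file at the top of the mount indicates the 
unified (cgroup v2) hierarchy, and it tests writability by creating 
and removing a directory, since systemd has to create init.scope there.

```shell
# Hypothetical helper: report hierarchy type and writability of a cgroup mount.
check_cgroup_dir() {
    dir="${1:-/sys/fs/cgroup}"

    if [ ! -d "$dir" ]; then
        echo "missing"
        return 1
    fi

    # Assumption: cgroup.controllers at the top level means cgroup v2 (unified).
    if [ -e "$dir/cgroup.controllers" ]; then
        hierarchy="unified"
    else
        hierarchy="legacy-or-hybrid"
    fi

    # systemd must be able to create init.scope below this directory,
    # so probe by creating and removing a throwaway directory.
    probe="$dir/.probe.$$"
    if mkdir "$probe" 2>/dev/null; then
        rmdir "$probe"
        access="writable"
    else
        access="not-writable"
    fi

    echo "$hierarchy $access"
}
```

Calling `check_cgroup_dir` without arguments inspects /sys/fs/cgroup. 
A `not-writable` result would be consistent with the "Read-only file 
system" error quoted above.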

Also, an unprivileged Docker environment with systemd inside does not 
work well. The systemd folks say that Docker needs to implement the 
container interface [2][3], which it does not do. Other container 
managers, such as Podman [4], do.
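
As one small illustration of that interface: as I read [3], the 
container manager is expected to set the `container` environment 
variable for PID 1. The sketch below reads it from a NUL-separated 
environ file. The helper name `container_id_from_environ` is 
hypothetical, and reading /proc/1/environ normally requires root 
inside the container.

```shell
# Hypothetical helper: print the value of "container" from an environ file.
container_id_from_environ() {
    environ_file="${1:-/proc/1/environ}"

    # Bail out quietly if the file cannot be read (e.g. not running as root).
    [ -r "$environ_file" ] || return 1

    # environ files are NUL-separated; convert to lines, then pick container=...
    tr '\0' '\n' < "$environ_file" | sed -n 's/^container=//p'
}
```

An empty result from `container_id_from_environ` would suggest that the 
container manager did not set the variable. This checks only one item 
from the interface, not the cgroup setup.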


Kind regards,

Paul


[1]: https://github.com/systemd/systemd/issues/19245
[2]: https://github.com/systemd/systemd/issues/17320
[3]: https://systemd.io/CONTAINER_INTERFACE
[4]: https://podman.io/

