[systemd-devel] systemd as a docker process manager

Daniel Walsh dwalsh at redhat.com
Mon Oct 28 13:19:58 UTC 2019





On 10/27/19 11:50 PM, Jeff Solomon wrote:
> This is a followup to this thread:
>
> https://lists.freedesktop.org/archives/systemd-devel/2015-July/033585.html
>
> To see if there are any new developments.
>
> We have a multi-process application that already uses systemd
> successfully. Our customers want to put the application into a
> container and that container should be docker because that is what
> they use. We can't use systemd-nspawn or podman or whatever because
> our customers want to use docker because they are already using docker
> for other applications.
>
> I understand that containers are not a security technology but we want
> to find a solution that allows us to run systemd in a docker container
> that isn't blatantly less secure than systemd running outside of a
> container. I have yet to find a way.
>
> Fundamentally, the problem is that the systemd in the container
> requires read/write access to the host's /sys/fs/cgroup/systemd
> directory in order to function at all. Even if the container isn't
> privileged, it's necessary to mount the host's /sys/fs/cgroup
> directory inside the container and let the container write to it,
> which opens a security hole that doesn't exist when systemd is just
> run on the host. That hole is described here:
>
> https://blog.trailofbits.com/2019/07/19/understanding-docker-container-escapes/
>
> Using user namespaces doesn't help because then the container user
> wouldn't have permission to write to /sys/fs/cgroup/systemd.
>
> Our application runs as a non-root user. The security concern is that
> any user on the host who is in the docker group would be able to start
> a shell inside the container as "container root" and then be able to
> get root on the host. So basically membership in the docker group is
> equivalent to host root.
>
> Taking a step back - I wonder (mostly asking Lennart) if there is a
> way to run systemd without it needing access to
> /sys/fs/cgroup/systemd? I'm sure there isn't but I thought I would ask.
>
> Also, we actually use the systemd user service and only need to use a
> few of systemd's features related to process management (ExecStart,
> ExecStop, restart behavior, kill behavior, env vars).
>
> I don't care about 97% of systemd features (resource management,
> private whatevers). But the process management of systemd is the gold
> standard in my opinion and we already use it which is why I want to
> continue to use it inside of a container.
>
>
> Is there a way to run systemd's user service without it having the
> system systemd service as a parent?
>
> Like I said, I'm just looking for process management.
>
> Any thoughts or ideas Lennart? Thanks!
>
> Jeff

We just merged a patch into Podman to attempt to block this breakout.

https://github.com/containers/libpod/pull/4345


Would mounting /dev/null read-only
on /sys/fs/cgroup/systemd/release_agent block the escape?
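
To make the question concrete, here is a minimal sketch of how that mount
could be requested from docker. It only assembles a `docker run` command
line; it does not test whether the escape is actually blocked. It assumes
a cgroup v1 host whose /sys/fs/cgroup is bind-mounted into the container
(as described in the quoted message), so that the release_agent file
exists at that path. The image name and init command are hypothetical
placeholders.

```python
import shlex

# Path named in the question above (cgroup v1 "systemd" hierarchy).
RELEASE_AGENT = "/sys/fs/cgroup/systemd/release_agent"


def docker_run_args(image, command=("/sbin/init",)):
    """Build a `docker run` argv that masks release_agent with /dev/null.

    Assumption: the host's /sys/fs/cgroup is bind-mounted read/write into
    the container, and a read-only bind mount of /dev/null is then layered
    on top of the release_agent file.
    """
    cgroup_mount = "type=bind,source=/sys/fs/cgroup,target=/sys/fs/cgroup"
    mask_mount = f"type=bind,source=/dev/null,target={RELEASE_AGENT},readonly"
    return [
        "docker", "run", "--rm",
        "--mount", cgroup_mount,
        "--mount", mask_mount,
        image, *command,
    ]


if __name__ == "__main__":
    # "example/systemd-app" is a hypothetical image name.
    print(shlex.join(docker_run_args("example/systemd-app")))
```

Whether this is sufficient is exactly the open question: the mask only
covers this one path in this one hierarchy, and the release_agent files of
the other cgroup v1 hierarchies would need the same treatment.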
