<div dir="ltr"><div>This is a followup to this thread:</div><div><br></div><div>
<a href="https://lists.freedesktop.org/archives/systemd-devel/2015-July/033585.html">https://lists.freedesktop.org/archives/systemd-devel/2015-July/033585.html</a></div><div><br></div><div>I am following up to see whether there are any new developments.</div><div><br></div><div>We have a multi-process application that already uses systemd successfully. Our customers want to run the application in a container, and that container has to be Docker because that is what they already use for other applications. That rules out systemd-nspawn, podman, and similar alternatives.</div><div><br></div><div>I understand that containers are not a security technology, but we want to find a solution that allows us to run systemd in a Docker container without being blatantly less secure than systemd running outside of a container. I have yet to find a way.</div><div><br></div><div>Fundamentally, the problem is that the systemd in the container requires read/write access to the host's /sys/fs/cgroup/systemd directory in order to function at all. Even if the container isn't privileged, it's necessary to mount the host's /sys/fs/cgroup directory inside the container and let the container write to it, which opens a security hole that doesn't exist when systemd is just run on the host. That hole is described here:</div><div><br></div><div><a href="https://blog.trailofbits.com/2019/07/19/understanding-docker-container-escapes/">https://blog.trailofbits.com/2019/07/19/understanding-docker-container-escapes/</a></div><div><br></div><div>Using user namespaces doesn't help, because then the container user wouldn't have permission to write to /sys/fs/cgroup/systemd.</div><div><br></div><div>Our application runs as a non-root user. The security concern is that any user on the host who is in the docker group would be able to start a shell inside the container as "container root" and then be able to get root on the host. 
So basically, membership in the docker group is equivalent to host root.</div><div><br></div><div>Taking a step back - I wonder (mostly asking Lennart) if there is a way to run systemd without it needing access to /sys/fs/cgroup/systemd? I'm sure there isn't, but I thought I would ask.</div><div><br></div><div>Also, we actually use the systemd user service and only need a few of systemd's features related to process management (ExecStart, ExecStop, restart behavior, kill behavior, env vars).</div><div><br></div><div>I don't care about 97% of systemd's features (resource management, private whatevers). But the process management of systemd is the gold standard in my opinion, and we already use it, which is why I want to continue to use it inside of a container.</div><div><br></div><div>Is there a way to run systemd's user service without it having the system systemd service as a parent?</div><div><br></div><div>Like I said, I'm just looking for process management.</div><div><br></div><div>Any thoughts or ideas, Lennart? Thanks!</div><div><br></div><div>Jeff<br></div></div>