<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body>
<div class="moz-cite-prefix">On 10/27/19 11:50 PM, Jeff Solomon
wrote:<br>
</div>
<blockquote type="cite"
cite="mid:CANkaC9_wFD1kB6uy+=6_B=DqnHuh6jh29j7nYHROCyJ_HyP=Fw@mail.gmail.com">
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
<div dir="ltr">
<div>This is a followup to this thread:</div>
<div><br>
</div>
<div>
<a
href="https://lists.freedesktop.org/archives/systemd-devel/2015-July/033585.html"
moz-do-not-send="true">https://lists.freedesktop.org/archives/systemd-devel/2015-July/033585.html</a></div>
<div><br>
</div>
<div>To see if there are any new developments.</div>
<div><br>
</div>
<div>We have a multi-process application that already uses systemd
successfully. Our customers want to run the application in a
container, and that container has to be Docker. We can't use
systemd-nspawn, podman, or anything else, because our customers
already use Docker for their other applications.</div>
<div><br>
</div>
<div>I understand that containers are not a security technology
but we want to find a solution that allows us to run systemd
in a docker container that isn't blatantly less secure than
systemd running outside of a container. I have yet to find a
way.</div>
<div><br>
</div>
<div>Fundamentally, the problem is that the systemd in the
container requires read/write access to the host's
/sys/fs/cgroup/systemd directory in order to function at all.
Even if the container isn't privileged, it's necessary to
mount the host's /sys/fs/cgroup directory inside the container
and let the container write to it, so you have a security hole
that doesn't exist when systemd is just run on the host. That
hole is described here:</div>
<div><br>
</div>
<div><a
href="https://blog.trailofbits.com/2019/07/19/understanding-docker-container-escapes/"
moz-do-not-send="true">https://blog.trailofbits.com/2019/07/19/understanding-docker-container-escapes/</a></div>
<div><br>
</div>
<div>Using user namespaces doesn't help because then the
container user wouldn't have permission to write to
/sys/fs/cgroup/systemd.</div>
<div><br>
</div>
<div>Our application runs as a non-root user. The security
concern is that any user on the host who is in the docker
group would be able to start a shell inside the container as
"container root" and then be able to get root on the host. So
basically membership in the docker group is equivalent to host
root.</div>
<div><br>
</div>
<div>Taking a step back - I wonder (mostly asking Lennart) if
there is a way to run systemd without it needing access to
/sys/fs/cgroup/systemd? I'm sure there isn't but I thought I
would ask.</div>
<div><br>
</div>
<div>Also, we actually use the systemd user service and only
need a few of systemd's features related to process
management (ExecStart, ExecStop, restart behavior, kill
behavior, env vars).</div>
<div><br>
</div>
<div>I don't care about 97% of systemd features (resource
management, private whatevers). But the process management of
systemd is the gold standard in my opinion and we already use
it which is why I want to continue to use it inside of a
container.</div>
<div><br>
<br>
</div>
<div>Is there a way to run systemd's user service without it
having the system systemd service as a parent?</div>
<div><br>
</div>
<div>Like I said, I'm just looking for process management.</div>
<div><br>
</div>
<div>Any thoughts or ideas Lennart? Thanks!</div>
<div><br>
</div>
<div>Jeff<br>
</div>
</div>
<br>
<fieldset class="mimeAttachmentHeader"></fieldset>
<pre class="moz-quote-pre" wrap="">_______________________________________________
systemd-devel mailing list
<a class="moz-txt-link-abbreviated" href="mailto:systemd-devel@lists.freedesktop.org">systemd-devel@lists.freedesktop.org</a>
<a class="moz-txt-link-freetext" href="https://lists.freedesktop.org/mailman/listinfo/systemd-devel">https://lists.freedesktop.org/mailman/listinfo/systemd-devel</a></pre>
</blockquote>
<p>We just merged a patch into Podman to attempt to block this
breakout.</p>
<p><a class="moz-txt-link-freetext" href="https://github.com/containers/libpod/pull/4345">https://github.com/containers/libpod/pull/4345</a></p>
<p><br>
</p>
<p>Would mounting /dev/null read-only on <span class="js-issue-title">/sys/fs/cgroup/systemd/release_agent
block the escape?<br>
</span></p>
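<p>For anyone who wants to try that idea, here is a minimal Python
sketch (not taken from the Podman patch) that checks, from inside a
container, whether something is mounted read-only on top of
release_agent. It only parses text in the /proc/self/mountinfo
format. The function names and the sample line are hypothetical and
purely illustrative, and it assumes the last matching entry is the
mount on top. It shows how to confirm that the mask is in place, not
whether the mask is enough to stop the escape.</p>
<pre>
```python
# Hypothetical helper names; parses text in the /proc/self/mountinfo format (see proc(5)).
RELEASE_AGENT = "/sys/fs/cgroup/systemd/release_agent"

# Made-up sample line for illustration: /dev/null bind-mounted read-only over release_agent.
SAMPLE = (
    "1234 1200 0:5 /null /sys/fs/cgroup/systemd/release_agent "
    "ro,nosuid shared:2 - devtmpfs devtmpfs rw"
)


def find_mount(mountinfo_text, path):
    """Return (options, fs_type) of the last mount whose mount point is exactly path, else None."""
    found = None
    for line in mountinfo_text.splitlines():
        fields = line.split()
        # Fields: id, parent, major:minor, root, mount point, options,
        # optional fields..., "-", fs type, source, super options.
        if "-" not in fields[6:]:
            continue  # blank or malformed line
        sep = fields.index("-", 6)
        if fields[4] == path and fields[sep + 1:]:
            # Assumption: the last matching entry is the mount on top, so keep it.
            found = (fields[5].split(","), fields[sep + 1])
    return found


def is_read_only_mount(mountinfo_text, path):
    """True if something is mounted on path and that mount carries the ro option."""
    found = find_mount(mountinfo_text, path)
    return found is not None and "ro" in found[0]


print(is_read_only_mount(SAMPLE, RELEASE_AGENT))
```
</pre>
<p>Inside a real container the input would be the contents of
/proc/self/mountinfo rather than the sample line, for example
is_read_only_mount(open("/proc/self/mountinfo").read(), RELEASE_AGENT).</p>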
</body>
</html>