[systemd-devel] I want to run systemd inside of a locked down base docker container

Lennart Poettering lennart at poettering.net
Wed Feb 10 23:21:33 CET 2016


On Wed, 10.02.16 16:43, Daniel J Walsh (dwalsh at redhat.com) wrote:

> >>> I don't see why one would want to mask systemd-logind.service. If you
> >>> permit logins and PAM at all, you really need that. 
> >> If I wanted to add a login program I could enable/unmask these.
> >> No one runs docker containers as login services, that would require
> >> getty. 
> > Well, "machinectl shell", "cron" and all those things do PAM... In
> > fact the fact that "machinectl shell" goes through PAM and registers
> > with logind through that is one of the major benefits over naked
> > "nsenter".
>
> I wonder if any of these work correctly inside of a docker container?
> 
> Can these be customized or do they require systemd as pid 1 inside of
> the container.  Docker has a "docker exec"

No, "machinectl shell" requires PID 1 in the container to be systemd.

Unlike "nsenter" (and docker exec, as I presume), "machinectl shell"
will not try to take a process from the host and patch around in its
process attributes until it appears to be a process from within the
container (by joining namespaces, cgroups, uids, gids, selinux labels,
audit creds, …). It will instead allocate a pty in the container and
then ask systemd inside the container, using the "transient units"
API to spawn a shell on it. It then does nothing else than forward
data between this pty and the tty it was invoked from. 
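
As a rough illustration of the pty half of that description only, here
is a minimal Python sketch that allocates a pty, starts a command with
the pty as its terminal, and relays what the command writes. It is not
how machinectl is implemented: the command is spawned locally with
subprocess rather than requested from the container's systemd through
the transient units API, data flows in one direction only, and the
helper name run_on_pty is made up for this example.

```python
import os
import pty
import subprocess


def run_on_pty(argv):
    """Run argv with a fresh pty as its terminal; return the bytes it wrote."""
    master_fd, slave_fd = pty.openpty()
    # The child gets the slave end as stdin/stdout/stderr.
    # Assumption: it is spawned locally here, not by a container's PID 1.
    proc = subprocess.Popen(
        argv, stdin=slave_fd, stdout=slave_fd, stderr=slave_fd, close_fds=True
    )
    os.close(slave_fd)  # keep only the master end in this process

    chunks = []
    while True:
        try:
            data = os.read(master_fd, 1024)
        except OSError:
            # Linux reports EIO once every slave end has been closed.
            break
        if not data:
            break
        chunks.append(data)

    os.close(master_fd)
    proc.wait()
    return b"".join(chunks)
```

A real forwarder would also copy input from the invoking tty to the
master end and put that tty into raw mode; both are left out to keep
the sketch short.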

This way the only processes you see in the container have actually
been started by systemd inside the container. They are properly
tracked and maintained like any other process invoked in the container
by the systemd instance that is running it. They inherit the process
attributes from PID 1 in the container, and the PPID reported for them
will actually be 1, as it should be. They are not these weird alien
processes like nsenter creates that are half-way part of the host
system and half-way member of the container, for which PPID returns 0
in the container, because they actually don't have a parent process
inside of the container.
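
One way to observe the PPID difference is to read the PPid field from
/proc/<pid>/status inside the container. The Python sketch below
assumes it is given the text of such a file as seen from within the
container's PID namespace; the function names are invented for this
example and are not part of any systemd tool.

```python
import re
from typing import Optional


def parent_pid(status_text: str) -> Optional[int]:
    """Return the PPid field from the text of a /proc/<pid>/status file."""
    match = re.search(r"^PPid:\s+(\d+)", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def has_no_parent_here(pid: int, status_text: str) -> bool:
    """True for a process other than PID 1 whose parent is reported as 0.

    Assumption: the status text was read inside the container, where a
    PPid of 0 on anything but PID 1 suggests the process was moved in
    from outside (for example with nsenter).
    """
    return pid != 1 and parent_pid(status_text) == 0
```

For example, passing the contents of /proc/self/status from a shell
opened via "machinectl shell" should report a real parent, whereas the
same check from an nsenter session would be expected to report 0.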

Long story short: "machinectl shell" should work fine even with docker
containers – as long as systemd runs as PID 1 in them.

> > I added this to the TODO list now.
>
> Sounds fine with me.  I went back to the original container and I can
> remove all of the other modifications, I can live with the warnings at the
> beginning and remove the /etc/fstab.  We just need to get this into more
> people hands to see what happens and what breaks. 

Quite frankly, I don't understand why /etc/fstab is populated at all
on Fedora by default. It should only exist if there are actual
external file systems configured in it, and otherwise just not exist.

> This is what I am seeing now with just /etc/fstab removed.
> 
> Welcome to Fedora 23 (Twenty Three)!
> 
> Set hostname to <654f7872d331>.
> dev-hugepages.mount: Cannot add dependency job, ignoring: Unit dev-hugepages.mount is masked.
> sys-fs-fuse-connections.mount: Cannot add dependency job, ignoring: Unit sys-fs-fuse-connections.mount is masked.
> systemd-remount-fs.service: Cannot add dependency job, ignoring: Unit systemd-remount-fs.service is masked.
> systemd-logind.service: Cannot add dependency job, ignoring: Unit systemd-logind.service is masked.
> getty.target: Cannot add dependency job, ignoring: Unit getty.target
> is masked.

Again, there should be no need to mask dev-hugepages.mount and
getty.target at all. And if you drop /etc/fstab there's no need to
mask systemd-remount-fs.service either. Please unmask those three
units!
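
To see which units are still masked after a change, the warnings in
the boot output can be collected mechanically. The following Python
sketch assumes the message format shown in the log quoted above
("Unit NAME is masked."), including the case where a mail client has
wrapped the line; the helper name masked_units is made up here.

```python
import re
from typing import List

# Matches "Unit <name> is masked." even when the line has been wrapped.
MASKED = re.compile(r"Unit\s+(\S+)\s+is\s+masked\.")


def masked_units(boot_log: str) -> List[str]:
    """Return the masked unit names from boot output, in order, without repeats."""
    seen = []
    for name in MASKED.findall(boot_log):
        if name not in seen:
            seen.append(name)
    return seen
```

Each name it returns can then be passed to "systemctl unmask" in the
container image, if that unit is one that should not be masked.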

As soon as my patch to add the ConditionCapability= check to
sys-fs-fuse-connections.mount is merged, you should also be able to
unmask that unit and get a clean boot.

Lennart

-- 
Lennart Poettering, Red Hat

