<html>
<head>
<meta content="text/html; charset=utf-8" http-equiv="Content-Type">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<br>
<br>
<div class="moz-cite-prefix">On 02/10/2016 04:27 PM, Lennart
Poettering wrote:<br>
</div>
<blockquote cite="mid:20160210212749.GA18538@gardel-login"
type="cite">
<pre wrap="">On Wed, 10.02.16 15:58, Daniel J Walsh (<a class="moz-txt-link-abbreviated" href="mailto:dwalsh@redhat.com">dwalsh@redhat.com</a>) wrote:
</pre>
<blockquote type="cite">
<blockquote type="cite">
<blockquote type="cite">
<blockquote type="cite">
<blockquote type="cite">
<pre wrap=""> sed -i 's/^enable/disable/g' /lib/systemd/system-preset/*
</pre>
</blockquote>
<pre wrap="">Why would this matter?
</pre>
</blockquote>
<pre wrap="">We don't want excess services running inside of a docker container. I
only want systemd/journald and any services
that I enable in the container. Not something pulled in because the
installer thinks this is a VM or a Host OS.
</pre>
</blockquote>
<pre wrap="">Well, the default preset policy in Fedora is to disable everything by
default, modulo a few exceptions. Hence it should be unnecessary to
change anything with the default preset policy, unless you actually
want to *enable* rather than disable more by default...
</pre>
</blockquote>
<pre wrap="">
Here is what I see enabled in the base container. I don't think we
want any of this stuff running by default in a docker container.
</pre>
</blockquote>
<pre wrap="">
[…]
Well, but pretty much all the units you listed here are units from
RPMs you wouldn't install in a container anyway, aren't they? Thus,
they shouldn't matter anyway, and I'd argue they should be enabled by
default in a container too – if they are installed explicitly by the
user, through RPM. Hence, I think patching the preset stuff is not
necessary at all.
</pre>
<blockquote type="cite">
<blockquote type="cite">
<pre wrap="">I don't see why one would want to mask systemd-logind.service. If you
permit logins and PAM at all, you really need that.
</pre>
</blockquote>
<pre wrap="">
If I wanted to add a login program I could enable/unmask these.
No one runs docker containers as login services, that would require
getty.
</pre>
</blockquote>
<pre wrap="">
Well, "machinectl shell", "cron" and all those things do PAM... In
fact the fact that "machinectl shell" goes through PAM and registers
with logind through that is one of the major benefits over naked
"nsenter".
</pre>
</blockquote>
I wonder if any of these work correctly inside of a docker
container?<br>
<br>
Can these be customized, or do they require systemd as PID 1 inside
of the container? Docker has a "docker exec"<br>
command which does the correct thing: it puts the command inside of the
container's namespaces, cgroup, SELinux label, capabilities ...<br>
<blockquote cite="mid:20160210212749.GA18538@gardel-login"
type="cite">
<pre wrap="">
I can see that you don't want to run it by default, but maybe we can
rearrange things so that logind is started on first use (i.e. on the
first PAM conversation). That way logind would normally not run in a
container, until it is actually requested by PAM conversation. We
could even add exit-on-idle so that it goes away after a while when
the user logs out again.
That way logind could stay available but would normally not appear in
"ps" unless it is actually used.
I added this to the TODO list now.
</pre>
</blockquote>
Sounds fine to me. I went back to the original container, and I
can <br>
remove all of the other modifications; I can live with the warnings
at the <br>
beginning and just remove /etc/fstab. We just need to get this into
more<br>
people's hands to see what happens and what breaks. <br>
<br>
As far as hugepages is concerned, it seems there is some discussion
on it here:<br>
<br>
<a class="moz-txt-link-freetext" href="https://bugzilla.redhat.com/show_bug.cgi?id=1199164">https://bugzilla.redhat.com/show_bug.cgi?id=1199164</a><br>
<blockquote cite="mid:20160210212749.GA18538@gardel-login"
type="cite">
<pre wrap="">
</pre>
<blockquote type="cite">
<blockquote type="cite">
<pre wrap="">And masking the getty stuff appears to be entirely unnecessary...
</pre>
</blockquote>
<pre wrap="">Again the goal is just to get rid of the getty failure message at
bootup.
</pre>
</blockquote>
<pre wrap="">
But there should really be none with current systemd, as you don't
have /dev/tty0 and the getty unit has ConditionPathExists=/dev/tty0.
What precisely does the getty message that you get look like?
</pre>
</blockquote>
<br>
This is what I am seeing now with just /etc/fstab removed.<br>
<pre>Welcome to <font color="#3465A4">Fedora 23 (Twenty Three)</font>!
Set hostname to &lt;654f7872d331&gt;.
dev-hugepages.mount: Cannot add dependency job, ignoring: Unit dev-hugepages.mount is masked.
sys-fs-fuse-connections.mount: Cannot add dependency job, ignoring: Unit sys-fs-fuse-connections.mount is masked.
systemd-remount-fs.service: Cannot add dependency job, ignoring: Unit systemd-remount-fs.service is masked.
systemd-logind.service: Cannot add dependency job, ignoring: Unit systemd-logind.service is masked.
getty.target: Cannot add dependency job, ignoring: Unit getty.target is masked.
</pre>
<br>
<blockquote cite="mid:20160210212749.GA18538@gardel-login"
type="cite">
<pre wrap="">
</pre>
<blockquote type="cite">
<blockquote type="cite">
<pre wrap="">Which leaves the /dev/hugepages and /sys/fs/fuse/connections
mounts. Not sure about those. Are you running the container with
CAP_SYS_ADMIN? If so, then there's no reason to mask those units. If
not, then I figure we could add checks that these are conditioned out
if CAP_SYS_ADMIN is missing.
</pre>
</blockquote>
<pre wrap="">
No, docker containers do not enable SYS_ADMIN or NET_ADMIN by
default.
</pre>
</blockquote>
<pre wrap="">
I'll add a ConditionCapability=CAP_SYS_ADMIN line to the fuse
mount. The hugepages mount already has one (since 218).
With that addition there should really be no reason to mask out either
of the units explicitly, systemd should already silently skip them in
a docker setup where CAP_SYS_ADMIN is missing.
</pre>
<blockquote type="cite">
<blockquote type="cite">
<pre wrap="">On nspawn these two aren't seen since nspawn actually doesn't mount
the real sysfs to /sys, but just a tmpfs with a select number of
subdirectories from the real sysfs for security reasons. One of the
subdirs that are suppressed is /sys/fs. Now,
sys-fs-fuse-connections.mount is conditionalized on
/sys/fs/fuse/connections existing, hence if it is not there, then it
won't be mounted. And /dev/hugepages we simply allow to be mounted in
the container.
</pre>
</blockquote>
<pre wrap="">
Interesting idea. Maybe we should just mount over /sys/fs also.
</pre>
</blockquote>
<pre wrap="">
Well, note that we over-mount /sys with a tmpfs, and then some parts
of the real /sys into that. /sys/fs hence is just a subdir of our
private tmpfs. The tmpfs is marked r/o after everything is set up.
</pre>
<blockquote type="cite">
<pre wrap="">Do you just mount hugepages then during container setup?
</pre>
</blockquote>
<pre wrap="">
No. In nspawn, when we pass CAP_SYS_ADMIN to the container the
container will just mount /dev/hugepages correctly on its own. And if we
do drop CAP_SYS_ADMIN, then the ConditionCapability=CAP_SYS_ADMIN in
the unit file mentioned above will result in the mount being skipped
silently already.
Lennart
</pre>
</blockquote>
<br>
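Until the fuse mount unit ships with that condition, the same check could
probably be tried locally with a drop-in instead of masking the unit.<br>
This is only a sketch: the drop-in directory and file name below are
assumptions for illustration, not something taken from this thread.<br>
ConditionCapability= itself is a documented [Unit] setting, and a failed
condition skips the unit quietly rather than marking it failed.<br>
<pre># Hypothetical drop-in path (an assumption, adjust as needed):
#   /etc/systemd/system/sys-fs-fuse-connections.mount.d/10-container.conf
[Unit]
# Skip this mount when the container was started without CAP_SYS_ADMIN,
# which is the docker default discussed above.
ConditionCapability=CAP_SYS_ADMIN
</pre>
With something like this in place, the explicit mask on
sys-fs-fuse-connections.mount should no longer be needed.<br>
<br>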
</body>
</html>