[systemd-devel] systemd-nspawn/LXC containers & pam login failure
Lennart Poettering
lennart at poettering.net
Thu May 9 06:32:09 PDT 2013
On Thu, 09.05.13 11:38, Daniel P. Berrange (berrange at redhat.com) wrote:
> Following the suggestion in the systemd-nspawn manpage I populated
> a mini Fedora 19 chroot, on a Fedora 19 host
>
> # yum -y --releasever=19 --nogpg --installroot=/srv/mycontainer \
> --disablerepo='*' --enablerepo=fedora \
> install systemd passwd yum fedora-release vim-minimal
> # chroot /srv/mycontainer passwd
> # systemd-nspawn -bD /srv/mycontainer
>
> Systemd boots up nicely & presents a login prompt, but it is impossible
> to actually log in; PAM always denies the attempts.
Yeah, this is a known problem. For now we generally suggest turning off
audit by booting with audit=0 on the kernel cmdline:
https://fedoraproject.org/wiki/Features/SystemdLightweightContainers
I guess I should add a comment about this to nspawn's man page too.
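
As a quick way to confirm that the workaround is in effect, the kernel
command line (readable from /proc/cmdline) can be scanned for audit=0.
This is only a minimal sketch: cmdline_has_audit_off() is a hypothetical
helper, not part of systemd or the kernel, and it only looks for the
literal word, so it does not handle a later audit= parameter overriding
an earlier one.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: return true if the space-separated kernel command
 * line contains the exact word "audit=0". The caller is assumed to have
 * read the string from /proc/cmdline. */
static bool cmdline_has_audit_off(const char *cmdline)
{
    static const char key[] = "audit=0";
    const size_t klen = sizeof(key) - 1;
    const char *p = cmdline;

    while ((p = strstr(p, key)) != NULL) {
        /* The match must start at the beginning or right after a space... */
        bool starts_word = (p == cmdline) || p[-1] == ' ';
        /* ...and must be followed by a separator or the end of the string. */
        char next = p[klen];
        bool ends_word = next == '\0' || next == ' ' || next == '\n';

        if (starts_word && ends_word)
            return true;
        p += klen;
    }
    return false;
}
```

Matching whole words matters here, since parameters such as "noaudit=0"
or "audit=01" would otherwise be mistaken for the workaround.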
The audit folks are working on adding container awareness to the audit
subsystem in the kernel (which basically means that audit messages carry
the outside PID of PID1 of the container, so that auditd can track this
properly). Currently audit is completely confused by PID
namespacing. Also, we want them to make opening a new PID namespace
reset loginuid in the container to -1. We have discussed this
several times with them, and they wanted to do something about it, but so
far nothing has happened. We'll have another meeting about this next
week, so I can put some pressure on this.
> Debugging this, there seem to be two issues
>
> 1. pam_loginuid.so tries to write to /proc/self/loginuid but is denied
> by the kernel.
>
> My kernel has CONFIG_AUDIT_LOGINUID_IMMUTABLE=y which means once a
> loginuid is set (in this case from my ssh session into the host),
> it can't be changed (eg by the 'login' process inside the container).
> From the KConfig comment, this appears to have been a new feature
> built explicitly for systemd based hosts.
>
> The loginuid appears to be inherited across fork/exec so, AFAICT,
> the only way to avoid this is to spawn the container from something
> which does not already have a loginuid set, eg systemd itself or
> some other process not associated with a login session.
>
> Not being able to spawn containers from a login session on the host
> is kind of a PITA for development / debugging :-(
>
> Seems we need to find a way to have systemd-nspawn ensure that the
> 'init' process inside the container does not have a 'loginuid' set,
> even if the thing starting the container does. On the flipside, it
> seems this would violate the kernel security design for this feature ?
>
> If that were the case, then the pam_loginuid module might need to
> be made a no-op inside containers.
The right approach to me here is the aforementioned resetting of
loginuid when a new PID namespace is opened.
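
For reference, "loginuid set to -1" means the unset state: the kernel
reports it in /proc/self/loginuid as 4294967295, i.e. (uint32_t)-1. A
minimal sketch of how a tool could tell the two states apart follows.
The helper names parse_loginuid() and loginuid_is_set() are made up for
illustration and are not taken from PAM, libaudit or systemd; the caller
is assumed to have read the file contents into a string already.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* An unset loginuid is reported as (uint32_t)-1, i.e. 4294967295. */
#define LOGINUID_UNSET UINT32_MAX

/* Hypothetical helper: parse the text read from /proc/self/loginuid.
 * Returns 0 and stores the value in *out on success, -1 on malformed
 * or out-of-range input. */
static int parse_loginuid(const char *text, uint32_t *out)
{
    char *end = NULL;
    unsigned long long v;

    assert(text != NULL && out != NULL);

    errno = 0;
    v = strtoull(text, &end, 10);
    if (errno != 0 || end == text || v > UINT32_MAX)
        return -1;

    /* Allow a single trailing newline, reject anything else. */
    if (*end == '\n')
        end++;
    if (*end != '\0')
        return -1;

    *out = (uint32_t)v;
    return 0;
}

/* Hypothetical helper: true if the process belongs to a login session. */
static int loginuid_is_set(uint32_t loginuid)
{
    return loginuid != LOGINUID_UNSET;
}
```

With CONFIG_AUDIT_LOGINUID_IMMUTABLE=y, a process for which
loginuid_is_set() is true is one where pam_loginuid's write will be
refused, which is the situation described in the quoted report.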
> 2. The audit_log_acct_message() method which is called by pretty
> much any PAM module returns EPERM
>
> There is no actual syscall returning EPERM here. The EPERM
> appears to be coming back inside the netlink reply message
> from the kernel audit subsystem. Since pretty much every PAM
> module sends audit messages, this causes them all to return
> fatal errors, failing the login attempt
>
> The _pam_audit_writelog() method does have code to ignore
> EPERM, but it only does so if 'getuid() != 0'. The container
> login process has uid == 0, so EPERM is treated as fatal. The
> "easy" (but not necessarily correct) fix is to change
>
> diff -rup Linux-PAM-1.1.6.orig/libpam/pam_audit.c Linux-PAM-1.1.6.new/libpam/pam_audit.c
> --- Linux-PAM-1.1.6.orig/libpam/pam_audit.c 2012-08-15 12:08:43.000000000 +0100
> +++ Linux-PAM-1.1.6.new/libpam/pam_audit.c 2013-05-09 10:17:48.679403471 +0100
> @@ -46,7 +46,7 @@ _pam_audit_writelog(pam_handle_t *pamh,
>    pamh->audit_state |= PAMAUDIT_LOGGED;
>
>    if (rc < 0) {
> -      if (rc == -EPERM && getuid() != 0)
> +      if (rc == -EPERM)
>            return 0;
>        if (errno != old_errno) {
>            old_errno = errno;
I actually tried to get a patch like this into PAM, but Steve (of
course) said nononono! He's really married to the idea that audit breaks
everything on any kind of error... This is kinda sad though, as
otherwise it would have allowed us to turn off auditing in the
container completely by removing CAP_AUDIT_CONTROL and CAP_AUDIT_WRITE
from the container...
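
To spell out what the one-line change in the quoted diff amounts to,
here are the two policies as standalone predicates. This is only a
sketch of the condition, not the actual PAM code: the function names are
invented for illustration, and the uid is passed in as a parameter
instead of being obtained with getuid() so that both cases can be
compared side by side.

```c
#include <errno.h>
#include <stdbool.h>
#include <sys/types.h>

/* Condition before the patch: an EPERM from the audit subsystem is
 * tolerated only when the caller is not root. */
static bool audit_eperm_ignored_orig(int rc, uid_t uid)
{
    return rc == -EPERM && uid != 0;
}

/* Condition after the patch: EPERM is tolerated for every caller,
 * including uid 0 inside a container. */
static bool audit_eperm_ignored_patched(int rc, uid_t uid)
{
    (void)uid; /* the uid no longer influences the decision */
    return rc == -EPERM;
}
```

The container login process runs with uid 0, so it is the one case where
the two predicates disagree: the original condition treats its EPERM as
fatal, the patched one lets the login proceed.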
I guess libvirt-lxc is in a slightly better situation here regarding
audit, since it never tries to spawn a container as child of a login
session, hence loginuid will not be sealed off yet...
Lennart
--
Lennart Poettering - Red Hat, Inc.