[systemd-devel] Bizarre issue with logins and cgroups?

Ryan rymg19 at gmail.com
Sat Mar 28 19:02:19 UTC 2020


I don't have a small test case for this yet, so here's all the information
I've found while I get that ready:

I have a side project <https://nsbox.dev/> that revolves around using
systemd-nspawn to run pet containers. One feature I'm trying to use it for
is booted containers, where the following happens:

- During container boot, a service is run that creates an account inside
the container corresponding to the outside user. This service depends on
multi-user.target, as well as console-getty (which is overridden to enable
autologin).
- The service inside signals the outside world when it's done that the
container is ready for login.
- Once the signal is received outside, the host uses nsenter to enter the
container namespace, then runs

  runuser -s /bin/bash -- - "$THE_USER_NAME" some-command
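
As a rough sketch of that last step, the helper below only prints the kind of
command line the host ends up running. The function name login_command and
the nsenter flags (--target with the container leader's PID, --all) are
assumptions for illustration, not the exact nsbox invocation; only the
runuser part is taken from the command above.

```shell
# Hypothetical helper: print (but do not run) a command that enters the
# namespaces of the process with PID $1 and runs the remaining arguments
# as user $2 via runuser.
login_command() {
    leader_pid=$1
    user_name=$2
    shift 2
    # --target/--all are standard util-linux nsenter options; whether nsbox
    # uses exactly these flags is an assumption.
    printf 'nsenter --target %s --all -- runuser -s /bin/bash -- - %s %s\n' \
        "$leader_pid" "$user_name" "$*"
}
```

For example, `login_command 1234 vagrant some-command` prints the full
nsenter + runuser line for a container whose leader PID is 1234.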

Here's the bizarre part: runuser just hangs forever. Debugging further, I
found it was hanging while waiting for a response from systemd-logind
during execution of the PAM config. With verbose logging for logind
enabled, I observed the following:

  Mar 27 01:04:35 test-boot systemd[1]: Failed to start Session 47 of user vagrant.

Looking further up:

  Mar 27 01:04:35 test-boot systemd-logind[25]: Sent message type=method_call sender=n/a destination=org.freedesktop.systemd1 path=/org/freedesktop/systemd1 interface=org.freedesktop.systemd1.>
  Mar 27 01:04:35 test-boot systemd[1]: session-47.scope: Failed to add PIDs to scope's control group: No such file or directory
  Mar 27 01:04:35 test-boot systemd[1]: session-47.scope: Failed with result 'resources'

With verbose logging for systemd itself enabled, I observed the following
(this was on cgroups v1, but the same error appears with v2):

  Mar 27 20:27:36 test-boot systemd[1]: session-3.scope: Failed to set 'memory.limit_in_bytes' attribute on '/user.slice/user-1000.slice/session-3.scope' to '-1': No such file or directory
  Mar 27 20:27:36 test-boot systemd[1]: session-3.scope: Failed to set 'pids.max' attribute on '/user.slice/user-1000.slice/session-3.scope' to 'max': No such file or directory
  Mar 27 20:27:36 test-boot systemd[1]: session-3.scope: Couldn't move process 73 to requested cgroup '/user.slice/user-1000.slice/session-3.scope': No such file or directory
  Mar 27 20:27:36 test-boot systemd[1]: session-3.scope: Failed to add PIDs to scope's control group: No such file or directory
  Mar 27 20:27:36 test-boot systemd[1]: session-3.scope: Failed with result 'resources'.
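
Since every failure above is "No such file or directory" on the scope's
cgroup path, one quick check is whether that directory exists at all when
the login is attempted. A minimal sketch, assuming the cgroup hierarchy is
mounted under /sys/fs/cgroup inside the container (the helper name
cgroup_dir_exists is made up for illustration):

```shell
# Hypothetical helper: succeed if the cgroup directory for a unit exists
# under the given cgroup mount point.
#   $1: cgroup root, e.g. /sys/fs/cgroup (v2) or /sys/fs/cgroup/pids (v1)
#   $2: cgroup path as printed in the log, e.g.
#       /user.slice/user-1000.slice/session-3.scope
cgroup_dir_exists() {
    cgroup_root=$1
    cgroup_path=$2
    [ -d "${cgroup_root}${cgroup_path}" ]
}
```

Run inside the container, for instance as
`cgroup_dir_exists /sys/fs/cgroup/pids /user.slice/user-1000.slice/session-3.scope`,
this would show whether the directory is missing or whether the error comes
from somewhere else.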

So...it seems to be failing when moving the process into the cgroup of the
new session's scope? Here's the weird part: if I wait 10-15 seconds and
then run it again...it works. That leads me to believe there's a race
somewhere, but I can't find anything in the logs pointing to one.
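
Until the cause is found, the "wait and try again" behaviour can be
approximated with a bounded retry. This is only a workaround sketch, not a
fix: retry_with_timeout is a made-up name, and it assumes coreutils
`timeout` is available so that a hung attempt gets killed instead of
blocking forever.

```shell
# Hypothetical workaround: run a command with a per-attempt timeout and
# retry it a limited number of times, pausing between attempts.
#   $1: maximum number of attempts
#   $2: timeout per attempt (passed to coreutils timeout, e.g. 10)
#   $3: pause between attempts (passed to sleep, e.g. 5)
#   remaining arguments: the external command to run
retry_with_timeout() {
    attempts=$1
    per_try_timeout=$2
    pause=$3
    shift 3
    i=1
    while [ "$i" -le "$attempts" ]; do
        # timeout kills the command if it hangs, as runuser does here
        if timeout "$per_try_timeout" "$@"; then
            return 0
        fi
        i=$((i + 1))
        if [ "$i" -le "$attempts" ]; then
            sleep "$pause"
        fi
    done
    return 1
}
```

For example, `retry_with_timeout 5 10 5 runuser -s /bin/bash -- - "$THE_USER_NAME" some-command`
would give each attempt 10 seconds and try up to five times.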

At this point I'm just stumped. I'm working to reproduce it with just pure
nspawn, but in the meantime, I was curious whether anyone here has an idea
of where this could be coming from.

-- 
Ryan
https://refi64.com/