[systemd-devel] nsenter and SIGSTOP

Zbigniew Jędrzejewski-Szmek zbyszek at in.waw.pl
Sat Apr 20 11:23:19 PDT 2013


Hi,
I've hit a bit of a problem with nsenter and systemd-nspawn.
When nsenter is used to enter the PID namespace created with
systemd-nspawn, and the container's init attempts a shutdown,
it hangs because nsenter is suspended.

The sequence of events leading to the hang is:

1. nsenter launches a shell inside the container with
   PPID=0 as seen inside the container,
2. systemd with PID=1 goes through the shutdown sequence,
   issuing the equivalent(*) of

   kill(-1, SIGSTOP)
   kill(-1, SIGTERM)
   kill(-1, SIGCONT)
   reboot(RB_HALT_SYSTEM)

Now, nsenter has a stanza in continue_as_child where it stops itself
when the child gets stopped. Unfortunately, this means that nsenter
gets stopped in response to the kill(-1, SIGSTOP) which hits the child.
The child then receives kill(-1, SIGTERM), is resumed by kill(-1,
SIGCONT), and exits (it prints "exit", so it's easy to see that it
terminated properly). The shell then becomes a zombie, since nsenter
is its parent and nsenter is still stopped. Meanwhile, init executes
reboot and hangs there, since the container waits for the PID namespace
to become empty (I'm guessing here, but that seems logical). When I
then type 'fg' to continue nsenter, the child gets collected and the
container exits successfully.
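
To make the parent-side behaviour concrete, here is a minimal sketch of
a wait loop that mirrors a child's stop, as described above. It is not
the nsenter source: wait_for_child, demo_child_stop and the mirror_stop
flag are hypothetical names made up for this illustration, and the exit
code 7 is arbitrary.

```c
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: wait for `child`, counting how often it stops.
 * With mirror_stop != 0 the parent also stops itself, which is the
 * behaviour that leads to the hang: a stopped parent cannot reap the
 * child until something outside sends it SIGCONT. */
int wait_for_child(pid_t child, int mirror_stop, int *stops_seen)
{
    int status;

    assert(stops_seen != NULL);
    for (;;) {
        /* WUNTRACED makes waitpid() also report a stopped child */
        if (waitpid(child, &status, WUNTRACED) != child)
            return -1;
        if (!WIFSTOPPED(status))
            break;                      /* child exited or was killed */
        (*stops_seen)++;
        if (mirror_stop)
            kill(getpid(), SIGSTOP);    /* parent sleeps here */
        kill(child, SIGCONT);           /* then resume the child */
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}

/* Demo: the child stops itself once, then exits with status 7.
 * mirror_stop is 0 so that the demo itself never gets suspended. */
int demo_child_stop(int *stops_seen)
{
    pid_t child = fork();

    if (child < 0)
        return -1;
    if (child == 0) {
        raise(SIGSTOP);                 /* stand-in for kill(-1, SIGSTOP) */
        _exit(7);
    }
    return wait_for_child(child, 0, stops_seen);
}
```

With mirror_stop set to 0 the parent resumes the child right away and
collects its exit status. With mirror_stop set to 1 the parent stays
stopped until it gets SIGCONT from somewhere else (the 'fg' in my
case), and the child remains a zombie in the meantime.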

This is with kernel 3.9-rc6 from Fedora.

Thanks,
Zbyszek

