[systemd-devel] nsenter and SIGSTOP

Eric W. Biederman ebiederm at xmission.com
Sat Apr 20 15:27:46 PDT 2013


Zbigniew Jędrzejewski-Szmek <zbyszek at in.waw.pl> writes:

> Hi,
> I've hit a bit of a problem with nsenter and systemd-nspawn.
> When nsenter is used to enter the PID namespace created with
> systemd-nspawn, and the container's init attempts a shutdown,
> it hangs because nsenter is suspended.
>
> The sequence of events leading to the hang is:
>
> 1. nsenter launches a shell inside the container with
>    PPID=0 as seen inside the container,
> 2. systemd with PID=1 goes through the shutdown sequence,
>    issuing the equivalent(*) of
>
>    kill(-1, SIGSTOP)

This baffles me.  I am not certain why someone would send SIGSTOP
when they want processes to exit.  I'm not saying it's wrong, just
that it is odd.

>    kill(-1, SIGTERM)
>    kill(-1, SIGCONT)
>    reboot(RB_HALT_SYSTEM)
>
> Now, nsenter has a stanza in continue_as_child where it stops itself
> when the child gets stopped. Unfortunately, this means that nsenter
> gets stopped in response to kill(-1, SIGSTOP) which hits the child.
> Then the child dies on kill(-1, SIGTERM), is resumed with kill(-1,
> SIGCONT) and exits (it prints "exit", so it's easy to see that it
> terminated properly). Then the shell becomes a zombie, since nsenter
> is its parent and it's sleeping. Meanwhile, init executes reboot, and
> hangs in there, since the container waits for the PID namespace to
> become empty (I'm guessing here, but that seems logical).

I expect the hang is in the pid namespace init exiting.
zap_pid_ns_processes() in kernel/pid_namespace.c has the behaviour you
describe: it blocks until all children of init have been reaped.

> When I then
> type 'fg' to continue nsenter, the child gets collected and the
> container successfully exits.
>
> This is with kernel 3.9-rc6 from Fedora.

As far as nsenter and the pid namespace go, they are working as
designed.  But given this outcome it would be nice if we could get a
SIGCONT when the child wakes up again.

The current behavior supports typing suspend in your shell inside the
namespace and having that work from outside the namespace.
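
To make the mechanism concrete, here is a rough sketch of the pattern
being discussed: a parent that waits with WUNTRACED so it notices when
its child stops.  This is a simplified illustration, not the nsenter
source.  The helper names wait_child_event() and demo_follow_stop() are
made up, and the demo deliberately does not stop itself; the comment
marks where nsenter, as described above, would suspend itself.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: wait for the next state change of the child.
 * Returns 1 if the child stopped, 0 if it terminated, -1 on error.
 */
static int wait_child_event(pid_t child, int *status)
{
	pid_t ret = waitpid(child, status, WUNTRACED);

	if (ret != child)
		return -1;
	return WIFSTOPPED(*status) ? 1 : 0;
}

/*
 * Fork a child that stops itself once and then exits with code 7.
 * Returns the number of stops the parent observed (or -1 on error)
 * and stores the child's exit code in *exit_code.
 */
static int demo_follow_stop(int *exit_code)
{
	pid_t child = fork();
	int status, ev, stops = 0;

	if (child < 0)
		return -1;
	if (child == 0) {
		raise(SIGSTOP);		/* child suspends itself */
		_exit(7);
	}

	while ((ev = wait_child_event(child, &status)) == 1) {
		stops++;
		/*
		 * A parent that follows its child would stop itself here,
		 * e.g. kill(getpid(), SIGSTOP), and only reach the next
		 * line once something outside continues it.  That is the
		 * window in which the child can be continued and exit
		 * without anybody reaping it.
		 */
		kill(child, SIGCONT);
	}

	if (ev < 0 || !WIFEXITED(status))
		return -1;
	*exit_code = WEXITSTATUS(status);
	return stops;
}
```

Calling demo_follow_stop() from a small main() should report one
observed stop and the child's exit code.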

I can't think of a way off the top of my head to wake nsenter up when
its child is woken up underneath it, but it sounds like that would be
nice to do.

For the short term I would recommend typing "reboot & exit" instead
of "reboot" from a shell started with nsenter, and otherwise not leaving
processes with parents outside the pid namespace around.

Of course, sending SIGSTOP before sending SIGTERM seems mighty fishy
as well.
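
For anyone who wants to see what that ordering does to a single
process, here is a minimal sketch.  It signals one forked child rather
than using kill(-1, ...), and the name demo_stop_term_cont() is made up
for illustration.  It only shows the observable effect (a stopped
process does not act on a pending SIGTERM until it is continued) and
says nothing about why systemd chose that sequence.

```c
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/*
 * Apply SIGSTOP, SIGTERM, SIGCONT to one child, in that order.
 * Returns the signal that terminated the child, or -1 on error.
 * *alive_while_stopped is set to 1 if the child had not yet died
 * after SIGTERM was sent while it was stopped.
 */
static int demo_stop_term_cont(int *alive_while_stopped)
{
	struct timespec pause_time = { 0, 100 * 1000 * 1000 };	/* 100 ms */
	pid_t child = fork();
	int status;

	if (child < 0)
		return -1;
	if (child == 0) {
		for (;;)
			pause();	/* default SIGTERM disposition: die */
	}

	kill(child, SIGSTOP);
	if (waitpid(child, &status, WUNTRACED) != child ||
	    !WIFSTOPPED(status)) {
		kill(child, SIGKILL);
		waitpid(child, NULL, 0);
		return -1;
	}

	kill(child, SIGTERM);		/* stays pending while stopped */
	nanosleep(&pause_time, NULL);	/* give it time to die, if it could */
	*alive_while_stopped = (waitpid(child, &status, WNOHANG) == 0);

	kill(child, SIGCONT);		/* now the pending SIGTERM is acted on */
	if (waitpid(child, &status, 0) != child)
		return -1;
	return WIFSIGNALED(status) ? WTERMSIG(status) : -1;
}
```

On Linux I would expect this to return SIGTERM with
*alive_while_stopped set to 1.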

Eric

