[systemd-devel] Zombie process still exists after stopping gdm.service

Daniel Drake drake at endlessm.com
Tue Apr 21 12:25:10 PDT 2015


On Mon, Apr 20, 2015 at 6:29 PM, Lennart Poettering
<lennart at poettering.net> wrote:
> Sure, we don't want to keep track of which processes we already
> killed, to distinguish them from the processes newly created in the
> time between our sending of SIGTERM and receiving SIGCHLD for the main
> process.
>
> We assume that if we get SIGCHLD for the main process that the daemon
> is down, and everything that is left over then is auxiliary stuff we
> can kill.

OK, doesn't sound unreasonable. Once we get to the end of this topic,
I'll submit a documentation patch to make that a bit clearer.

So, of the 3 signals (TERM, TERM, KILL) sent to gdm-simple-slave
within a total time of 0.01s, we have good explanations for the first
2.

The 3rd one (KILL) is still suspicious to me though. It is sent 0.4ms
after the preceding SIGTERM. Here is what happens in the code:

1. gdm's main process exits due to the first SIGTERM. systemd becomes
aware in service_sigchld_event(), and responds as follows:

                        case SERVICE_STOP_SIGTERM:
                        case SERVICE_STOP_SIGKILL:
                                if (!control_pid_good(s))
                                        service_enter_stop_post(s, f);

2. Inside service_enter_stop_post(), there is no command to execute,
so we call:
                service_enter_signal(s, SERVICE_FINAL_SIGTERM, SERVICE_SUCCESS);

3. service_enter_signal calls unit_kill_context() to send the second
SIGTERM. Looking at what happens inside unit_kill_context(): there is
no main process, nor control process, so we go straight to the cgroup
killing. The cgroup kill happens without error, and we reach the end
of the function:

        return wait_for_exit;

wait_for_exit was not modified from its initial value (false) during
the course of the function, so false is returned here.

4. Back in service_enter_signal, since unit_kill_context returned
false, we do not arm the timer. Without hesitation systemd goes
directly and sends SIGKILL.

        } else if (state == SERVICE_FINAL_SIGTERM)
                service_enter_signal(s, SERVICE_FINAL_SIGKILL, SERVICE_SUCCESS);


I can understand that once the main PID goes away, systemd feels
welcome to get heavy handed with the remaining processes. But doing
SIGTERM and then immediately SIGKILL a fraction of a millisecond later
seems strange - why not go straight for the SIGKILL?

There's a comment in unit_kill_context() which looks relevant here:

                        /* FIXME: For now, we will not wait for the
                         * cgroup members to die, simply because
                         * cgroup notification is unreliable. It
                         * doesn't work at all in containers, and
                         * outside of containers it can be confused
                         * easily by leaving directories in the
                         * cgroup. */

                        /* wait_for_exit = true; */

If that were uncommented, the above behaviour would be different.
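
To make the difference concrete, here is a rough model of steps 2-4 and
of what the commented-out line would change. This is a sketch for
illustration only, not systemd source: every name in it (model_stop,
model_kill_context, the wait_for_cgroup flag, the event enum) is made up,
and it assumes the main and control PIDs are already gone, so only the
cgroup gets signalled.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified model for illustration only; this is not systemd source. */
enum model_state { MODEL_FINAL_SIGTERM, MODEL_FINAL_SIGKILL };
enum model_event { EV_CGROUP_SIGTERM, EV_CGROUP_SIGKILL, EV_DEAD };

#define MODEL_MAX_EVENTS 4

struct model_trace {
    enum model_event events[MODEL_MAX_EVENTS]; /* what was sent, in order */
    int n;                                     /* number of recorded events */
    bool timer_armed;                          /* did we stop to wait? */
};

static void model_record(struct model_trace *t, enum model_event e) {
    assert(t->n < MODEL_MAX_EVENTS);
    t->events[t->n++] = e;
}

/* Stands in for unit_kill_context() with no main or control PID left.
 * wait_for_cgroup plays the role of the commented-out
 * "wait_for_exit = true" line (hypothetical parameter). */
static bool model_kill_context(struct model_trace *t, enum model_state s,
                               bool wait_for_cgroup) {
    bool wait_for_exit = false;

    model_record(t, s == MODEL_FINAL_SIGTERM ? EV_CGROUP_SIGTERM
                                             : EV_CGROUP_SIGKILL);
    if (wait_for_cgroup)
        wait_for_exit = true;

    return wait_for_exit;
}

/* Stands in for service_enter_signal(): arm a timer if asked to wait,
 * otherwise escalate straight away. */
static void model_enter_signal(struct model_trace *t, enum model_state s,
                               bool wait_for_cgroup) {
    if (model_kill_context(t, s, wait_for_cgroup)) {
        t->timer_armed = true; /* what happens on expiry is not modelled */
        return;
    }

    if (s == MODEL_FINAL_SIGTERM)
        model_enter_signal(t, MODEL_FINAL_SIGKILL, wait_for_cgroup);
    else
        model_record(t, EV_DEAD);
}

/* Entry point: the state reached once stop_post has no command to run. */
struct model_trace model_stop(bool wait_for_cgroup) {
    struct model_trace t = {0};

    model_enter_signal(&t, MODEL_FINAL_SIGTERM, wait_for_cgroup);
    return t;
}
```

With model_stop(false), which mirrors the code as it is today, the trace
is SIGTERM, SIGKILL, dead, back to back, and the timer is never armed.
With model_stop(true) the trace stops after the SIGTERM and the timer is
armed instead.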

Daniel

