[systemd-devel] Zombie process still exists after stopping gdm.service

Daniel Drake drake at endlessm.com
Fri Apr 17 13:04:18 PDT 2015


Hi,

I'm investigating why "systemctl stop gdm; Xorg" usually fails. The
new X process complains that X is still running.

Here's what I think is happening:

1. systemd sends SIGTERM to gdm to stop the service

2. gdm exits - it has a simple SIGTERM handler which just quits the
mainloop without doing any cleanup (as far as I can see, it doesn't
make any attempt to kill the child X server)

3. X exits because of PR_SET_PDEATHSIG (i.e. it's set to be
automatically killed when the parent goes away). The killed process
enters defunct state and is reparented to PID 1, presumably also
moving it out of the gdm cgroup.

4. systemd notes that gdm's cgroup is empty and decides that gdm is
now successfully stopped.

5. systemctl returns and Xorg is launched immediately. Xorg reads the
PID of the old Xorg process from its lock file in /tmp and concludes
that the PID is still in use, because kill() on that PID doesn't
return an error (the old process still exists as an unreaped zombie).
Xorg aborts, thinking that it is already running.

6. Moments later, systemd reaps the zombie. Oops, too late.
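
To illustrate steps 5 and 6, here is a minimal Python sketch (not taken
from Xorg or systemd; the function name is made up for illustration)
showing that a signal-0 kill() probe still succeeds on a zombie and only
starts failing once the zombie has been reaped. It assumes Linux and
Python 3.

```python
import os


def zombie_pid_looks_alive():
    """Return (alive_before_reap, alive_after_reap) for an exited child."""
    pid = os.fork()
    if pid == 0:
        os._exit(0)  # child: exit at once; it stays a zombie until reaped

    # Block until the child has exited, but leave it unreaped (WNOWAIT),
    # so that it is guaranteed to be a zombie at this point.
    os.waitid(os.P_PID, pid, os.WEXITED | os.WNOWAIT)

    def probe():
        try:
            os.kill(pid, 0)  # signal 0: existence check, nothing is sent
            return True
        except ProcessLookupError:
            return False

    before = probe()    # zombie still occupies the PID
    os.waitpid(pid, 0)  # reap the zombie, freeing the PID
    after = probe()
    return before, after


if __name__ == "__main__":
    print(zombie_pid_looks_alive())
```

On my understanding this prints (True, False): the PID looks "in use"
until the parent reaps it, which is the window the new Xorg falls into.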


Does that make sense?
I wonder how best to fix this. Is it a bug that systemd decided
that gdm.service had stopped before it had reaped zombie processes
that originally belonged to gdm?

Or is it a gdm bug that gdm makes no attempt to stop and reap X before
going away itself? (They chose PR_SET_PDEATHSIG to do something
similar, but maybe we have to argue that it is not quite sufficient.)
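
For comparison, this is roughly what "reap X before going away" could
look like in a parent process. It is only a sketch in Python, not gdm's
actual code; stop_and_reap and the timeout value are hypothetical.

```python
import signal
import subprocess
import sys


def stop_and_reap(child, timeout=5.0):
    """Terminate a child and return only once it has been reaped."""
    child.terminate()  # sends SIGTERM
    try:
        child.wait(timeout=timeout)  # wait() reaps the child
    except subprocess.TimeoutExpired:
        child.kill()  # escalate to SIGKILL if SIGTERM was ignored
        child.wait()
    return child.returncode


if __name__ == "__main__":
    # Stand-in for the X server: a child that would otherwise run for a minute.
    child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
    rc = stop_and_reap(child)
    print(rc == -signal.SIGTERM)
```

With this ordering the child's PID is already free by the time the
parent exits, so there is no zombie left for PID 1 to clean up later.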

Thanks
Daniel

