[systemd-devel] Allow stop jobs to be killed during shutdown

Andrey Borzenkov arvidjaar at gmail.com
Sun Jan 26 05:18:28 PST 2014


On Sun, 26 Jan 2014 12:09:23 +0400
Andrey Borzenkov <arvidjaar at gmail.com> wrote:

> On Fri, 24 Jan 2014 18:46:06 +0100
> Lennart Poettering <lennart at poettering.net> wrote:
> 
> > On Fri, 24.01.14 21:10, Ivan Shapovalov (intelfx100 at gmail.com) wrote:
> > 
> > > > > > However, something like that can never be the default, we need to give
> > > > > > services the chance to shut down cleanly and in the right order
> > > > > 
> > > > > then bugs like https://bugzilla.redhat.com/show_bug.cgi?id=1023820
> > > > 
> > > > I have so far never encountered this issue, but I fear this is a bug
> > > > where somebody who can reproduce this needs to sit down and debug a
> > > > bit...
> > > > 
> > > > Lennart
> > > 
> > > Any advice on how to do that?
> > > I have both the issue (reproducible on each shutdown) and the will to debug.
> > 
> > Well, enable the debug shell, and then from there try to figure out why
> > things are stuck. i.e. whether it is systemd --user that really never
> > exits. Or whether it actually exits but PID 1 doesn't notice it. And
> > then if you figured out which of the two cases, you'd have to figure out
> > why that is...
> > 
> 
> 
> I finally managed to reproduce it with the user instance running at debug
> level (until now, *any* attempt to add debugging, strace, or anything else
> made the problem disappear).
> 
> It seems that /bin/kill -RTMIN+24 is itself being killed. I wonder: is
> it possible that it is hit by the same SIGTERM that PID 1 uses to stop
> user@0.service?
> 

I'm almost sure it is. cg_kill_recursive is in no way atomic, so it can
easily hit a new process that was spawned after the service stop was
initiated.
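
To make the suspected race concrete, here is a toy model in Python. It is
not systemd code: the "cgroup" is just a set of PIDs, and the sweep is
assumed to re-read the member list until a pass finds nobody new. The PIDs
are taken from the log below (1942 is the user manager, 1943 is (sd-pam),
1951 is the forked kill); the manager_reacts hook is a hypothetical
stand-in for the manager forking its helper after it gets SIGTERM.

```python
# Toy model of a non-atomic "kill everything in the cgroup" sweep.
# Nothing here talks to real processes or to systemd.

class ToyCgroup:
    def __init__(self, pids):
        self.pids = set(pids)   # current members of the cgroup
        self.signalled = []     # PIDs in the order they were signalled


def kill_sweep(cgroup, on_signal):
    """Signal every member, re-reading the list until no new PID shows up."""
    done = set()
    while True:
        pending = sorted(cgroup.pids - done)
        if not pending:
            return
        for pid in pending:
            cgroup.signalled.append(pid)
            done.add(pid)
            on_signal(cgroup, pid)  # the signalled process may react


def manager_reacts(cgroup, pid):
    # Hypothetical reaction: once the manager (1942) is signalled, it forks
    # a helper (1951) into the same cgroup while the sweep is still running.
    if pid == 1942:
        cgroup.pids.add(1951)


cg = ToyCgroup({1942, 1943})
kill_sweep(cg, manager_reacts)
print(cg.signalled)  # [1942, 1943, 1951]
```

In this model the helper did not exist when the stop was initiated, yet it
is signalled by the same sweep, which is the pattern the log suggests.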

Unfortunately, setting KillMode=process is not allowed:

Jan 26 17:12:30 linux-1a7f systemd[1]: user@0.service has PAM enabled. Kill mode must be set to 'control-group'. Refusing.

Probably user@.service should be exempt from this rule. It is supposed
to handle all the services it starts by itself; it *is* a service manager,
after all.

> Jan 26 11:53:58 linux-1a7f systemd[1942]: Received SIGTERM from PID 1 (systemd).
> Jan 26 11:53:58 linux-1a7f systemd[1942]: Activating special unit exit.target
> Jan 26 11:53:58 linux-1a7f systemd[1942]: Trying to enqueue job exit.target/start/replace
> Jan 26 11:53:58 linux-1a7f systemd[1942]: Installed new job exit.target/start as 3
> Jan 26 11:53:58 linux-1a7f systemd[1942]: Installed new job systemd-exit.service/start as 4
> Jan 26 11:53:58 linux-1a7f systemd[1942]: Installed new job shutdown.target/start as 5
> Jan 26 11:53:58 linux-1a7f systemd[1942]: Installed new job default.target/stop as 7
> Jan 26 11:53:58 linux-1a7f systemd[1942]: Enqueued job exit.target/start as 3
> Jan 26 11:53:58 linux-1a7f systemd[1942]: Stopping Default.
> Jan 26 11:53:58 linux-1a7f systemd[1942]: default.target changed active -> dead
> Jan 26 11:53:58 linux-1a7f systemd[1942]: Job default.target/stop finished, result=done
> Jan 26 11:53:58 linux-1a7f systemd[1942]: Stopped target Default.
> Jan 26 11:53:58 linux-1a7f systemd[1942]: Starting Shutdown.
> Jan 26 11:53:58 linux-1a7f systemd[1942]: shutdown.target changed dead -> active
> Jan 26 11:53:58 linux-1a7f systemd[1942]: Job shutdown.target/start finished, result=done
> Jan 26 11:53:59 linux-1a7f systemd[1942]: Reached target Shutdown.
> Jan 26 11:53:59 linux-1a7f systemd[1942]: Starting Exit the Session...
> Jan 26 11:53:59 linux-1a7f systemd[1942]: About to execute: /usr/bin/kill -s 58 $MANAGERPID
> Jan 26 11:53:59 linux-1a7f systemd[1942]: Forked /usr/bin/kill as 1951
> Jan 26 11:53:59 linux-1a7f systemd[1942]: systemd-exit.service changed dead -> start
> Jan 26 11:53:59 linux-1a7f systemd[1942]: Set up jobs progress timerfd.
> Jan 26 11:53:59 linux-1a7f systemd[1942]: Collecting default.target
> Jan 26 11:53:59 linux-1a7f systemd[1942]: Received SIGCHLD from PID 1943 ((sd-pam)).
> Jan 26 11:53:59 linux-1a7f systemd[1942]: Got SIGCHLD for process 1943 ((sd-pam))
> Jan 26 11:53:59 linux-1a7f systemd[1942]: Child 1943 died (code=exited, status=0/SUCCESS)
> Jan 26 11:53:59 linux-1a7f systemd[1942]: Received SIGCHLD from PID 1951 ((kill)).
> Jan 26 11:53:59 linux-1a7f systemd[1942]: Got SIGCHLD for process 1951 ((kill))
> Jan 26 11:53:59 linux-1a7f systemd[1942]: Child 1951 died (code=killed, status=15/TERM)
> Jan 26 11:53:59 linux-1a7f systemd[1942]: Child 1951 belongs to systemd-exit.service
> Jan 26 11:53:59 linux-1a7f systemd[1942]: systemd-exit.service: main process exited, code=killed, status=15/TERM
> Jan 26 11:53:59 linux-1a7f systemd[1942]: systemd-exit.service changed start -> dead
> Jan 26 11:53:59 linux-1a7f systemd[1942]: Job systemd-exit.service/start finished, result=done
> Jan 26 11:53:59 linux-1a7f systemd[1942]: Started Exit the Session.
> Jan 26 11:53:59 linux-1a7f systemd[1942]: Closed jobs progress timerfd.
> Jan 26 11:53:59 linux-1a7f systemd[1942]: Starting Exit the Session.
> Jan 26 11:53:59 linux-1a7f systemd[1942]: exit.target changed dead -> active
> Jan 26 11:53:59 linux-1a7f systemd[1942]: Job exit.target/start finished, result=done
> Jan 26 11:53:59 linux-1a7f systemd[1942]: Reached target Exit the Session.
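
A side note on the log above: the "kill -s 58" it executes should be the
same signal as the -RTMIN+24 mentioned earlier, assuming glibc on Linux,
where SIGRTMIN is normally 34. A quick way to check on a given machine,
sketched in Python:

```python
import signal

# SIGRTMIN is platform-dependent; 34 is the usual value with glibc on Linux
# (an assumption, not something the log states).
rtmin = int(signal.SIGRTMIN)
rtmin24 = rtmin + 24

# If rtmin is 34, this prints "34 58", matching "kill -s 58" in the log.
print(rtmin, rtmin24)
```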


