[systemd-devel] Allow stop jobs to be killed during shutdown

Andrey Borzenkov arvidjaar at gmail.com
Wed Jan 29 07:29:53 PST 2014


В Mon, 27 Jan 2014 15:48:50 +0100
Lennart Poettering <lennart at poettering.net> пишет:

> On Sun, 26.01.14 12:09, Andrey Borzenkov (arvidjaar at gmail.com) wrote:
> 
> > > > Any advices on how to do that?
> > > > I have both the issue (reproducible on each shutdown) and will to debug.
> > > 
> > > Well, enable the debug shell, and then from there try to figure out why
> > > things are stuck. i.e. whether it is systemd --user that really never
> > > exits. Or whether it actually exits but PID 1 doesn't notice it. And
> > > then if you figured out which of the two cases, you'd have to figure out
> > > why that is...
> > > 
> > I finally managed to reproduce it with user instance running with debug
> > level (before *any* attempt to add debugging, strace, whatever resulted
> > in problem disappearing).
> > 
> > It seems that /bin/kill -RTMIN+24 is being killed itself. I wonder - is
> > it possible that it is the same SIGTERM that is used by PID 1 to stop
> > user at 0service?
> 
> Ah, bummer! Yikes!
> 
> Thanks for tracking this done, this really sounds like you nailed the
> problem. Now, how to fix it?
> 
> Hmm, so, I would claim this is a shortcoming of
> KillMode=control-group, which is the default for everything. There has
> been an item on the TODO list to maybe introduce a KillMode=mixed
> setting, which would send SIGTERM only to the main process, and the
> SIGKILL later on to all processes. I am pretty sure that this would
> solve the issue at hand quite nicely here, because the systemd user
> instance would get a nice chance to clean up its own act, before the
> systemd system instance would make tabula rasa...
> 

I still favor alternative approach - let systemd wait for main PID
to exit after ExecStop instead. This is functionally equivalent to the
above with slight advantages

- it will probably decrease number of timeouts because systemd will go
  on killing spree as soon as main PID exits, not after final timeout.

- it is more generic as it allows any available method to trigger
  service stop, not just a signal.

Comments are welcome :)

From: Andrey Borzenkov <arvidjaar at gmail.com>
Subject: [PATCH] add WaitForMainPIDOnStop option

WaitForMainPIDOnStop=true will wait for exit of main PID in addition to
command set as ExecStop.

Use it in user at .service to allow user systemd to complete exit transaction
before starting final kill.

---
 man/systemd.service.xml               | 13 +++++++++++++
 src/core/dbus-service.c               |  1 +
 src/core/load-fragment-gperf.gperf.m4 |  1 +
 src/core/service.c                    | 15 +++++++++++++--
 src/core/service.h                    |  1 +
 units/user at .service.in                |  2 ++
 6 files changed, 31 insertions(+), 2 deletions(-)

diff --git a/man/systemd.service.xml b/man/systemd.service.xml
index d316ab5..c121c95 100644
--- a/man/systemd.service.xml
+++ b/man/systemd.service.xml
@@ -1016,6 +1016,19 @@ ExecStart=/bin/echo $ONE $TWO ${TWO}
                                 <option>none</option>.</para></listitem>
                         </varlistentry>
 
+                        <varlistentry>
+                                <term><varname>WaitForMainPIDOnStop=</varname></term>
+
+                                <listitem><para>Takes a boolean value
+                                that specifies whether systemd should
+                                additionally wait for the main PID of a service
+                                to exit after executing ExecStop command.
+                                Default is to wait for completion of ExecStop
+                                command only.  Defaults to
+                                <option>no</option>.</para>
+                                </listitem>
+                        </varlistentry>
+
                 </variablelist>
 
                 <para>Check
diff --git a/src/core/dbus-service.c b/src/core/dbus-service.c
index 3db9339..35a3c2f 100644
--- a/src/core/dbus-service.c
+++ b/src/core/dbus-service.c
@@ -54,6 +54,7 @@ const sd_bus_vtable bus_service_vtable[] = {
         SD_BUS_PROPERTY("RootDirectoryStartOnly", "b", bus_property_get_bool, offsetof(Service, root_directory_start_only), SD_BUS_VTABLE_PROPERTY_CONST),
         SD_BUS_PROPERTY("RemainAfterExit", "b", bus_property_get_bool, offsetof(Service, remain_after_exit), SD_BUS_VTABLE_PROPERTY_CONST),
         SD_BUS_PROPERTY("GuessMainPID", "b", bus_property_get_bool, offsetof(Service, guess_main_pid), SD_BUS_VTABLE_PROPERTY_CONST),
+        SD_BUS_PROPERTY("WaitForMainPIDOnStop", "b", bus_property_get_bool, offsetof(Service, stop_waits_for_main_pid), SD_BUS_VTABLE_PROPERTY_CONST),
         SD_BUS_PROPERTY("MainPID", "u", bus_property_get_pid, offsetof(Service, main_pid), SD_BUS_VTABLE_PROPERTY_EMITS_CHANGE),
         SD_BUS_PROPERTY("ControlPID", "u", bus_property_get_pid, offsetof(Service, control_pid), SD_BUS_VTABLE_PROPERTY_EMITS_CHANGE),
         SD_BUS_PROPERTY("BusName", "s", NULL, offsetof(Service, bus_name), SD_BUS_VTABLE_PROPERTY_CONST),
diff --git a/src/core/load-fragment-gperf.gperf.m4 b/src/core/load-fragment-gperf.gperf.m4
index 59b2a64..85aaef7 100644
--- a/src/core/load-fragment-gperf.gperf.m4
+++ b/src/core/load-fragment-gperf.gperf.m4
@@ -167,6 +167,7 @@ Service.PermissionsStartOnly,    config_parse_bool,                  0,
 Service.RootDirectoryStartOnly,  config_parse_bool,                  0,                             offsetof(Service, root_directory_start_only)
 Service.RemainAfterExit,         config_parse_bool,                  0,                             offsetof(Service, remain_after_exit)
 Service.GuessMainPID,            config_parse_bool,                  0,                             offsetof(Service, guess_main_pid)
+Service.WaitForMainPIDOnStop,    config_parse_bool,                  0,                             offsetof(Service, stop_waits_for_main_pid)
 Service.RestartPreventExitStatus, config_parse_set_status,           0,                             offsetof(Service, restart_ignore_status)
 Service.SuccessExitStatus,       config_parse_set_status,            0,                             offsetof(Service, success_status)
 m4_ifdef(`HAVE_SYSV_COMPAT',
diff --git a/src/core/service.c b/src/core/service.c
index d949f7a..8e0eb6c 100644
--- a/src/core/service.c
+++ b/src/core/service.c
@@ -1269,6 +1269,7 @@ static void service_dump(Unit *u, FILE *f, const char *prefix) {
                 "%sRootDirectoryStartOnly: %s\n"
                 "%sRemainAfterExit: %s\n"
                 "%sGuessMainPID: %s\n"
+                "%sWaitForMainPIDOnStop: %s\n"
                 "%sType: %s\n"
                 "%sRestart: %s\n"
                 "%sNotifyAccess: %s\n",
@@ -1279,6 +1280,7 @@ static void service_dump(Unit *u, FILE *f, const char *prefix) {
                 prefix, yes_no(s->root_directory_start_only),
                 prefix, yes_no(s->remain_after_exit),
                 prefix, yes_no(s->guess_main_pid),
+                prefix, yes_no(s->stop_waits_for_main_pid),
                 prefix, service_type_to_string(s->type),
                 prefix, service_restart_to_string(s->restart),
                 prefix, notify_access_to_string(s->notify_access));
@@ -2953,11 +2955,17 @@ static void service_sigchld_event(Unit *u, pid_t pid, int code, int status) {
 
                         case SERVICE_START_POST:
                         case SERVICE_RELOAD:
-                        case SERVICE_STOP:
                                 /* Need to wait until the operation is
                                  * done */
                                 break;
 
+                        case SERVICE_STOP:
+                                /* If requested, wait for both main and control
+                                   PID to finish */
+                                if (s->stop_waits_for_main_pid && !control_pid_good(s))
+                                        service_enter_signal(s, SERVICE_STOP_SIGTERM, f);
+                                break;
+
                         case SERVICE_START:
                                 if (s->type == SERVICE_ONESHOT) {
                                         /* This was our main goal, so let's go on */
@@ -3116,7 +3124,10 @@ static void service_sigchld_event(Unit *u, pid_t pid, int code, int status) {
                                 break;
 
                         case SERVICE_STOP:
-                                service_enter_signal(s, SERVICE_STOP_SIGTERM, f);
+                                /* If requested, wait for both main and control
+                                   PID to finish */
+                                if (!s->stop_waits_for_main_pid || main_pid_good(s) <= 0)
+                                        service_enter_signal(s, SERVICE_STOP_SIGTERM, f);
                                 break;
 
                         case SERVICE_STOP_SIGTERM:
diff --git a/src/core/service.h b/src/core/service.h
index 1992926..919d54b 100644
--- a/src/core/service.h
+++ b/src/core/service.h
@@ -163,6 +163,7 @@ struct Service {
         bool root_directory_start_only;
         bool remain_after_exit;
         bool guess_main_pid;
+        bool stop_waits_for_main_pid;
 
         /* If we shut down, remember why */
         ServiceResult result;
diff --git a/units/user at .service.in b/units/user at .service.in
index bfc9b70..ac3caf3 100644
--- a/units/user at .service.in
+++ b/units/user at .service.in
@@ -14,4 +14,6 @@ User=%i
 PAMName=systemd-user
 Type=notify
 ExecStart=- at rootlibexecdir@/systemd --user
+ExecStop=@KILL@ -TERM $MANAGERPID
+WaitForMainPIDOnStop=yes
 Slice=user-%i.slice
-- 
tg: (2d5bdf5..) u/wait_for_mainPID_on_stop (depends on: master)


More information about the systemd-devel mailing list