[systemd-devel] [PATCHv2] core: do not spawn jobs or touch other units during coldplugging

Ivan Shapovalov intelfx100 at gmail.com
Fri Apr 24 10:06:29 PDT 2015


On 2015-04-24 at 16:20 +0200, Lennart Poettering wrote:

> On Fri, 24.04.15 16:04, Lennart Poettering (lennart at poettering.net) wrote:
>
> > On Fri, 24.04.15 15:52, Lennart Poettering (lennart at poettering.net) wrote:
> >
> > > before we coldplug a unit, we should coldplug all units it might
> > > trigger, which are those with a listed UNIT_TRIGGERS dependency, as
> > > well as all those that retroactively_start_dependencies() and
> > > retroactively_stop_dependencies() operates on. Of course, we should
> > > also avoid running in loops here, but that should be easy by keeping a
> > > per-unit coldplug boolean around.
> >
> > Actually, it really is about the UNIT_TRIGGERS dependencies only,
> > since we don't do the retroactive deps stuff at all when we are
> > coldplugging, it's conditionalized in m->n_reloading <= 0.
> 
> I have implemented this now in git:
> 
> http://cgit.freedesktop.org/systemd/systemd/commit/?id=f78f265f405a61387c6c12a879ac0d6b6dc958db
> 
> Ivan, any chance you can check if this fixes your issue? (Not sure it
> does, because I must admit I am not entirely sure I really understood
> it fully...)

It does not seem to have helped.
I use the following patch to make the coldplugging order deterministic
(units are iterated from a hashmap, so the order is arbitrary anyway and
this alteration is valid):

==== cut patch here ====
diff --git a/src/core/manager.c b/src/core/manager.c
index f13dad5..542dd4f 100644
--- a/src/core/manager.c
+++ b/src/core/manager.c
@@ -975,6 +975,10 @@ int manager_enumerate(Manager *m) {
         return r;
 }
 
+static bool coldplug_first(Unit *u) {
+        return !(endswith(u->id, ".service") || endswith(u->id, ".target"));
+}
+
 static void manager_coldplug(Manager *m) {
         Iterator i;
         Unit *u;
@@ -990,6 +994,26 @@ static void manager_coldplug(Manager *m) {
                 if (u->id != k)
                         continue;
 
+                /* we need a reproducer */
+                if (!coldplug_first(u))
+                        continue;
+
+                r = unit_coldplug(u);
+                if (r < 0)
+                        log_warning_errno(r, "We couldn't coldplug %s, proceeding anyway: %m", u->id);
+        }
+
+        /* Process remaining units. */
+        HASHMAP_FOREACH_KEY(u, k, m->units, i) {
+
+                /* ignore aliases */
+                if (u->id != k)
+                        continue;
+
+                /* skip already processed units */
+                if (coldplug_first(u))
+                        continue;
+
                 r = unit_coldplug(u);
                 if (r < 0)
                         log_warning_errno(r, "We couldn't coldplug %s, proceeding anyway: %m", u->id);
==== cut patch here ====
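
To illustrate what the patch does to the ordering, here is a minimal
standalone sketch (not systemd code). has_suffix(), name_coldplug_first()
and two_pass_order() are hypothetical names made up for this example; only
the predicate mirrors coldplug_first() above. Everything that is not a
.service or .target goes into the first pass, the rest into the second:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for systemd's endswith(): true if s ends with suffix. */
static bool has_suffix(const char *s, const char *suffix) {
        size_t sl = strlen(s), pl = strlen(suffix);

        return sl >= pl && strcmp(s + sl - pl, suffix) == 0;
}

/* Same predicate as coldplug_first() in the patch, applied to a plain name. */
static bool name_coldplug_first(const char *id) {
        return !(has_suffix(id, ".service") || has_suffix(id, ".target"));
}

/* Copies the n names from ids into out in two passes and returns how many
 * names ended up in the first pass. out must have room for n entries. */
static size_t two_pass_order(const char *const *ids, size_t n, const char **out) {
        size_t k = 0, first;

        assert(ids && out);

        /* Pass 1: everything except services and targets. */
        for (size_t i = 0; i < n; i++)
                if (name_coldplug_first(ids[i]))
                        out[k++] = ids[i];
        first = k;

        /* Pass 2: the remaining services and targets. */
        for (size_t i = 0; i < n; i++)
                if (!name_coldplug_first(ids[i]))
                        out[k++] = ids[i];

        return first;
}
```

Whatever order the input comes in, a .path unit therefore always lands
before every service and target, while the relative order inside each pass
is left unchanged.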

With this patch applied, on `systemctl daemon-reload` I get the following:

==== cut log here ====
2015-04-24T19:42:05+0300 intelfx-laptop sudo[15870]: intelfx : TTY=pts/3 ; PWD=/home/intelfx/tmp/build/systemd ; USER=root ; COMMAND=/usr/bin/systemctl daemon-reload
2015-04-24T19:42:05+0300 intelfx-laptop sudo[15870]: pam_unix(sudo:session): session opened for user root by intelfx(uid=0)
2015-04-24T19:42:05+0300 intelfx-laptop polkitd[8629]: Registered Authentication Agent for unix-process:15871:1490725 (system bus name :1.239 [/usr/bin/pkttyagent --notify-fd 5 --fallback], object path /org/freedesktop/PolicyKit1/AuthenticationAgent, locale ru_RU.utf8)
2015-04-24T19:42:05+0300 intelfx-laptop systemd[1]: Reloading.
2015-04-24T19:42:05+0300 intelfx-laptop systemd[1]: Listening on /dev/initctl Compatibility Named Pipe.
2015-04-24T19:42:05+0300 intelfx-laptop systemd[1]: Found device LITEONIT_LSS-16L6G EFI.
2015-04-24T19:42:05+0300 intelfx-laptop systemd[1]: Listening on fsck to fsckd communication Socket.
2015-04-24T19:42:05+0300 intelfx-laptop systemd[1]: Set up automount var-lib-pacman-sync.automount.
2015-04-24T19:42:05+0300 intelfx-laptop systemd[1]: Started Daily Cleanup of Temporary Directories.
2015-04-24T19:42:05+0300 intelfx-laptop systemd[1]: Listening on RPCbind Server Activation Socket.
2015-04-24T19:42:05+0300 intelfx-laptop systemd[1]: Found device WDC_WD10JPVX-08JC3T5 datastore0.
2015-04-24T19:42:05+0300 intelfx-laptop systemd[1]: Found device WDC_WD10JPVX-08JC3T5 linux-build.
2015-04-24T19:42:05+0300 intelfx-laptop systemd[1]: Found device WDC_WD10JPVX-08JC3T5 datastore0.
2015-04-24T19:42:05+0300 intelfx-laptop systemd[1]: Found device LITEONIT_LSS-16L6G EFI.
2015-04-24T19:42:05+0300 intelfx-laptop systemd[1]: Activated swap Swap Partition.
2015-04-24T19:42:05+0300 intelfx-laptop systemd[1]: Found device WDC_WD10JPVX-08JC3T5 datastore0.
2015-04-24T19:42:05+0300 intelfx-laptop systemd[1]: Found device WDC_WD10JPVX-08JC3T5 linux-build.
2015-04-24T19:42:05+0300 intelfx-laptop systemd[1]: Mounted POSIX Message Queue File System.
2015-04-24T19:42:05+0300 intelfx-laptop systemd[1]: Created slice System Slice.
2015-04-24T19:42:05+0300 intelfx-laptop systemd[1]: Mounted /home.
2015-04-24T19:42:05+0300 intelfx-laptop systemd[1]: Found device LITEONIT_LSS-16L6G swap0.
2015-04-24T19:42:05+0300 intelfx-laptop systemd[1]: Found device WDC_WD10JPVX-08JC3T5 datastore0.
2015-04-24T19:42:05+0300 intelfx-laptop systemd[1]: Found device WDC_WD10JPVX-08JC3T5 linux-build.
2015-04-24T19:42:05+0300 intelfx-laptop systemd[1]: Started File System Check on /dev/disk/by-label/linux-build.
2015-04-24T19:42:05+0300 intelfx-laptop systemd[1]: Started Rebuild Journal Catalog.
2015-04-24T19:42:05+0300 intelfx-laptop systemd[1]: Started Set Up Additional Binary Formats.
2015-04-24T19:42:05+0300 intelfx-laptop systemd[1]: Started Rebuild Hardware Database.
2015-04-24T19:42:05+0300 intelfx-laptop systemd[1]: Reached target Paths.
2015-04-24T19:42:05+0300 intelfx-laptop systemd[1]: Started Setup Virtual Console.
2015-04-24T19:42:05+0300 intelfx-laptop systemd[1]: Started Create Static Device Nodes in /dev.
2015-04-24T19:42:05+0300 intelfx-laptop systemd[1]: Started Rebuild Dynamic Linker Cache.
2015-04-24T19:42:05+0300 intelfx-laptop systemd[1]: Reached target Basic System.
2015-04-24T19:42:05+0300 intelfx-laptop systemd[1]: Started Create Volatile Files and Directories.
2015-04-24T19:42:05+0300 intelfx-laptop systemd[1]: Started Create System Users.
2015-04-24T19:42:05+0300 intelfx-laptop systemd[1]: Reached target Swap.
2015-04-24T19:42:05+0300 intelfx-laptop systemd[1]: Reached target System Time Synchronized.
2015-04-24T19:42:05+0300 intelfx-laptop systemd[1]: Started udev Coldplug all Devices.
2015-04-24T19:42:05+0300 intelfx-laptop systemd[1]: Started Update UTMP about System Boot/Shutdown.
2015-04-24T19:42:05+0300 intelfx-laptop systemd[1]: Reached target Slices.
2015-04-24T19:42:05+0300 intelfx-laptop systemd[1]: Reached target System Initialization.
2015-04-24T19:42:05+0300 intelfx-laptop systemd[1]: Reached target Local File Systems (Pre).
2015-04-24T19:42:05+0300 intelfx-laptop systemd[1]: Reached target Local File Systems.
2015-04-24T19:42:05+0300 intelfx-laptop systemd[1]: Started Create list of required static device nodes for the current kernel.
2015-04-24T19:42:05+0300 intelfx-laptop systemd[1]: Started Update is Completed.
2015-04-24T19:42:05+0300 intelfx-laptop systemd[1]: Reached target Encrypted Volumes.
2015-04-24T19:42:05+0300 intelfx-laptop systemd[1]: Started Flush Journal to Persistent Storage.
2015-04-24T19:42:05+0300 intelfx-laptop systemd[1]: Started Remount Root and Kernel File Systems.
2015-04-24T19:42:05+0300 intelfx-laptop systemd[1]: Started Load/Save Random Seed.
2015-04-24T19:42:05+0300 intelfx-laptop systemd[1]: Started Apply Kernel Variables.
2015-04-24T19:42:05+0300 intelfx-laptop systemd[1]: Reached target Timers.
2015-04-24T19:42:05+0300 intelfx-laptop systemd[1]: Started Load Kernel Modules.
2015-04-24T19:42:05+0300 intelfx-laptop systemd[1]: Reached target Sockets.
2015-04-24T19:42:05+0300 intelfx-laptop systemd[1]: Started File System Check on /dev/disk/by-label/datastore0.
2015-04-24T19:42:05+0300 intelfx-laptop systemd[1]: Started Network Time Synchronization.
2015-04-24T19:42:05+0300 intelfx-laptop systemd[1]: Mounted Virtual Machine and Container Storage.
2015-04-24T19:42:05+0300 intelfx-laptop systemd[1]: Started Commit a transient machine-id on disk.
2015-04-24T19:42:05+0300 intelfx-laptop systemd[1]: Started First Boot Wizard.
2015-04-24T19:42:05+0300 intelfx-laptop systemd[1]: Listening on Journal Audit Socket.
2015-04-24T19:42:05+0300 intelfx-laptop systemd[1]: Started Manage Sound Card State (restore and store).
2015-04-24T19:42:05+0300 intelfx-laptop systemd[1]: Starting Restore Sound Card State...
2015-04-24T19:42:05+0300 intelfx-laptop systemd[1]: Started File System Check on Root Device.
2015-04-24T19:42:05+0300 intelfx-laptop systemd[1]: Started CUPS Scheduler.
2015-04-24T19:42:05+0300 intelfx-laptop sudo[15870]: pam_unix(sudo:session): session closed for user root
2015-04-24T19:42:05+0300 intelfx-laptop systemd[1]: Started Restore Sound Card State.
2015-04-24T19:42:05+0300 intelfx-laptop polkitd[8629]: Unregistered Authentication Agent for unix-process:15871:1490725 (system bus name :1.239, object path /org/freedesktop/PolicyKit1/AuthenticationAgent, locale ru_RU.utf8) (disconnected from bus)
==== cut log here ====

To reproduce, it takes three things:
- the order alteration patch (without it, the bug may also happen, but it
  will be nondeterministic with respect to which units are affected);
- any *.path unit which is active at the moment of reloading
  (in my case, it is org.cups.cupsd.path);
- any Type=oneshot / RemainAfterExit=false / WantedBy=basic.target unit
  (in my case, it is alsa-restore.service), though without it you'll still
  get the above messages due to basic.target being re-started.

I'll try to look into the code and see why your method fails...
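
For reference, this is how I read the approach quoted at the top (coldplug
whatever a unit may trigger before the unit itself, guarded by a per-unit
boolean), written as a toy model. It is only a sketch under that
assumption, not the code from the commit; ToyUnit, toy_coldplug() and
MAX_TRIGGERS are hypothetical names:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_TRIGGERS 4

/* Toy unit with only the fields needed for the ordering argument
 * (hypothetical, not systemd's Unit). */
typedef struct ToyUnit {
        const char *id;
        struct ToyUnit *triggers[MAX_TRIGGERS]; /* stands in for UNIT_TRIGGERS deps; unused slots are NULL */
        bool coldplugged;                        /* per-unit flag that breaks loops */
        int order;                               /* position at which this unit was coldplugged */
} ToyUnit;

/* Coldplugs everything u may trigger before u itself. next is the first
 * free position; the return value is the first free position afterwards. */
static int toy_coldplug(ToyUnit *u, int next) {
        assert(u);

        if (u->coldplugged)
                return next;

        /* Set the flag before recursing so that dependency cycles terminate. */
        u->coldplugged = true;

        for (size_t i = 0; i < MAX_TRIGGERS && u->triggers[i]; i++)
                next = toy_coldplug(u->triggers[i], next);

        u->order = next;
        return next + 1;
}
```

In this model a path unit that triggers a service always gets a higher
position than the service, regardless of which of the two the outer loop
visits first.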

-- 
Ivan Shapovalov / intelfx /
