[systemd-devel] [PATCHv2] core: do not spawn jobs or touch other units during coldplugging

Fri Apr 24 07:23:37 PDT 2015

not yet marked)On 2015-04-24 at 15:52 +0200, Lennart Poettering wrote:
> On Wed, 25.02.15 21:40, Ivan Shapovalov (intelfx100 at gmail.com) wrote:
> 
> Ivan,
> 
> > Because the order of coldplugging is not defined, we can reference 
> > a
> > not-yet-coldplugged unit and read its state while it has not yet 
> > been
> > set to a meaningful value.
> > 
> > This way, already active units may get started again.
> 
> > We fix this by deferring such actions until all units have been at 
> > least
> > somehow coldplugged.
> > 
> > Fixes https://bugs.freedesktop.org/show_bug.cgi?id=88401
> 
> Hmm, so firstly, in this case, do those two alsa services 
> have RemainAfterExit=yes set? I mean, if they have not, they really
> should. I they have, then queuing jobs for them a second time is not
> really an issue, because the services are already running they will 
> be
> eaten up eventually.

They do not, but this is actually irrelevant to the root issue.
Setting RemainAfterExit=yes will simply hide it. Actually, in this bug
the basic.target is started second time.

IOW, the point is: no matter what is the configuration of units, none
of them should be re-run on reload (given no configuration changes).

> 
> But regarding the patch:
> 
> I am sorry, but we really should find a different way to fix this, if
> there really is an issue to fix...
> 
> I really don't like the hashmap that maps units to functions. I mean,
> the whole concept of unit vtables exists to encapsulate the
> unit-type-specific functions, and we should not add different place
> for configuring those.

I agree, this is not the cleanest solution. But at least it gets the
semantics right, and I've waited for comments/suggestions for ~1month
before actually sending this patch... Revert as you see fit, let's
just make sure we eventually come up with a solution.

> 
> Also, and more importantly, the whole coldplug function exists only 
> to
> seperate the unit loading and initial state changing into two 
> separate
> steps, so that we know that all units are fully loaded before we 
> start
> making state chnages.
> 
> Now, I see that this falls short of the issue at hand here,

Yes, exactly. The problem is that during coldplug we may accidentally
look at the state of not-yet-coldplugged units.

> but I
> think the right fix is really to alter the order in which we
> coldplug. More specifically, I think we need to make the coldplugging
> order dependency aware:
> 
> before we coldplug a unit, we should coldplug all units it might
> trigger, which are those with a listed UNIT_TRIGGERS dependency, as
> well as all those that retroactively_start_dependencies() and
> retroactively_stop_dependencies() operates on. Of course, we should
> also avoid running in loops here, but that should be easy by keeping 
> a
> per-unit coldplug boolean around.
> 
> Anyway, I reverted the patch for now, this really needs more 
> thinking.

I think I agree with this idea. I just didn't know how to handle
potentially unbounded recursion. Maybe we can do something along these
lines (pseudocode):

while (any units left to coldplug)
    for (unit in hashmap)
        if (not yet marked)
            if (all deps of unit are coldplugged)
                coldplug_and_mark(unit);

That is, we will repeatedly loop over hashmap, coldplugging only those
units whose UNIT_TRIGGERS are already coldplugged, and leaving other
units for next outer iteration.

Makes sense?

-- 
Ivan Shapovalov / intelfx /
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 213 bytes
Desc: This is a digitally signed message part
URL: <http://lists.freedesktop.org/archives/systemd-devel/attachments/20150424/085c5422/attachment.sig>