[systemd-devel] Default on failure dependencies

Baudouin Feildel baudouin_systemd at feildel.fr
Tue Oct 9 07:40:25 UTC 2018


On 6 October 2018 14:22, "Lennart Poettering" <lennart at poettering.net> wrote:

> On Sa, 15.09.18 22:32, Baudouin Feildel (baudouin_systemd at feildel.fr) wrote:
> 
> (Sorry for not responding more timely, I have been travelling and am
> still catching up with all the email)

No problem, we are all busy with tons of email these days... Thank you
for taking the time to understand the need and give a complete answer.

>> Hello there,
>> 
>> A few weeks ago I opened the following issue in the systemd repository:
>> https://github.com/systemd/systemd/issues/9373. Seeing no traction from
>> existing systemd developers,
> 
> Hmm, so, I figure we should have a discussion whether this really is
> desirable first, because I am not too sure about that I must say.
> 
> So far we are very conservative when it comes to options that are
> supposed to affect all units at once, as that tends to create various
> problems that are not obvious to solve. For example, if every service
> gets this kind of dep, what about the units that these deps are
> supposed to start, do you create a cyclic dep there?
> 
> Moreover, I figure the services pulled in like this are usually going
> to be late boot processes, but this means failures during early boot
> would result in a large number of queued services that need to be
> dispatched during late boot.
> 
> Moreover what happens if a service fails multiple times during early
> boot (for example because Restart= is used)? What happens with these
> failures, are the earlier ones dropped?
> 
> Also, what happens for services that fail during shutdown, would these
> also pull in new units? But if they do, then this would result in
> cyclic operations if the service to run is a regular service,
> i.e. needs all basic system stuff up: we are shutting down, but in
> order to process everything that happened then we need to start
> services that reverse the shut down process as they require certain
> stuff to be up...
> 
> In general, there's the "philosophical incompatibility": stuff that
> is supposed to process failures in the service dependency logic,
> should probably not be part of the service dependency logic itself.

I completely agree with your analysis. I was worried about consequences I
could not foresee, and you have now shown a lot of potential trouble. I agree
that it would be better to find another solution to the problem of easy
service monitoring.

> This all makes me wonder whether a different approach to all of this
> wouldn't be better: maybe we should just consider this a logging
> problem: let's make sure we log a recognizable log message (i.e. a
> structured journal message with a well-defined MESSAGE_ID=) whenever a
> service fails. With that in place it should be relatively easy to
> write a system service that can run during regular system uptime and
> can look in the journal for all failures, including getting live
> notifications when something happens. Moreover, this resolves the
> problems during early and late boot: the "cursor" logic of the journal
> allows such a service to know exactly which failures it already
> processed and which ones are still left, and it can process all
> failures that took place while it was not running.
> 
> Does that make sense?
> 
> Lennart
> 
> --
> Lennart Poettering, Red Hat

Your proposal makes sense. I will try to have a proof of concept in November.
October is already full of work.
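
To make sure I understood the idea, here is a rough sketch of what such a
monitoring service could look like, assuming pid1 logs failures with a
fixed MESSAGE_ID=. It drives journalctl with --output=json, --follow and
--after-cursor=, and stores the cursor of the last handled entry in a file.
The MESSAGE_ID value below is only a placeholder, and the function names and
the cursor file are my own invention for illustration, not anything that
exists in systemd today.

```python
import json
import subprocess
from typing import Iterator, List, Optional

# Placeholder only: replace with the MESSAGE_ID that pid1 would use
# for "unit failed" messages. This value is an assumption.
FAILURE_MESSAGE_ID = "00000000000000000000000000000000"


def build_command(message_id: str, cursor: Optional[str]) -> List[str]:
    """Build a journalctl invocation that follows matching entries."""
    cmd = ["journalctl", "--output=json", "--follow", "--no-pager"]
    if cursor:
        # Resume right after the last entry that was already handled.
        cmd.append("--after-cursor=" + cursor)
    else:
        # No saved position: ask for all stored entries instead of
        # only the most recent ones.
        cmd.append("--lines=all")
    cmd.append("MESSAGE_ID=" + message_id)
    return cmd


def parse_entry(line: str) -> Optional[dict]:
    """Extract the fields we care about from one JSON journal line."""
    line = line.strip()
    if not line:
        return None
    entry = json.loads(line)
    return {
        "unit": entry.get("UNIT"),
        "message": entry.get("MESSAGE"),
        "cursor": entry["__CURSOR"],
    }


def follow_failures(cursor_path: str) -> Iterator[dict]:
    """Yield failures, old and new, remembering the position on disk."""
    try:
        with open(cursor_path) as f:
            cursor = f.read().strip() or None
    except FileNotFoundError:
        cursor = None
    proc = subprocess.Popen(
        build_command(FAILURE_MESSAGE_ID, cursor),
        stdout=subprocess.PIPE,
        text=True,
    )
    for line in proc.stdout:
        failure = parse_entry(line)
        if failure is None:
            continue
        yield failure
        # Runs once the caller asks for the next entry, i.e. after the
        # previous one has been handled.
        with open(cursor_path, "w") as f:
            f.write(failure["cursor"])
```

A caller would simply loop over follow_failures("/var/lib/myservice/cursor")
(a hypothetical path) and send a mail or notification for each entry. Since
the cursor is saved only after an entry has been handled, failures that
happened while the service was not running are picked up on the next start.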

I was also thinking about another solution, but I am not sure whether the
tooling for it is available in systemd. The idea is that pid1 could fire
some kind of event that any service can receive. One could then write a
service that subscribes to this event and, upon receiving it, does whatever
it wants. Does systemd have such an event system? Maybe this would be
possible over D-Bus, but is that sustainable?

Regards
Baudouin Feildel

