[systemd-devel] unable to attach pid to service delegated directory in unified mode after restart

Felip Moll felip at schedmd.com
Mon Mar 14 22:12:58 UTC 2022


Hi folks. I have continued investigating the best way to solve my
problem.
As suggested, I am calling the StartTransientUnit method over dbus (using
libdbus) to start a new scope.
Below are my impressions.

> Firing an async D-Bus packet to systemd should be hardly measurable.
>
> But note that you can also run your main service as a service, and
> then allocate a *single* scope unit for *all* your payloads.


The main issue is that the scope needs a pid attached to it. I thought that
the scope could live without any process inside, but that is not the case:
systemd cleans the scope up once it is empty.
So every time a user step/job finishes, my main process must notice it and
launch the scope again for the next incoming job.
There is also a race condition when one job is finishing while another is
starting up: at that point the scope can be destroyed without the main
process realizing it.

I also tried to leave the responsibility of setting up the scope to the
forked process itself, which is much easier to code and cleaner because of
how the software is designed.
The forked process just does the dbus call, and when the scope is ready it
is moved to the corresponding cgroup (PIDs=).
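
For reference, a minimal sketch of the request described above, written as
a busctl command line rather than as libdbus code. The helper name
start_scope_argv, the pid 1234 and the choice to set Delegate= are
assumptions made for illustration; only the StartTransientUnit method and
the scope name come from this thread.

```python
import shlex


def start_scope_argv(scope_name, pid, delegate=True):
    """Build a busctl command that asks systemd for a transient scope.

    Hypothetical helper: it only assembles the argument list and does not
    talk to the bus itself.
    """
    # Each property is: name, variant signature, value(s).
    props = [
        "PIDs", "au", "1", str(pid),                       # one pid to place in the scope
        "Delegate", "b", "true" if delegate else "false",  # assumed for this use case
    ]
    n_props = 2
    return [
        "busctl", "call",
        "org.freedesktop.systemd1",
        "/org/freedesktop/systemd1",
        "org.freedesktop.systemd1.Manager",
        "StartTransientUnit",
        "ssa(sv)a(sa(sv))",   # name, mode, properties, auxiliary units
        scope_name,
        "fail",               # job mode: fail if a conflicting job is queued
        str(n_props), *props,
        "0",                  # no auxiliary units
    ]


if __name__ == "__main__":
    # Print the command instead of running it; 1234 is a placeholder pid.
    print(shlex.join(start_scope_argv("slurmstepd.scope", 1234)))
```

Running the resulting command (for example with subprocess.run) should
return the object path of the queued job, not a finished scope, which is
relevant to problem number two below.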

Problem number one: if the scope already exists with other processes in it,
the dbus call fails because I always use the same name, e.g.
slurmstepd.scope.
So I first need to check whether the scope exists and, if so, put the new
slurmstepd process inside it. But we still have the race condition: if all
steps end during this phase, systemd will do the cleanup.
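
One way to put a new process into an already running scope is the
manager's AttachProcessesToUnit method. The sketch below only builds the
corresponding busctl command; the helper name attach_argv and the pid are
hypothetical, and an empty sub-cgroup path is assumed to mean the scope's
own cgroup. It does not remove the race: the call can still fail if the
scope disappears first, so the caller would have to fall back to creating
the scope again.

```python
def attach_argv(unit_name, pids, subcgroup=""):
    """Build a busctl command that moves pids into an existing unit.

    Hypothetical helper: it assembles arguments only and runs nothing.
    """
    pid_args = [str(p) for p in pids]
    return [
        "busctl", "call",
        "org.freedesktop.systemd1",
        "/org/freedesktop/systemd1",
        "org.freedesktop.systemd1.Manager",
        "AttachProcessesToUnit",
        "ssau",                 # unit name, sub-cgroup path, array of pids
        unit_name,
        subcgroup,              # assumed: "" targets the unit's own cgroup
        str(len(pid_args)), *pid_args,
    ]


if __name__ == "__main__":
    # 4321 is a placeholder pid for a newly forked slurmstepd.
    print(attach_argv("slurmstepd.scope", [4321]))
```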

Problem number two: there is a significant delay between requesting the
scope and the scope being ready with the pid attached to it. The only way
I got it to work was to put a 'sleep' after the dbus call, so that my
process waits for the async call to take effect. This is really inelegant.
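
A possible alternative to the 'sleep': StartTransientUnit returns the
object path of a job, and systemd emits a JobRemoved signal on the manager
interface when that job finishes, so the process could wait for the
matching signal. The sketch below shows only the matching logic. The
signals argument is a stand-in for a real signal subscription (which would
need to be in place before the call is issued), and the function name and
job paths are made up for illustration.

```python
def wait_for_job(signals, job_path):
    """Return the result string of the JobRemoved signal for job_path.

    signals is any iterable of (job_id, job_path, unit_name, result)
    tuples, mirroring the arguments of systemd's JobRemoved signal. In a
    real program it would be fed by a D-Bus subscription; here it is a
    plain Python iterable.
    """
    for _job_id, path, _unit, result in signals:
        if path == job_path:
            return result  # "done" on success; other values indicate failure
    raise RuntimeError("signal stream ended before the job was removed")


if __name__ == "__main__":
    # Hypothetical signal stream: an unrelated job finishes first.
    stream = [
        (101, "/org/freedesktop/systemd1/job/101", "other.service", "done"),
        (102, "/org/freedesktop/systemd1/job/102", "slurmstepd.scope", "done"),
    ]
    result = wait_for_job(stream, "/org/freedesktop/systemd1/job/102")
    print(result)
```

Only after the result is "done" would the forked process continue, which
replaces a fixed delay with an explicit completion event.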


> That way
> you can restart your main service unit independently of the scope
> unit, but you only have to issue a single request once for allocating
> the scope, and not for each of your payloads.
>
>
Yes. That is solved, I can restart slurmd now, but the other part is not
true as I just explained.
I need to issue new requests every time the scope is cleaned up by systemd.


> But that too means you have to issue a bus call. If you really don't
> like talking to systemd this is not going to work of course, but quite
> frankly, that's a problem you are making yourself, and I am not
> particularly sympathetic to it.
>
>
Talking to systemd is not a problem, but the delay in creating a scope,
plus the scope being removed all the time, is unacceptable.

My only idea now is to start a scope from the main process, put a "sleep
infinity" process inside it to keep it alive, and relieve everything else
from ever creating a scope or calling dbus.
If instead I could just ask systemd to delegate a part of the tree to my
processes, then everything would be solved.

Do you have any other suggestions?