[systemd-devel] Significant performance loss caused by commit a65f06b: journal: return -ECHILD after a fork

Lennart Poettering lennart at poettering.net
Mon Jul 10 11:39:50 UTC 2017


On Sat, 08.07.17 16:24, Michael Chapman (mike at very.puzzling.org) wrote:

> On Sat, 8 Jul 2017, vcaputo at pengaru.com wrote:
> > In doing some casual journalctl profiling and stracing, it became apparent
> > that `journalctl -b --no-pager` runs across a significant quantity of logs,
> > ~10% of the time was thrown away on getpid() calls due to commmit a65f06b.
> > 
> > As-is:
> > # time ./journalctl -b --no-pager > /dev/null
> > 
> > real    0m11.033s
> > user    0m10.084s
> > sys     0m0.943s
> > 
> > 
> > After changing journal_pid_changed() to simply return 1:
> > # time ./journalctl -b --no-pager > /dev/null
> > 
> >  real    0m9.641s
> >  user    0m9.449s
> >  sys     0m0.191s
> 
> [...]
> 
> > As this is public sd-journal API, it's somewhat set in stone.
> 
> So it's arguable whether making an API work in _more_ situations than it
> previously did is a "breaking" change.
> 
> I've tried to go through the history for the various *_pid_changed()
> functions in the APIs systemd presents, and I'm struggling to find a good
> justification for them. It seems like it was originally added for sd-bus in:
> 
>   https://github.com/systemd/systemd/commit/d5a2b9a6f455468a0f29483303657ab4fd7013d8
> 
> And then other APIs copied it to be consistent with sd-bus:
> 
>   https://github.com/systemd/systemd/commit/a65f06bb27688a6738f2f94b7f055f4c66768d63
>   https://github.com/systemd/systemd/commit/eaa3cbef3b8c214cd5c2d75b04e70ad477187e17
>   https://github.com/systemd/systemd/commit/adf412b9ec7292e0c83aaf9ab93e08c2c8bd524a
> 
> Unfortunately none of these commits describe what will go wrong if one of
> these APIs is used across fork. Does anybody know what specifically is the
> problem being addressed here? Can we detect this problem in some
> other way?

This all stems from my experiences with PulseAudio back in the day:
People do not grok the effect of fork(): it only duplicates the
invoking thread, not any other threads of the process, moreover all
data structures are copied as they are, and that's a time bomb really:
consider one of our context objects is being used by one thread at the
moment another thread invokes fork(): the thread using the object is
busy making changes to the object, rearranging some datastructure (for
example, rehashing a hash table, because it hit its fill limit) and
suchlike. Now the fork() happens while it is doing that: the data
structure will be copied in its half-written, half-updated status quo,
and in the child process there's no thread that could finish what has
been started, and there's neither a way to rollback the changes that
are in progress.

In order to protect us against the time bomb we hence have these
checks that error our early and explicitly on all our calls, with
clean error codes as soon as a fork() is detected but people try to
reuse our objects.

Now, you could argue, that on purely single-threaded processes all
this is too restrictive, but it's really not like that anymore, as many
libraries (including our own, for example in sd-resolve) fork off
background threads, and invoke our code in them. Never forget that our
libraries such as sd-bus and so on or invoked by NSS, which is very
frequently invoked in threads (because NSS is synchronous, which is
very much incompatible with most graphical spps).

So yeah, we never know in which context our stuff is called, and hence
I think it's safe to refuse in a friendly way early on rather than
just wait for the time bomb to explode when our code touches those
objects again.

Lennart

-- 
Lennart Poettering, Red Hat


More information about the systemd-devel mailing list