[systemd-devel] Significant performance loss caused by commit a65f06b: journal: return -ECHILD after a fork

Michael Chapman mike at very.puzzling.org
Mon Jul 10 12:27:33 UTC 2017


On Mon, 10 Jul 2017, Lennart Poettering wrote:
> On Mon, 10.07.17 21:51, Michael Chapman (mike at very.puzzling.org) wrote:
>
>>> This all stems from my experiences with PulseAudio back in the day:
>>> People do not grok the effect of fork(): it only duplicates the
>>> invoking thread, not any other threads of the process, moreover all
>>> data structures are copied as they are, and that's a time bomb really:
>>> consider one of our context objects is being used by one thread at the
>>> moment another thread invokes fork(): the thread using the object is
>>> busy making changes to the object, rearranging some data structure (for
>>> example, rehashing a hash table, because it hit its fill limit) and
>>> suchlike. Now the fork() happens while it is doing that: the data
>>> structure will be copied in its half-written, half-updated status quo,
>>> and in the child process there's no thread that could finish what has
>>> been started, nor is there a way to roll back the changes that
>>> are in progress.
>> [...]
>>
>> Thanks, that really does clear things up.
>>
>> It's a pity glibc doesn't provide an equivalent for pthread_atfork() outside
>> of the pthread library. Having a notification that a fork has just occurred
>> would allow us to do the PID caching ourselves.
>
> Well, pthread_atfork() is probably more a source of problems than a solution
> for them.
>
> Mutexes and fork() do not mix well: if you have a thread that acquired
> a mutex right before a fork() then it will cease to exist but the
> mutex remains locked. Now, you could use pthread_atfork() to unlock
> it, but that really works only in trivial cases, with trivial data
> structures, and otherwise creates ABBA problems and similar. I mean,
> mutexes are supposed to make pieces of code atomic from the outside
> view: but if you duplicate a process without the thread it will appear
> aborted to the outside, and that's quite far from "atomic"...

I understand that... which is why I was only talking about PID caching. 
That is, it could be used to avoid the getpid() calls.

Anyway, it's all moot as I don't think we'd want to use pthread_atfork() in
any systemd APIs -- I'm not sure they all link to libpthread yet.

>> Of course, there's still a problem with people calling the clone syscall
>> directly... but I think once people start doing that we have to trust them
>> to know what they're doing.
>
> Yes: if you invoke clone() directly, you should really invoke execve()
> soon afterwards too, and in the time between these two syscalls you
> should not invoke getpid() and should limit yourself to known safe calls.
>
> Lennart


