[systemd-devel] Significant performance loss caused by commit a65f06b: journal: return -ECHILD after a fork

Florian Weimer fw at deneb.enyo.de
Wed Jul 12 09:47:38 UTC 2017


* Lennart Poettering:

> On Wed, 12.07.17 09:51, Florian Weimer (fw at deneb.enyo.de) wrote:
>
>> * Lennart Poettering:
>> 
>> > On Tue, 11.07.17 21:26, Florian Weimer (fw at deneb.enyo.de) wrote:
>> >
>> >> * Lennart Poettering:
>> >> 
>> >> > Apparently, this regressed between this version and
>> >> > glibc-2.24-9.fc25.x86_64 hence.
>> >> 
>> >> Yes, I backported the fork cache removal to Fedora 25.  There is no
>> >> longer a good way to maintain such a cache in userspace because glibc
>> >> can no longer intercept all the ways that can change the PID of the
>> >> current process, since the kernel interfaces for process management
>> >> are incredibly rich these days.
>> >
>> > Please be more specific here. What is this all about?
>> 
>> We got many bug reports over the years about sandboxes and other heavy
>> users of namespaces and clone that the glibc PID cache got out of
>> sync, both in child and parent (!) processes.
>
> have any links?

<https://bugs.chromium.org/p/chromium/issues/detail?id=484870>

You guys ran into this as well and wrote a raw_getpid function which
calls the system call.  (You should have reported the bug instead.)
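
For illustration, a wrapper of that kind could look roughly like the
sketch below.  This is not the Chromium code; the name raw_getpid is
taken from the description above, and the body is only an assumption
about the general approach: invoke the getpid system call through
syscall() so that no value cached in userspace is involved.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Illustrative sketch: ask the kernel for the PID directly instead of
 * relying on whatever the C library may have cached in userspace. */
static pid_t raw_getpid(void)
{
    return (pid_t) syscall(SYS_getpid);
}
```

With the cache removed from glibc, getpid() and a wrapper like this
should return the same value in an ordinary process.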

>> > What triggered this specifically? is this about docker? docker is
>> > written in golang anyway, iirc, which doesn't bother with linking to
>> > libc anyway?
>> 
>> It needs glibc for access to the host and user databases.
>
> can you elaborate? I fail to see any relationship between
> unshare()/fork()/getpid() and NSS?

You asked why docker links against glibc.

>> > Is this a glibc upstream choice primarily? Were the regressions this
>> > causes considered?
>> 
>> I raised the problem of applications calling getpid frequently and
>> named OpenSSL as an example.
>
> Link?

See the collection of links in the other message.

> And I am pretty sure the usecase is very valid... And yes,
> even if checking getpid() misses some theoretical corner cases,
> pthread_atfork() or whatever else you propose will miss others too,
> and is much uglier codewise, introduces deps, yadda yadda...

We actually increased the accuracy of your fork detection logic (even
though it's still broken), so I'm puzzled why you keep calling this a
regression.
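
For readers unfamiliar with the pattern under discussion, a
getpid()-based fork check looks roughly like the sketch below.  It is
not the systemd code; struct handle, handle_init and handle_check are
hypothetical names, and the only detail taken from this thread is the
convention of returning -ECHILD once the PID no longer matches.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical handle that remembers which process created it. */
struct handle {
    pid_t owner_pid;
};

/* Record the PID of the creating process. */
static void handle_init(struct handle *h)
{
    h->owner_pid = getpid();
}

/* Return 0 if still in the creating process, -ECHILD if the PID has
 * changed (for example, after a fork).  As noted in the thread, a
 * check like this does not catch every case. */
static int handle_check(const struct handle *h)
{
    if (h->owner_pid != getpid())
        return -ECHILD;
    return 0;
}
```

Every entry point that calls handle_check() pays for one getpid()
call, which is why the cost of getpid() matters for this pattern.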

