[systemd-devel] Significant performance loss caused by commit a65f06b: journal: return -ECHILD after a fork

Michael Chapman mike at very.puzzling.org
Tue Jul 11 11:59:45 UTC 2017


On Tue, 11 Jul 2017, Lennart Poettering wrote:
> On Tue, 11.07.17 12:55, Uoti Urpala (uoti.urpala at pp1.inet.fi) wrote:
>
>> On Tue, 2017-07-11 at 09:35 +0200, Lennart Poettering wrote:
>>> Normally it's dead cheap to check that, it's just reading and
>>> comparing one memory location. It's a pity that this isn't the case
>>> currently on Debian, but as it appears this is an oversight on their
>>> side, and I am sure it will be eventually fixed there, if it hasn't
>>> already.
>>
>> Are you sure about those "Debian only" and "will be 'fixed'" parts? The
>> Debian patch seems to be a cherry pick from upstream glibc. Is there
>> evidence of some error that would cause effects only visible on Debian
>> and nowhere else? And/or has the change been reverted or behavior
>> otherwise modified upstream to limit the range of relevant versions?
>
> See the links Vito provided:
>
> https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=857909
> https://sourceware.org/git/gitweb.cgi?p=glibc.git;a=commit;h=0cb313f7cb0e418b3d56f3a2ac69790522ab825d
>
> i.e. Debian undid the PID caching to fix some issue that has been fixed
> properly now, and hence the PID caching should be turned on again.
>
> On Fedora at least getpid() is not visible in strace, and is fully
> cached, as it should be.

I just tested this on F25 and F26 beta, and it's certainly visible for me 
on both of them:

   # cat /etc/system-release
   Fedora release 25 (Twenty Five)
   # rpm -q glibc
   glibc-2.24-9.fc25.x86_64
   # strace -c journalctl --since -1hour 2>&1 >/dev/null | head -10
   % time     seconds  usecs/call     calls    errors syscall
   ------ ----------- ----------- --------- --------- ----------------
    93.93    0.167020           2     83761           getpid
     3.93    0.006983           2      3025           write
     0.54    0.000953          10        97           mmap
     0.39    0.000696          13        52         8 open
     0.31    0.000558          14        40           munmap
     0.19    0.000332           8        42           mprotect
     0.15    0.000264           6        45           fstat
     0.14    0.000246           6        44           close

   # cat /etc/system-release
   Fedora release 26 (Twenty Six)
   # rpm -q glibc
   glibc-2.25-7.fc26.x86_64
   # strace -c journalctl --since -1hour 2>&1 >/dev/null | head -10
   % time     seconds  usecs/call     calls    errors syscall
   ------ ----------- ----------- --------- --------- ----------------
    62.84    0.007874           4      2063           getpid
     7.96    0.000998          12        86           mmap
     7.85    0.000983          20        48         8 open
     3.85    0.000483           9        54           mprotect
     2.80    0.000351           5        71           write
     2.75    0.000345          11        32           read
     2.69    0.000337           8        41           fstat
     2.66    0.000333           8        40           close

The second machine had just been started, which is why the numbers are a
lot lower. Nevertheless, getpid still accounts for by far the largest
share of the time spent in syscalls.

Both of these are on Fedora's testing branch, but I don't think Fedora's 
regular branch has a significantly different version of glibc.
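
For readers following along, the kind of check under discussion can be
sketched roughly as below. This is only an illustration of the general
"remember the PID at open time, compare it on every call" pattern, written
in Python for brevity; it is not systemd's code. The class name
JournalHandle and the method check_not_forked are hypothetical. The point
is that every API call performs one getpid(), so the cost depends entirely
on whether libc answers it from a cached value or issues a real syscall.

```python
import errno
import os


class JournalHandle:
    """Hypothetical stand-in for a journal handle; not systemd's real type."""

    def __init__(self):
        # Remember which process created the handle.
        self.original_pid = os.getpid()

    def check_not_forked(self):
        # One getpid() per API call: cheap if libc caches the PID,
        # a real syscall each time if it does not.
        if os.getpid() != self.original_pid:
            return -errno.ECHILD
        return 0


if __name__ == "__main__":
    handle = JournalHandle()
    print("parent:", handle.check_not_forked())

    pid = os.fork()
    if pid == 0:
        # Child: the handle was created by the parent, so the check fails.
        os._exit(0 if handle.check_not_forked() == -errno.ECHILD else 1)

    _, status = os.waitpid(pid, 0)
    print("child saw -ECHILD:", os.WEXITSTATUS(status) == 0)
```

Running a loop of such calls under strace -c is one way to see whether
getpid shows up as a syscall on a given glibc build.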

