[systemd-devel] Significant performance loss caused by commit a65f06b: journal: return -ECHILD after a fork

Zbigniew Jędrzejewski-Szmek zbyszek at in.waw.pl
Tue Jul 11 12:37:06 UTC 2017


On Tue, Jul 11, 2017 at 09:59:45PM +1000, Michael Chapman wrote:
> On Tue, 11 Jul 2017, Lennart Poettering wrote:
> >On Tue, 11.07.17 12:55, Uoti Urpala (uoti.urpala at pp1.inet.fi) wrote:
> >
> >>On Tue, 2017-07-11 at 09:35 +0200, Lennart Poettering wrote:
> >>>Normally it's dead cheap to check that, it's just reading and
> >>>comparing one memory location. It's a pity that this isn't the case
> >>>currently on Debian, but as it appears this is an oversight on their
> >>>side, and I am sure it will be eventually fixed there, if it hasn't
> >>>already.
> >>
> >>Are you sure about those "Debian only" and "will be 'fixed'" parts? The
> >>Debian patch seems to be a cherry pick from upstream glibc. Is there
> >>evidence of some error that would cause effects only visible on Debian
> >>and nowhere else? And/or has the change been reverted or behavior
> >>otherwise modified upstream to limit the range of relevant versions?
> >
> >See the links Vito provided:
> >
> >https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=857909
> >https://sourceware.org/git/gitweb.cgi?p=glibc.git;a=commit;h=0cb313f7cb0e418b3d56f3a2ac69790522ab825d
> >
> >i.e. Debian undid the PID caching to fix some issue that has been fixed
> >properly now, and hence the PID caching should be turned on again.
> >
> >On Fedora at least getpid() is not visible in strace, and is fully
> >cached, as it should be.
> 
> I just tested this on F25 and F26 beta, and it's certainly visible
> for me on both of them:
> 
>   # cat /etc/system-release
>   Fedora release 25 (Twenty Five)
>   # rpm -q glibc
>   glibc-2.24-9.fc25.x86_64
>   # strace -c journalctl --since -1hour 2>&1 >/dev/null | head -10
>   % time     seconds  usecs/call     calls    errors syscall
>   ------ ----------- ----------- --------- --------- ----------------
>    93.93    0.167020           2     83761           getpid
>     3.93    0.006983           2      3025           write
>     0.54    0.000953          10        97           mmap
>     0.39    0.000696          13        52         8 open
>     0.31    0.000558          14        40           munmap
>     0.19    0.000332           8        42           mprotect
>     0.15    0.000264           6        45           fstat
>     0.14    0.000246           6        44           close
> 
>   # cat /etc/system-release
>   Fedora release 26 (Twenty Six)
>   # rpm -q glibc
>   glibc-2.25-7.fc26.x86_64
>   # strace -c journalctl --since -1hour 2>&1 >/dev/null | head -10
>   % time     seconds  usecs/call     calls    errors syscall
>   ------ ----------- ----------- --------- --------- ----------------
>    62.84    0.007874           4      2063           getpid
>     7.96    0.000998          12        86           mmap
>     7.85    0.000983          20        48         8 open
>     3.85    0.000483           9        54           mprotect
>     2.80    0.000351           5        71           write
>     2.75    0.000345          11        32           read
>     2.69    0.000337           8        41           fstat
>     2.66    0.000333           8        40           close
> 
> The second machine had just been started, which is why the numbers
> are a lot lower. Nevertheless, getpid is still taking by far the
> most time in syscalls.
> 
> Both of these are on Fedora's testing branch, but I don't think
> Fedora's regular branch has a significantly different version of
> glibc.

Yep, I can confirm that: many many getpid() syscalls on F26 and rawhide,
and none on F24.
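
For anyone following along, here is a minimal sketch of the kind of check
the commit in the subject line implies: remember the PID when a handle is
created, compare it against getpid() on every later call, and return
-ECHILD on a mismatch. The struct and function names (handle, handle_init,
handle_check, fork_and_check) are made up for illustration and are not
systemd's actual code. The point is only that the check calls getpid() each
time, so its cost depends on whether libc answers from a cache or makes a
real syscall.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical handle; NOT systemd's real journal structure. */
struct handle {
    pid_t original_pid; /* PID of the process that created the handle */
};

/* Record the creating process's PID once, at creation time. */
static void handle_init(struct handle *h) {
    h->original_pid = getpid();
}

/* Returns 0 in the creating process and -ECHILD after a fork.
 * Every call invokes getpid(), which is where the cost comes from
 * when libc does not cache the PID. */
static int handle_check(const struct handle *h) {
    return h->original_pid == getpid() ? 0 : -ECHILD;
}

/* Forks, runs the check in the child, and reports the result to the
 * parent. Returns 1 if the child saw -ECHILD, 0 if it did not, and a
 * negative errno value if fork() or waitpid() failed. */
static int fork_and_check(const struct handle *h) {
    pid_t child = fork();
    if (child < 0)
        return -errno;

    if (child == 0)
        _exit(handle_check(h) == -ECHILD ? 0 : 1);

    int status;
    if (waitpid(child, &status, 0) < 0)
        return -errno;

    return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

In the creating process handle_check() returns 0. In a forked child the
stored PID no longer matches, so the same call returns -ECHILD. Running a
loop of handle_check() calls under strace -c should show whether each one
turns into a getpid syscall on a given glibc.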

Zbyszek

