[systemd-devel] Significant performance loss caused by commit a65f06b: journal: return -ECHILD after a fork

Lennart Poettering lennart at poettering.net
Tue Jul 11 14:07:11 UTC 2017


On Tue, 11.07.17 21:59, Michael Chapman (mike at very.puzzling.org) wrote:

> On Tue, 11 Jul 2017, Lennart Poettering wrote:
> > On Tue, 11.07.17 12:55, Uoti Urpala (uoti.urpala at pp1.inet.fi) wrote:
> > 
> > > On Tue, 2017-07-11 at 09:35 +0200, Lennart Poettering wrote:
> > > > Normally it's dead cheap to check that, it's just reading and
> > > > comparing one memory location. It's a pitty that this isn't the case
> > > > currently on Debian, but as it appears this is an oversight on their
> > > > side, and I am sure it will be eventually fixed there, if it hasn't
> > > > already.
> > > 
> > > Are you sure about those "Debian only" and "will be 'fixed'" parts? The
> > > Debian patch seems to be a cherry pick from upstream glibc. Is there
> > > evidence of some error that would cause effects only visible on Debian
> > > and nowhere else? And/or has the change been reverted or behavior
> > > otherwise modified upstream to limit the range of relevant versions?
> > 
> > See the links Vito provided:
> > 
> > https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=857909
> > https://sourceware.org/git/gitweb.cgi?p=glibc.git;a=commit;h=0cb313f7cb0e418b3d56f3a2ac69790522ab825d
> > 
> > i.e. Debian undid the PID caching to fix some issue that has been fix
> > properly now, and hence the PID caching should be turned on again.
> > 
> > On Fedora at least getpid() is not visible in strace, and is fully
> > cached, as it should be.
> 
> I just tested this on F25 and F26 beta, and it's certainly visible for me on
> both of them:
> 
>   # cat /etc/system-release
>   Fedora release 25 (Twenty Five)
>   # rpm -q glibc
>   glibc-2.24-9.fc25.x86_64
>   # strace -c journalctl --since -1hour 2>&1 >/dev/null | head -10
>   % time     seconds  usecs/call     calls    errors syscall
>   ------ ----------- ----------- --------- --------- ----------------
>    93.93    0.167020           2     83761           getpid
>     3.93    0.006983           2      3025           write
>     0.54    0.000953          10        97           mmap
>     0.39    0.000696          13        52         8 open
>     0.31    0.000558          14        40           munmap
>     0.19    0.000332           8        42           mprotect
>     0.15    0.000264           6        45           fstat
>     0.14    0.000246           6        44           close

Hmm, so I run a slightly older glibc, as I haven#t updated my system
in a while:

$ strace -c journalctl --since -1hour 2>&1 >/dev/null | head -10
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 25.95    0.001276           7       195           mmap
 23.21    0.001141           7       164        30 open
 22.29    0.001096           9       119           munmap
  6.53    0.000321           2       134           close
  6.10    0.000300           2       135           fstat
  5.15    0.000253           5        56           mprotect
  4.88    0.000240           2       102           fstatfs
  2.30    0.000113           4        32           read

getpid() is nowhere to be seen in this... Seems Fedora regressed on
this too recently. Meh.

Somebody should probably file a bug about this regression and get
clarification if this is going to remain slow, or if they are going to
fix that again... If this is going to remain slow, we shoud probably
find a different way to detect forks... we could install a
pthread_atfork() handler and set some flag...

Lennart

-- 
Lennart Poettering, Red Hat


More information about the systemd-devel mailing list