[systemd-devel] Significant performance loss caused by commit a65f06b: journal: return -ECHILD after a fork

vcaputo at pengaru.com vcaputo at pengaru.com
Mon Jul 10 21:04:44 UTC 2017


On Sat, Jul 08, 2017 at 03:49:11AM +0000, Zbigniew Jędrzejewski-Szmek wrote:
> On Fri, Jul 07, 2017 at 03:54:09PM -0700, vcaputo at pengaru.com wrote:
> > On Fri, Jul 07, 2017 at 10:34:22PM +0000, Zbigniew Jędrzejewski-Szmek wrote:
> > > On Fri, Jul 07, 2017 at 02:35:16PM -0700, vcaputo at pengaru.com wrote:
> > > > On Fri, Jul 07, 2017 at 01:49:54PM -0700, vcaputo at pengaru.com wrote:
> > > > > On Fri, Jul 07, 2017 at 08:37:08PM +0000, Mantas Mikulėnas wrote:
> > > > > > Back when that commit was made, didn't glibc cache the getpid() result in
> > > > > > userspace? That would explain why it was not noticed.
> > > > >
> > > > > Hmm, this crossed my mind, and come to think of it I did a dist-upgrade
> > > > > from Debian jessie to stretch on this machine overnight and haven't rebooted.
> > > > > 
> > > > > Perhaps the vdso isn't working and the costly getpid() is a red herring, will
> > > > > reboot and retest to confirm.
> > > > > 
> > > > 
> > > > It appears Debian has a glibc patch to disable the caching (I was unaware
> > > > such an elaborate dance was being performed to cache this!)
> > > > 
> > > > https://anonscm.debian.org/cgit/pkg-glibc/glibc.git/commit/debian/patches/any?id=5850253f509604dd46a6131acc057ea26e1588ba
> > > 
> > > Do we know the justification for this patch?
> > > 
> > 
> > https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=857909
> > 
> > Which references this upstream glibc bug:
> > 
> > https://sourceware.org/bugzilla/show_bug.cgi?id=19957
> > https://sourceware.org/git/gitweb.cgi?p=glibc.git;a=commit;h=0cb313f7cb0e418b3d56f3a2ac69790522ab825d
> > 
> > 
> > > > Unsure where I stand on core system software assuming certain syscalls are
> > > > always going to be exceptionally cheap though...
> > > 
> > > Optimization is never in a vacuum. If glibc does something cheaply, it
> > > seems reasonable to take advantage of it.
> > > 
> > 
> > Except there's always a risk of these things regressing to normal syscalls,
> > and one has to weigh the utility against that.  It's unclear to me what
> > significant utility having the sd-journal API police changing pids by
> > calling getpid() at every public entrypoint is bringing to the table.
> 
> So it seems the issue has been fixed in glibc upstream more than a year
> ago, and it doesn't seem to make sense to optimize current systemd git for
> that.
> 

Can you provide a commit id?  I took a glance at sourceware.org/git/gitweb.cgi
for getpid commits and didn't see anything relevant since the removal[1].


> I see the argument that the getpid() checks are a bit excessive. Is their
> overhead actually noticeable with current glibc?
> 

On my spare Arch system, which is on glibc 2.5-2, I still see gratuitous
getpid() calls from journalctl.
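
For anyone who wants a rough number for their own machine, here is a minimal
sketch that times the uncached syscall.  It assumes Linux and calls
syscall(SYS_getpid) directly so that any libc-side caching is bypassed; the
function name raw_getpid_ns_per_call is made up for illustration and is not
part of systemd or glibc.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: average wall-clock cost, in nanoseconds, of one
 * getpid() issued as a real syscall (no libc caching involved). */
static double raw_getpid_ns_per_call(long iterations) {
    struct timespec start, end;
    volatile long sink = 0;   /* keeps the loop from being optimized away */

    assert(iterations > 0);

    clock_gettime(CLOCK_MONOTONIC, &start);
    for (long i = 0; i < iterations; i++)
        sink += syscall(SYS_getpid);
    clock_gettime(CLOCK_MONOTONIC, &end);
    (void) sink;

    double elapsed_ns = (double) (end.tv_sec - start.tv_sec) * 1e9
                      + (double) (end.tv_nsec - start.tv_nsec);
    return elapsed_ns / (double) iterations;
}
```

Calling raw_getpid_ns_per_call(1000000) from a small main() and printing the
result gives a per-call figure; what matters for this thread is that figure
multiplied by the number of sd-journal entrypoints a program hits.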

The pollution of strace output alone due to these checks is nuisance enough
for me to want the checks removed, considering their only value is to catch
programmer errors.  There's an abundance of potential programmer errors
we're not making any effort to prevent, why is this one so privileged that
it warrants policing?

I appreciate Lennart's point about the hazards of forking from threaded
programs.  It just doesn't seem like a valid rationalization for sprinkling
a system library's entrypoints with getpid() calls to catch this in
production.
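
To make the pattern under discussion concrete, here is a simplified sketch of
it.  This is not the actual sd-journal code: struct handle, handle_init(),
handle_pid_changed() and handle_next() are hypothetical stand-ins.  The idea
is only that the creating pid is recorded once and then compared against
getpid() on every public call, returning -ECHILD on mismatch as the commit
subject describes.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-in for a library handle, not the real sd_journal. */
struct handle {
    pid_t original_pid;   /* pid of the process that created the handle */
};

static void handle_init(struct handle *h) {
    assert(h != NULL);
    h->original_pid = getpid();
}

/* Nonzero if the caller is not the process that created the handle. */
static int handle_pid_changed(const struct handle *h) {
    return h->original_pid != getpid();
}

/* Hypothetical public entrypoint: pays for one getpid() on every call. */
static int handle_next(struct handle *h) {
    if (handle_pid_changed(h))
        return -ECHILD;
    /* ... the real work of the entrypoint would go here ... */
    return 0;
}
```

With a libc that caches the pid, the check is a memory compare; without the
cache, every such entrypoint costs a syscall and shows up in strace.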

Considering the associated potential costs, and the historic controversy
surrounding the caching of this particular syscall[2], I'm a bit confused by
the status quo.

Cheers,
Vito Caputo

1: https://sourceware.org/git/gitweb.cgi?p=glibc.git&a=search&h=HEAD&st=commit&s=getpid
2: http://yarchive.net/comp/linux/getpid_caching.html


