[systemd-devel] systemd's connections to /run/systemd/private ?

Brian Reichert reichert at numachi.com
Wed Jul 10 13:51:36 UTC 2019


On Wed, Jul 10, 2019 at 07:37:19AM +0000, Zbigniew Jędrzejewski-Szmek wrote:

> It's a bug report as any other. Writing a meaningful reply takes time
> and effort. Lack of time is a much better explanation than ressentiments.

I wasn't expressing resentment; I apologize if it came off that way.

> Please always specify the systemd version in use. We're not all SLES
> users, and even if we were, I assume that there might be different
> package versions over time.

Quite reasonable:

  localhost:/var/tmp # cat /etc/os-release
  NAME="SLES"
  VERSION="12-SP3"
  VERSION_ID="12.3"
  PRETTY_NAME="SUSE Linux Enterprise Server 12 SP3"
  ID="sles"
  ANSI_COLOR="0;32"
  CPE_NAME="cpe:/o:suse:sles:12:sp3"

  localhost:/var/tmp # rpm -q systemd
  systemd-228-142.1.x86_64

> > When we first spin up a new SLES12 host with our custom services,
> > the number of connections to /run/systemd/private numbers in the
> > mere hundreds. 

> That sounds wrong already. Please figure out what those connections
> are. I'm afraid that you might have to do some debugging on your
> own, since this issue doesn't seem easily reproducible.

What tactics should I employ?  All of those file handles to
/run/systemd/private are owned by PID 1, and 'ss' implies there are
no peers.

'strace' on PID 1 shows that messages are flowing, but that doesn't
reveal the logic behind how the connections get created or culled,
nor who initiated them.
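
One possible starting point, sketched in Python and not something
taken from systemd itself: group the sockets bound to
/run/systemd/private by the "St" column of /proc/net/unix, to see how
many are listening versus connected. The function name and the column
positions are my assumptions, based on the usual layout of that file
(Num, RefCount, Protocol, Flags, Type, St, Inode, Path).

```python
from collections import defaultdict


def sockets_by_state(proc_net_unix: str, path: str = "/run/systemd/private") -> dict:
    """Group inode numbers of unix sockets bound to `path` by their St column.

    `proc_net_unix` is the full text of /proc/net/unix. Column positions
    are assumed to be: Num RefCount Protocol Flags Type St Inode Path.
    """
    by_state = defaultdict(list)
    for line in proc_net_unix.splitlines()[1:]:  # skip the header row
        # Split into at most 8 fields so a path containing spaces stays whole.
        fields = line.split(None, 7)
        # Sockets without a bound path have only 7 fields and are skipped.
        if len(fields) == 8 and fields[7].strip() == path:
            by_state[fields[5]].append(int(fields[6]))
    return dict(by_state)
```

On the affected host this would be fed the contents of /proc/net/unix;
the resulting inode numbers can then be matched against the
socket:[...] targets under /proc/1/fd.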

On a box with ~500 of these file handles, I can see that many of
them are hours or days old:

  localhost:/var/tmp # date
  Wed Jul 10 09:45:01 EDT 2019

  # new ones
  localhost:/var/tmp # lsof -nP /run/systemd/private | awk '/systemd/ {
  sub(/u/, "", $4); print $4}' | (  cd /proc/1/fd; xargs ls -t --full-time ) | head -5
  lrwx------ 1 root root 64 2019-07-10 09:45:05.211722809 -0400 561 -> socket:[1183838]
  lrwx------ 1 root root 64 2019-07-10 09:40:02.611726025 -0400 559 -> socket:[1173429]
  lrwx------ 1 root root 64 2019-07-10 09:40:02.611726025 -0400 560 -> socket:[1176265]
  lrwx------ 1 root root 64 2019-07-10 09:33:10.687730403 -0400 100 -> socket:[113992]
  lrwx------ 1 root root 64 2019-07-10 09:33:10.687730403 -0400 101 -> socket:[115163]
  xargs: ls: terminated by signal 13

  # old ones
  localhost:/var/tmp # lsof -nP /run/systemd/private | awk '/systemd/ {
  sub(/u/, "", $4); print $4}' | (  cd /proc/1/fd; xargs ls -t --full-time ) | tail -5
  lrwx------ 1 root root 64 2019-07-08 15:12:04.725350882 -0400 59 -> socket:[43097]
  lrwx------ 1 root root 64 2019-07-08 15:12:04.725350882 -0400 60 -> socket:[44029]
  lrwx------ 1 root root 64 2019-07-08 15:12:04.725350882 -0400 63 -> socket:[46234]
  lrwx------ 1 root root 64 2019-07-08 15:12:04.725350882 -0400 65 -> socket:[49252]
  lrwx------ 1 root root 64 2019-07-08 15:12:04.725350882 -0400 71 -> socket:[54064]
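
The same age listing can be sketched without the lsof/awk/xargs
pipeline (and without the SIGPIPE noise from 'head'). This is only an
illustration; the function name is mine. It reads the symlink
timestamps under /proc/1/fd, which, as I understand it, are assigned by
the kernel when the /proc entry is first instantiated, so they are at
best a lower bound on each descriptor's age.

```python
import os
import re

# Targets of socket descriptors in /proc/<pid>/fd look like "socket:[43097]".
SOCKET_LINK = re.compile(r"socket:\[(\d+)\]")


def socket_fds_by_age(fd_dir: str = "/proc/1/fd") -> list:
    """Return (mtime, fd, inode) for each socket fd in fd_dir, oldest first."""
    rows = []
    for name in os.listdir(fd_dir):
        if not name.isdigit():
            continue
        link = os.path.join(fd_dir, name)
        try:
            match = SOCKET_LINK.fullmatch(os.readlink(link))
            if match:
                # lstat reads the symlink's own timestamp, not the target's.
                mtime = os.lstat(link).st_mtime
                rows.append((mtime, int(name), int(match.group(1))))
        except OSError:
            continue  # the fd was closed while we were scanning
    return sorted(rows)
```

Run as root, the first and last few rows correspond to the "old ones"
and "new ones" above; this lists every socket held by PID 1, so the
rows still need filtering down to the /run/systemd/private inodes.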
  
> > Is my guess about CONNECTIONS_MAX's relationship to /run/systemd/private
> > correct?
> 
> Yes. The number is hardcoded because it's expected to be "large
> enough". The connection count shouldn't be more than "a few" or maybe
> a dozen at any time.

Thanks for confirming that.

> Zbyszek

-- 
Brian Reichert				<reichert at numachi.com>
BSD admin/developer at large	

