[systemd-devel] systemd's connections to /run/systemd/private ?

Thu Aug 15 10:36:07 UTC 2019

On Thu, 15 Aug 2019, Lennart Poettering wrote:
> On Mi, 14.08.19 22:36, Michael Chapman (mike at very.puzzling.org) wrote:
> 
> > On Wed, 14 Aug 2019, Lennart Poettering wrote:
> > > Well, a D-Bus connection can remain open indefinitely, and may even
> > > have incomplete "half" messages queued in them as long as the client
> > > desires. After the initial authentication is done, clients may thus
> > > take up resources as long as they want, this is by design of dbus
> > > really, and is different from HTTP for example, where connections
> > > usually have a time-out applied. dbus doesn't know timeouts for
> > > established connections. It knows them for the authentication phase,
> > > and it knows them for method calls that are in flight, but it does not
> > > know them for the mere existence of an established connection.
> >
> > I'm sure it's not in the design of DBus that clients can continue to
> > consume those resources after they've disconnected.
> >
> > > PID 1 authenticates clients of the private connection simply by making
> > > the socket for it inaccessible to anyone who is not privileged. Due to
> > > that it gets away with not doing any further per-user accounting,
> > > because it knows the clients are all privileged anyway.
> > >
> > > So, yes, it would be good if we could protect us from any form of
> > > misuse, but basically, if you have a root client that misbehaves you
> > > have to accept that...
> >
> > I understand all that. Nevertheless, Brian identified a bug: after
> > receiving certain data on its private socket, the systemd process can leak
> > a file descriptor.
> 
> Can it? Did I miss something? If the client closes the client side of
> the socket, but PID 1 would keep the server side of it open anyway,
> then that would be a bug indeed. But my understanding was that the
> client side stays pinned?

I was able to reproduce the bug on CentOS 7's systemd 219. That is, the 
file descriptor in PID 1 was dropped from its epoll set without it 
reaching EOF and without it being closed. Every time I ran Brian's command 
PID 1 would leak another file descriptor.
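
A rough way to watch for this from the PID 1 side is to count the socket 
file descriptors listed under /proc/1/fd before and after each run. This is 
only a sketch, assuming a Linux /proc and root privileges; the helper name 
count_socket_fds is made up for illustration and is not part of systemd:

```shell
# Hypothetical helper: count the socket fds a process holds open.
# Defaults to PID 1; reading /proc/1/fd requires root.
count_socket_fds() {
    pid=${1:-1}
    n=0
    for fd in /proc/"$pid"/fd/*; do
        # Each entry is a symlink; sockets resolve to "socket:[inode]".
        case $(readlink "$fd" 2>/dev/null) in
            socket:*) n=$((n + 1)) ;;
        esac
    done
    echo "$n"
}
```

A count that grows by one per run and never drops again would be consistent 
with a leaked fd, though it cannot by itself distinguish a leak from a 
connection that is legitimately still open.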

I was unable to reproduce this on a later version of systemd, but that 
_could_ just be because this later version of systemd side-steps the issue 
by ensuring that systemctl doesn't use fd 1 for its socket.

I have some reason to believe the problem in PID 1 has been fixed though. 
On CentOS 7 I was able to cause it to sometimes leak an fd simply by 
sending random data to it:

  # count-sockets() { ss -x | grep /run/systemd/private | wc -l; }
  # inject-junk() { timeout 1s nc -U /run/systemd/private </dev/urandom; (( $? == 124 )) && echo Timed out; }
  # while true; do count-sockets; inject-junk; done
  0
  Ncat: Connection reset by peer.
  0
  Ncat: Connection reset by peer.
  ...
  0
  Timed out
  1
  Ncat: Connection reset by peer.
  1
  Ncat: Connection reset by peer.
  ...
  2
  Timed out
  3
  Ncat: Connection reset by peer.
  3
  Ncat: Connection reset by peer.
  ...

With systemd 239 I was unable to cause an fd leak this way.

Still, I would feel more comfortable if I could find a commit that 
definitely fixed the problem. All of these experiments are just 
circumstantial evidence.