[systemd-devel] Query on sshd.socket sshd.service approaches

Wed Mar 6 15:34:27 UTC 2024

On Mi, 06.03.24 13:06, Arseny Maslennikov (ar at cs.msu.ru) wrote:

> > The question of course is how many SSH instances you serve every
> > minute. My educated guess is that most SSH installations have a use
> > pattern that's more on the "sporadic use" side of things. There are
> > certainly heavy use scenarios though (e.g. let's say you are github
> > and serve git via sshd).
>
> A more relevant source of problems here IMO is not the "fair use"
> pattern, but the misuse pattern.
>
> The per-connection template unit mode, unfortunately, is really unfit
> for any machine with ssh daemons exposed to the IPv4 internet: within
> several months of operation such a machine starts getting at least 3-5
> unauthed connections a second from hierarchically and geographically
> distributed sources. Those clients are probing for vulnerabilities and
> dictionary passwords, they are doomed to never be authenticated on a
> reasonable system, so this is junk traffic at the end of the day.
>
> If sshd is deployed the classic way (№1 or №3), each junk connection is
> accepted and possibly rate-limited by the sshd program itself, and the
> pid1-manager's state is unaffected. Units are only created for
> authorized connections via PAM hooks in the "session stack";
> same goes for other accounting entities and resources.
> If sshd is deployed the per-connection unit way (№2), each junk connection will
> fiddle with system manager state, IOW make the machine create and
> immediately destroy a unit: fork-exec, accounting and sandboxing setup
> costs, etc. If the instance units for junk connections are not
> automatically collected (e. g. via `CollectMode=inactive-or-failed`
> property), this leads to unlimited memory use for pid1 on an unattended
> machine (really bad), powered by external actors.

Well, whatever sshd does for rate limiting, systemd can do too,
afaics. I.e. the sshd@.service definitions that we suggest and that the
big distros use all get the ExecStart=- thing right, so that an
unclean exit of sshd does not result in a pinned unit. Moreover, there's
PollLimitIntervalSec=/PollLimitBurst=, MaxConnectionsPerSource=, and
MaxConnections=, which ensure that any attempt to flood the socket
is reasonably contained, and that the system recovers from it.

Current versions of systemd enable these settings by default, hence I
think we actually should be fine by default, even if you do not tune
these .socket parameters.
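
For illustration, a per-connection setup with these limits spelled out
explicitly might look roughly like the sketch below. The numeric values
are placeholders chosen for the example, not systemd's defaults and not
a recommendation, and the sshd binary path is an assumption; check
systemd.socket(5) for what your version supports.

```ini
# sshd.socket (sketch; all limit values are illustrative placeholders)
[Socket]
ListenStream=22
# Spawn one sshd@.service instance per accepted connection
Accept=yes
# Cap simultaneous connections overall and per source address
MaxConnections=64
MaxConnectionsPerSource=8
# Temporarily stop polling the socket if events exceed this rate
PollLimitIntervalSec=2s
PollLimitBurst=150

[Install]
WantedBy=sockets.target
```

```ini
# sshd@.service (sketch; the binary path is an assumption)
[Unit]
Description=OpenSSH per-connection server

[Service]
# The leading "-" makes a failing exit code of sshd count as success,
# so a junk connection does not leave a failed instance behind
ExecStart=-/usr/sbin/sshd -i
StandardInput=socket
```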

> > I'd suggest to distros to default to mode
> > 2, and alternatively support mode 3 if possible (and mode 1 if they
> > don't want to patch the support for mode 3 in)
>
> So mode 2 only really makes sense for deployments which are only ever
> accessible from intranets with little junk traffic.

What precisely do you think is missing in systemd that
PollLimitIntervalSec=/PollLimitBurst=, MaxConnectionsPerSource=,
MaxConnections= can't cover?

Lennart

--
Lennart Poettering, Berlin