[systemd-devel] Requested transaction contradicts existing jobs: start is destructive

Wed May 6 17:49:02 UTC 2020

On Wed, 6 May 2020 at 15:58, Mark Bannister <mbannister at janestreet.com> wrote:
>
> I repeated the exact same test with a different user account, one
> set-up specifically for this test and nothing else, so there was only
> one SSH session.  But 'systemctl list-jobs' always reported 'No jobs
> running' even after the SSH session was terminated.  My script kept
> running with the parent PID of 1.  Thanks for your help so far, but I
> have still been unable to reproduce the problem.  What am I doing
> wrong?
>

Alright, so I dug into this, and managed to reproduce it at last.

This issue is NOT reproducible with the version you reported
originally. The stock CentOS image on my cloud provider comes with the
same version (219-67.el7_7.4) that you said you see this issue with,
but that can't be the version you are running, as it doesn't include
this change:
https://github.com/systemd-rhel/rhel-7/commit/0600681f04e3818282a2d518ec3e6afee85f7978
(unless I'm missing something). It took me a while to figure this out...

That change was added in 219-69 (see the tag), so I guess you built
your copy with some custom patches that pulled it in. It is essentially
the reason the transaction becomes destructive. Before the change, the
session scope had a Wants= dependency on user-UID.slice rather than
Requires=, so on a login attempt the conflicting start job could simply
be *dropped* instead of failing the whole transaction. The change
upgrades Slice= dependencies to Requires=, so a newly generated
transaction that includes the start job now fails while the stop job is
still waiting. You can verify this with systemctl show session-N.scope:
219-67 says Wants=user-UID.slice, while 219-69 and later say
Requires=user-UID.slice.
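
That check can also be scripted. What follows is only a minimal sketch,
assuming the Key=value lines that systemctl show prints for the
Requires and Wants properties; the helper name slice_dep_kind is made
up for illustration, and N and UID are placeholders as above.

```shell
# Hypothetical helper: reads `systemctl show`-style Key=value lines on
# stdin and prints which property lists user-<UID>.slice:
# "Requires", "Wants", or "none".
slice_dep_kind() {
    local slice="user-$1.slice" line key
    local found="none"
    while IFS= read -r line; do
        key=${line%%=*}
        # Pad the value with spaces so only whole unit names match.
        case " ${line#*=} " in
            *" $slice "*)
                case "$key" in
                    Requires) found="Requires"; break ;;
                    Wants) found="Wants" ;;
                esac
                ;;
        esac
    done
    echo "$found"
}

# Usage on a live system (N and UID are placeholders):
#   systemctl show -p Requires -p Wants session-N.scope | slice_dep_kind UID
```

If this prints "Requires", the build should include the change linked
above; "Wants" would match the stock 219-67 behaviour.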

One key thing I missed was that systemd on CentOS doesn't clean up
the user's session, since KillUserProcesses=no by default, and that the
user slice's stop job is not queued until all session scopes have
finished cleaning up. So the scope is simply abandoned, and it keeps
existing until all of its processes exit on their own.
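
To see which setting a machine is actually running with, something like
the following can help. It is a rough sketch only: it reads just the
main config file (drop-in directories are ignored), it falls back to
"no" because that is the CentOS default mentioned above, and the
function name kill_user_processes is hypothetical.

```shell
# Hypothetical helper: prints the KillUserProcesses= value found in a
# logind.conf-style file, or "no" if the option is unset or commented out.
# Assumption: only the given file is consulted, not any drop-ins.
kill_user_processes() {
    local conf="${1:-/etc/systemd/logind.conf}" val
    # Match uncommented assignments only; the last one wins.
    val=$(sed -n 's/^[[:space:]]*KillUserProcesses[[:space:]]*=[[:space:]]*//p' \
        "$conf" 2>/dev/null | tail -n 1)
    echo "${val:-no}"
}

# Usage:
#   kill_user_processes                      # reads /etc/systemd/logind.conf
#   kill_user_processes /path/to/other.conf  # or any other file
```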

So observing this issue is not easy: it only triggers if you try to log
in right when systemd is stopping the user slice and the stop job is
still in the queue, which is a very short window. It took a lot of
tries before I could reproduce it. (To start with, I manually queued a
stop job for the slice with a hung session scope and tried logging in,
to verify the transaction problem.)

So this is racy in nature. The sequence of events can be that your
processes die, the session scopes go away, and the manager then tries
to stop the user slice. If you happen to be in the middle of a login at
that moment, pam_systemd's CreateSession call conflicts with the
already waiting stop job. It doesn't matter whether KillUserProcesses=
is set; that only decides whether the scopes are cleaned up on logout
(they become inactive once all session processes are gone anyway). It
does, however, affect when user-UID.slice is stopped.

The reason you didn't see any jobs is that the scope was abandoned and
your long-lived process pinned it. So the session scope never stops,
and as a consequence neither does the slice.

Here is a series of steps to widen the race (by queuing a stop job
for the user slice) and reproduce it:
1. Create a script badproc.sh; chmod +x badproc.sh
#!/bin/bash
trap "" SIGHUP
trap "" SIGINT
trap "" SIGTERM
exec "$@"

2. From the user's shell, do: ./badproc.sh sleep infinity & disown; ^D (logout)
The scope is now abandoned, or being stopped (if KillUserProcesses=yes)
3. From root's shell, do:
systemctl --no-block stop session-N.scope; systemctl --no-block stop user-UID.slice
systemctl list-jobs should list both jobs, the scope job as running,
and the slice job as waiting.

The sequence that actually happens is, in order: processes dying, the
(last) session scope stopping, the user slice stopping. The login fails
only while the user slice is stopping, which is a very short window, so
it makes sense that you see these errors only sporadically.

4. Try a login, which gives me:
May 06 17:03:12 systemd[1]: Requested transaction contradicts existing jobs: Transaction is destructive.
May 06 17:03:12 sshd[3134]: pam_systemd(sshd:session): Failed to create session: Resource deadlock avoided
May 06 17:03:12 systemd-logind[3078]: Failed to start session scope session-37.scope: Transaction is destructive.
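
To avoid guessing at the timing between steps 3 and 4, the job state
can be checked from a script before attempting the login. This is a
rough sketch, assuming the JOB/UNIT/TYPE/STATE column layout that
systemctl list-jobs prints; job_for_unit is a hypothetical helper name.

```shell
# Hypothetical helper: reads `systemctl list-jobs` output on stdin and
# prints "<type> <state>" for the given unit; prints nothing if the unit
# has no queued job.
job_for_unit() {
    awk -v unit="$1" '$2 == unit { print $3, $4 }'
}

# Usage on a live system (UID is a placeholder):
#   systemctl list-jobs --no-legend | job_for_unit user-UID.slice
# The race window is open while this prints "stop waiting".
```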

All in all, you can't do much about these errors. They're mostly
harmless (at worst you need to retry the login), and upgrading to v228
(the patchset mentioned in the previous exchange) or higher should fix
all of this for you.
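
If the failed logins come from automation, a small retry wrapper is one
way to paper over the race until you can upgrade. This is only a
sketch: retry is a hypothetical helper, and the attempt count, delay
and host in the usage line are placeholders.

```shell
# Hypothetical helper: run a command up to <tries> times, sleeping
# <delay> seconds between attempts. Returns 0 on the first success,
# 1 if every attempt fails.
retry() {
    local tries="$1" delay="$2" i=1
    shift 2
    while :; do
        "$@" && return 0
        [ "$i" -ge "$tries" ] && return 1
        i=$((i + 1))
        sleep "$delay"
    done
}

# Usage (user@host is a placeholder):
#   retry 3 1 ssh user@host true
```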

--
Kartikeya