[systemd-devel] sharing of D-Bus connection between systemd PAM modules causes problems
Norbert Braun
norbert at xrpbot.org
Tue Apr 11 00:01:22 UTC 2023
Hi all,
I recently ran into a problem on Arch Linux ARM (32-bit) where logging
in as root on the console would often, but not always, fail (much like
in https://github.com/systemd/systemd/issues/17266). While investigating
the problem, I found the following:
Systemd ships with two PAM modules, pam_systemd.so and
pam_systemd_home.so. Both of these use pam_acquire_bus_connection to
open a connection to the system bus. pam_acquire_bus_connection opens a
connection on the first call, then uses pam_set_data and pam_get_data to
cache the connection object for subsequent calls. Since the namespace
for pam_set_data/pam_get_data is shared between all PAM modules, it can
happen that one PAM module opens the connection and another one uses it.
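
To illustrate the sharing, here is a minimal, self-contained sketch. It does not use the real PAM API: toy_set_data and toy_get_data are hypothetical stand-ins for pam_set_data and pam_get_data, and the key "toy-system-bus" is made up for the example. It only shows that with one shared namespace, the second caller gets the object the first caller stored.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for the per-handle data store behind pam_set_data/pam_get_data.
 * NOT the real PAM API; it only mimics "one namespace per handle". */
#define TOY_MAX_ENTRIES 8

struct toy_handle {
    const char *names[TOY_MAX_ENTRIES];
    void *values[TOY_MAX_ENTRIES];
    size_t n;
};

int toy_set_data(struct toy_handle *h, const char *name, void *value) {
    for (size_t i = 0; i < h->n; i++)
        if (strcmp(h->names[i], name) == 0) {
            h->values[i] = value;   /* same name: overwrite */
            return 0;
        }
    if (h->n == TOY_MAX_ENTRIES)
        return -1;
    h->names[h->n] = name;
    h->values[h->n] = value;
    h->n++;
    return 0;
}

void *toy_get_data(const struct toy_handle *h, const char *name) {
    for (size_t i = 0; i < h->n; i++)
        if (strcmp(h->names[i], name) == 0)
            return h->values[i];
    return NULL;
}

/* Hypothetical connection object; opened_by records which module created it. */
struct toy_bus {
    const char *opened_by;
};

/* Caching pattern: reuse whatever connection is stored under the shared key,
 * regardless of which module stored it; otherwise store the fresh one. */
struct toy_bus *toy_acquire_bus(struct toy_handle *h, struct toy_bus *fresh,
                                const char *module) {
    assert(h && fresh && module);
    struct toy_bus *cached = toy_get_data(h, "toy-system-bus");
    if (cached)
        return cached;
    fresh->opened_by = module;
    if (toy_set_data(h, "toy-system-bus", fresh) < 0)
        return NULL;
    return fresh;
}
```

If toy_acquire_bus is called first on behalf of "pam_systemd_home" and then on behalf of "pam_systemd" with the same handle, the second call returns the object created by the first.
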
In my case, pam_systemd_home.so opens the connection and sends the Hello
message. If the root user attempts to log in, pam_systemd_home.so exits
early and leaves the connection open, to be re-used by pam_systemd.so.
This is problematic because struct sd_bus contains OrderedHashmap
*reply_callbacks, and OrderedHashmap internally uses a global variable
shared_hash_key. The PAM modules are statically linked with libsystemd,
so this variable effectively exists twice, once in each of the two PAM
modules. Since it is initialized to a random value, the value differs
between the PAM modules. In the scenario above, it therefore differs
between the sending of the Hello message and the processing of the
reply. Thus, when the reply to the Hello message arrives, process_reply
effectively looks for the reply cookie in a random hash bucket, and may
or may not find it. In the latter case, this eventually leads to the
somewhat cryptic error message: "pam_systemd(login:session): Failed to
create session: Input/output error".
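
The effect of the mismatched key can be sketched with a toy table. This is a simplification under stated assumptions: toy_bucket is a made-up keyed hash (XOR and modulo), not the hash used in src/basic/hashmap.c, and each bucket holds at most one entry with no probing. The hash_key parameter plays the role of shared_hash_key as seen by one module.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_BUCKETS 2

/* Toy table: each bucket holds at most one cookie (0 means empty). */
struct toy_table {
    uint64_t slots[TOY_BUCKETS];
};

/* Toy keyed hash: the bucket depends on both the cookie and the per-module key. */
size_t toy_bucket(uint64_t cookie, uint64_t hash_key) {
    return (size_t) ((cookie ^ hash_key) % TOY_BUCKETS);
}

/* Insert as done by the module that sent the message. */
void toy_put(struct toy_table *t, uint64_t cookie, uint64_t hash_key) {
    assert(cookie != 0);
    t->slots[toy_bucket(cookie, hash_key)] = cookie;
}

/* Lookup as done by the module that processes the reply. */
bool toy_contains(const struct toy_table *t, uint64_t cookie, uint64_t hash_key) {
    return t->slots[toy_bucket(cookie, hash_key)] == cookie;
}
```

Inserting cookie 1 with key 0x1234 and looking it up with the same key succeeds. Looking it up with key 0x4321 selects the other bucket and fails, even though the entry is still in the table.
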
The problem is hidden on 64-bit systems, because the sizes of struct
ordered_hashmap_entry and struct indirect_storage are such that an
OrderedHashmap with direct storage only has a single bucket, and the
value of shared_hash_key is therefore irrelevant. On a 32-bit system,
however, the sizes are such that there are two buckets, and root login
fails, with 50% probability, with the error message mentioned above. As
expected from the above, it is possible to cause the problem to appear
on a 64-bit system by changing "uint8_t _pad[3];" to "uint8_t _pad[19];"
in struct indirect_storage (in src/basic/hashmap.c).
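
The dependence on the bucket count can be checked by brute force in a toy model. The sketch below makes two assumptions that are not from the real code: the keyed hash is again XOR plus modulo, and the keys are restricted to 8 bits so that all pairs can be enumerated. It counts how often the inserting key and the looking-up key select the same bucket.

```c
#include <assert.h>
#include <stdint.h>

/* Over all 256 * 256 pairs of 8-bit toy keys, count how often the key used for
 * insertion and the key used for lookup map the cookie to the same bucket. */
unsigned toy_count_matches(uint64_t cookie, unsigned n_buckets) {
    unsigned matches = 0;
    assert(n_buckets > 0);
    for (unsigned a = 0; a < 256; a++)
        for (unsigned b = 0; b < 256; b++)
            if ((cookie ^ a) % n_buckets == (cookie ^ b) % n_buckets)
                matches++;
    return matches;
}
```

With one bucket every one of the 65536 key pairs matches, so the key is irrelevant. With two buckets exactly half of them (32768) match, which corresponds to the 50% failure rate described above.
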
After the above, another problem surfaces during cleanup: bus_free calls
ordered_hashmap_free_free(b->reply_callbacks), which calls free on each
value in the hashmap. However, the struct reply_callback that
sd_bus_call_async puts into the hashmap was not individually allocated,
but part of a larger struct sd_bus_slot. free is unhappy about that, and
the login process finally dies with a segmentation fault, aborting the
login attempt entirely. This problem is normally hidden by the fact that
reply_callbacks is empty by the time that bus_free is called.
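
The allocation issue can be sketched as follows. The struct layouts are hypothetical and heavily simplified (the real struct sd_bus_slot and struct reply_callback have more fields). The point is only that a pointer to an embedded member is not the pointer that malloc returned, so it must not be passed to free.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, simplified layouts for illustration only. */
struct toy_reply_callback {
    unsigned long cookie;
};

struct toy_slot {
    unsigned n_ref;
    struct toy_reply_callback reply_callback;   /* embedded, not separately allocated */
};

/* Recover the enclosing allocation from a pointer to the embedded member. */
struct toy_slot *toy_slot_from_callback(struct toy_reply_callback *c) {
    assert(c);
    return (struct toy_slot *) ((char *) c - offsetof(struct toy_slot, reply_callback));
}

/* Returns 1 only if c is also the start of the allocation, i.e. only then
 * would calling free(c) be valid. */
int toy_callback_is_allocation_start(struct toy_reply_callback *c) {
    return (void *) c == (void *) toy_slot_from_callback(c);
}
```

For a toy_slot obtained from calloc, toy_callback_is_allocation_start returns 0 for the address of its reply_callback member. Only the pointer returned by toy_slot_from_callback may be handed to free.
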
Best regards,
Norbert