[systemd-devel] sharing of D-Bus connection between systemd PAM modules causes problems

Norbert Braun norbert at xrpbot.org
Tue Apr 11 00:01:22 UTC 2023


Hi all,

I recently ran into a problem on Arch Linux ARM (32 bit) where logging 
in as root on the console would often, but not always, fail (much like 
in https://github.com/systemd/systemd/issues/17266). While investigating 
the problem, I found the following:

Systemd ships with two PAM modules, pam_systemd.so and 
pam_systemd_home.so. Both of these use pam_acquire_bus_connection to 
open a connection to the system bus. pam_acquire_bus_connection opens a 
connection on the first call, then uses pam_set_data and pam_get_data to 
cache the connection object for subsequent calls. Since the namespace 
for pam_set_data/pam_get_data is shared between all PAM modules, it can 
happen that one PAM module opens the connection and another one uses it. 
In my case, pam_systemd_home.so opens the connection and sends the Hello 
message. If the root user attempts to log in, pam_systemd_home.so exits 
early and leaves the connection open, to be re-used by pam_systemd.so.

This is problematic because struct sd_bus contains OrderedHashmap 
*reply_callbacks, and OrderedHashmap internally uses a global variable 
shared_hash_key. The PAM modules are statically linked with libsystemd, 
so this variable effectively exists twice, once in each of the two PAM 
modules. Since it is initialized to a random value, the value differs 
between the PAM modules. In the scenario above, it therefore differs 
between the sending of the Hello message and the processing of the 
reply. Thus, when the reply to the Hello message arrives, process_reply 
effectively looks for the reply cookie in a random hash bucket, and may 
or may not find it. In the latter case, this eventually leads to the 
somewhat cryptic error message: "pam_systemd(login:session): Failed to 
create session: Input/output error".

The problem is hidden on 64 bit systems, because the sizes of struct 
ordered_hashmap_entry and struct indirect_storage are such that an 
OrderedHashmap with direct storage only has a single bucket, and the 
value of shared_hash_key is therefore irrelevant. On a 32 bit system, 
however, the sizes are such that there are two buckets, and root login 
fails, with 50% probability, with the error message mentioned above. As 
expected from the above, it is possible to cause the problem to appear 
on a 64 bit system by changing "uint8_t _pad[3];" to "uint8_t _pad[19];" 
in struct indirect_storage (in src/basic/hashmap.c).

After the above, another problem surfaces during cleanup: bus_free calls 
ordered_hashmap_free_free(b->reply_callbacks), which calls free on each 
value in the hashmap. However, the struct reply_callback that 
sd_bus_call_async puts into the hashmap was not individually allocated, 
but is part of a larger struct sd_bus_slot. free is unhappy about that, and 
the login process finally dies with a segmentation fault, aborting the 
login attempt entirely. This problem is normally hidden by the fact that 
reply_callbacks is empty by the time that bus_free is called.

Best regards,

Norbert

