[systemd-devel] Cannot use systemctl after heavy swapping

Wed Jan 7 07:59:50 PST 2015

Hello!

I seem to have reproduced this issue. After a lot of swapping, systemd 
appeared to have become stuck. Trying to restart services with systemctl 
blocked indefinitely. Strangely, this seemed to be the case even after a 
reboot.

Here is part of the output of strace -p 1:

recvmsg(16, 0x7fff52622560, MSG_DONTWAIT|MSG_NOSIGNAL|MSG_CMSG_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
epoll_wait(4, {{EPOLLOUT, {u32=3793072544, u64=140341849469344}}}, 29, 0) = 1
clock_gettime(CLOCK_BOOTTIME, {863156, 624419539}) = 0
recvmsg(16, 0x7fff52622560, MSG_DONTWAIT|MSG_NOSIGNAL|MSG_CMSG_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
epoll_wait(4, {{EPOLLOUT, {u32=3793072544, u64=140341849469344}}}, 29, 0) = 1
clock_gettime(CLOCK_BOOTTIME, {863156, 624668458}) = 0
recvmsg(16, 0x7fff52622560, MSG_DONTWAIT|MSG_NOSIGNAL|MSG_CMSG_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
epoll_wait(4, {{EPOLLOUT, {u32=3793072544, u64=140341849469344}}}, 29, 0) = 1
clock_gettime(CLOCK_BOOTTIME, {863156, 624919333}) = 0
recvmsg(16, 0x7fff52622560, MSG_DONTWAIT|MSG_NOSIGNAL|MSG_CMSG_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
epoll_wait(4, {{EPOLLOUT, {u32=3793072544, u64=140341849469344}}}, 29, 0) = 1
clock_gettime(CLOCK_BOOTTIME, {863156, 625167344}) = 0
recvmsg(16, 0x7fff52622560, MSG_DONTWAIT|MSG_NOSIGNAL|MSG_CMSG_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
epoll_wait(4, {{EPOLLOUT, {u32=3793072544, u64=140341849469344}}}, 29, 0) = 1
clock_gettime(CLOCK_BOOTTIME, {863156, 625417381}) = 0
recvmsg(16, 0x7fff52622560, MSG_DONTWAIT|MSG_NOSIGNAL|MSG_CMSG_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
epoll_wait(4, {{EPOLLOUT, {u32=3793072544, u64=140341849469344}}}, 29, 0) = 1
clock_gettime(CLOCK_BOOTTIME, {863156, 625665881}) = 0
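
As a rough sanity check on how fast PID 1 is spinning, the CLOCK_BOOTTIME
values in the trace can be diffed. Below is a minimal Python sketch; the
helper names are mine (not anything from systemd or strace), and the
timestamps are copied from the fragment above. Note that strace itself slows
the traced process down, so the untraced loop is probably faster than this
suggests.

```python
import re

# clock_gettime lines copied from the strace fragment above.
TRACE = """\
clock_gettime(CLOCK_BOOTTIME, {863156, 624419539}) = 0
clock_gettime(CLOCK_BOOTTIME, {863156, 624668458}) = 0
clock_gettime(CLOCK_BOOTTIME, {863156, 624919333}) = 0
clock_gettime(CLOCK_BOOTTIME, {863156, 625167344}) = 0
clock_gettime(CLOCK_BOOTTIME, {863156, 625417381}) = 0
clock_gettime(CLOCK_BOOTTIME, {863156, 625665881}) = 0
"""

# Matches the {seconds, nanoseconds} pair that strace prints for the timespec.
PATTERN = re.compile(r"clock_gettime\(CLOCK_BOOTTIME, \{(\d+), (\d+)\}\)")


def boottime_ns(trace):
    """Return every CLOCK_BOOTTIME reading in the trace as integer nanoseconds."""
    return [int(sec) * 1_000_000_000 + int(nsec)
            for sec, nsec in PATTERN.findall(trace)]


def iteration_gaps_us(trace):
    """Return the time between consecutive readings, in microseconds."""
    stamps = boottime_ns(trace)
    return [(later - earlier) / 1000 for earlier, later in zip(stamps, stamps[1:])]


gaps = iteration_gaps_us(TRACE)
mean_us = sum(gaps) / len(gaps)
print("gaps (us):", gaps)
print("mean gap: %.1f us, about %.0f iterations/s" % (mean_us, 1_000_000 / mean_us))
```

With these six readings the gaps all come out close to 250 microseconds, i.e.
roughly 4000 loop iterations per second while being traced.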

systemd --version prints

systemd 215
+PAM +AUDIT +SELINUX +IMA +SYSVINIT +LIBCRYPTSETUP +GCRYPT +ACL +XZ 
-SECCOMP -APPARMOR

After a second reboot, the problem seems to have disappeared.

-Alan

On 12/02/2014 05:17 PM, Lennart Poettering wrote:
> On Fri, 14.11.14 15:20, Jan Janssen (medhefgo at web.de) wrote:
>
>> Hi,
>>
>> I think there might be something wrong with how the rate limiting works in
>> manager.c. Just recently, firefox went nuts and got the whole system
>> swapping like crazy. After manual OOM killing, the system is back to normal,
>> but I can't seem to do any service management with systemctl afterwards.
>>
>> A simple "sudo systemctl start systemd-timedated.service" will hang forever.
>> While the journal keeps getting this message about every second:
>>      systemd[1]: Looping too fast. Throttling execution a little.
>> while other systemctl actions tend to time out (status, for example).
>>
>> Interestingly, if I don't use sudo (and instead rely on polkit), everything
>> seems to work as expected and I can get things started.
>>
>> This is all on systemd 217 on up-to-date Arch.
> Hmm, the "looping too fast" msg is usually triggered by systemd for
> some reason entering a busy loop, which is a bug we really should track
> down and fix. Any chance you can use "strace -p 1" when this happens
> to see what PID 1 is spinning on there? If in doubt please attach a
> fragment here.
>
> Thanks,
>
> Lennart
>
