[systemd-devel] Randomly on shutdown, stop timeout for user at .service (repeated report, different user)

Toms Seisums toms.seisums at gmail.com
Fri Oct 4 09:37:10 PDT 2013


Next reboot after the user at 0.service start/stop cycle was a quick one. It
looks as something gets messed up upon logging in. And if not resolved, the
system has problems shutting down.


On 4 October 2013 19:31, Toms Seisums <toms.seisums at gmail.com> wrote:

> @Andrey, the full log can be found here, the link was there before also:
> http://pastebin.com/wbr04AQw (systemd-207)
> It's quite silent about the cause for the timeout, though.
>
> I've been debugging this for three days now, and I've crystallized a
> little bit of information.
>
> 1. If I boot the system and stay on login screen, no matter when I hit
> Ctrl+Alt+Del (ASAP or after an hour), the system will always reboot
> normally.
> 2. If I login, and instantly type `reboot`, in about 75% the system will
> reboot normally, the other 25% being slow, resulting in user at 0.servicetimeout.
> 2.1. If I login and wait for some time, the next reboot is going to be
> slow.
> 3. I'm using Arch, and they had a bug report already there -
> https://bugs.archlinux.org/task/37007, it has come down to that
> systemd-208 fixes it, though, I've updated and the problem still remains.
> 4. I've attempted numerous daemon configurations, but the results are
> inconsistent. Due to that Arch issue being closed and not reopened, I did
> this, in order to find out whether running daemons aren't at fault.
> 4.1. I, for some reboot's disabled every daemon (not also running them
> after the system booted up), and the reboots/shutdowns were still slow.
> 4.1.1. The Ctrl+Alt+Del before logging in, is still fast.
> 5. Downgrading to systemd-204 works flawlessly, also removing the issue
> that's described in next point.
> 6. Whenever I boot with systemd-207 or systemd-208 and login, I am
> presented with:
>
>     Failed to open private bus connection: Failed to connect to socket
> /run/user/0/dbus/user_bus_socket: No such file or directory
>
> 6.1. The issue appears to already be reported by @Andrey - here:
> http://lists.freedesktop.org/archives/systemd-devel/2013-September/013393.html
> 7. Apparently, I specially set up a system running in VBox, also an Arch
> Linux, that didn't have the problem neither running with systemd-207,
> neither has it with systemd-208 (that actually led me to disable the
> daemons).
> 7.1. I did encounter it once though, but since then, it hasn't happened.
> 8. If, before running shutdown, I manually: `systemctl stop user at 0.service`,
> the system shuts down prefectly.
>
> Here are some logs of shutdown with systemd-208:
>
>  1. Quick, Ctrl+Alt+Del before logging in. (1:
> http://pastebin.com/5X0sUHCV)
>  2. Slow, executed a little later after logging in. (2:
> http://pastebin.com/tjnLNJKH)
>  3. Slow, executed by Ctrl+Alt+Del after logging in, doing some tasks,
> then `logout`. (3: http://pastebin.com/4hEmRYen)
>  4. I wanted to catch a fast reboot with a full login and full daemons
> running, but couldn't manage to - they all were slow.
>  5. It also happens if I log in with a different user than root and push
> Ctrl+Alt+Del. (4: http://pastebin.com/U36WqDzP)
>  5.1. This is just a test for you @Lennart, just to see if a different
> user changes anything.
>  6. `systemctl stop user at 0.service` to `reboot`, that results in a quick
> one. (5: http://pastebin.com/aJ0s32MR) (I, for a second thought that
> Apache might be causing some issues, because it almost always gets killed,
> so, I disabled it for some tests, this log has no Apache information -
> that's just in case you're wondering why the logs don't match. More on
> Apache adventures below...)
>  6.1. I reran it after Apache tests to clarify, quick reboot the first
> time (9: http://pastebin.com/HcUA72AE). The second time. (10:
> http://pastebin.com/rf96QiGA) The third time. (11:
> http://pastebin.com/bpFu1sQK) All in a row. I did 2 more to test for the
> next point, all quick.
>  6.2. This doesn't seem to change the SFTP behavior, though.
>
> An extra piece of information I can provide, is that, this appears to
> happen to break SFTP. If I boot, then connect from my other PC with SFTP
> (as root), and initiate a reboot from tty0, my SFTP connection will not be
> dropped. It will actually never even timeout. I could restart my Linux
> machine for about 100times, and the SFTP window will not even blink,
> thinking that it's connected. Only when I refresh the SFTP window, I get an
> error message, asking me to reconnect. With systemd-204 there is no such an
> issue - it disconnects immediately. Though, on my VBox Arch, there wasn't
> such an issue with systemd-207, nor with systemd-208.
>
>
> And @Lennart, you asked, I delivered, here is a log with user at 0.servicestart/stop with 0.5second intervals between start/stop calls. I stripped
> the booting sequence, starting the log off at about where the system ended
> it's boot. (12: http://sharetext.org/zJc, the link is different because I
> exceeded pastebins free limit, doh).
> Nothing special there, except that dbus thingy.
>
> ---
>
> Apache:
>
> So, noticing that Apache gets killed most of time, and systemd-204 also
> had some delays where it waited for Apache to exit, I thought that the
> problem is with Apache.
>
> I started this debugging scenario by simply doing a "systemctl stop httpd"
> into "reboot", that made my system to reboot quickly. (6:
> http://pastebin.com/4NgxVAeP)
>
> Though, one `httpd` instance still remains (occasionally, when I
> "systemctl stop httpd", "ps aux" immediately after - will still display one
> httpd process), that appears to actually be a bug in Apache's (I guess),
> that appears to be the one that has to be explicitly killed at the end of
> reboot process.
>
> By partially confirming, that this is because of Apache, I stopped it upon
> next boot, disabled it, and rebooted. Starting freshly with "ps aux"
> reporting no signs of httpd, I attempted to "reboot" - guess what? That's a
> long one. Three times in a row with no httpd upon boot - all reboots were
> long, I guess that's not Apache then.
>
> While writing the above and debugging, after having enabled Apache back
> and stopping it upon boot, even though previous reboots executed
> immediately, one didn't. (7: http://pastebin.com/3FnikCX9) Apparently the
> next one after it also didn't. (8: http://pastebin.com/pnWhuQWC)
>
> ---
>
> P.S. While debugging this, and trying to link some of the problems to some
> of the daemons, I'm starting to feel as if I'd hit a delusion spree. While
> occasionally it reboots fast, it might as well be just the random
> quick-reboot I've been hunting before. Just that stopping those daemons
> makes me think the problem is tied to them.
>
>
> ---
>
> Loads of information, some of it are probably total duplicates, but I hope
> I've given enough and that you'll be able to understand what I've provided.
> More importantly, I hope that this will help to resolve these issues.
>
>
> On 1 October 2013 03:43, Lennart Poettering <lennart at poettering.net>wrote:
>
>> On Mon, 30.09.13 18:55, Toms Seisums (toms.seisums at gmail.com) wrote:
>>
>> > It appears that history repeats itself.
>> >
>> > Once (July 2013) a similar issue was reported here, by different user:
>> >
>> http://lists.freedesktop.org/archives/systemd-devel/2013-July/012283.html
>> >
>> > I'm also using Arch Linux, systemd-207, linux-3.11.2.
>> >
>> > On shutdown and reboot, user at 0.service occasionally freezes my system
>> until
>> > it times out and gets killed.
>> >
>> > Steps to reproduce:
>> > 1) Boot
>> > 2) Login
>> > 3) Execute either systemctl reboot or reboot or shutdown -h now or
>> whatever.
>> >
>> > I've went through the steps here, to get a proper shutdown log:
>> >
>> http://freedesktop.org/wiki/Software/systemd/Debugging/#shutdowncompleteseventually
>> >
>> > And, the shutdown log can be found here: http://pastebin.com/wbr04AQw
>> >
>> > The actual lines related to issue:
>> >
>> > #2223 [   74.387130] systemd[1]: Got D-Bus request:
>> > org.freedesktop.DBus.Local.Disconnected() on /org/freedesktop/DBus/Local
>> > #2224 [  163.814925] systemd[1]: user at 0.service stopping timed out.
>> Killing.
>> > #2225 [  163.815447] systemd[1]: user at 0.service changed stop-sigterm ->
>> > stop-sigkill
>> > #2226 [  163.816063] systemd[1]: Received SIGCHLD from PID 326
>> (systemd).
>> > #2227 [  163.816093] systemd[1]: Got SIGCHLD for process 326 (systemd)
>> > #2228 [  163.816153] systemd[1]: Child 326 died (code=killed,
>> status=9/KILL)
>> > #2229 [  163.816158] systemd[1]: Child 326 belongs to user at 0.service
>> > #2230 [  163.816171] systemd[1]: user at 0.service: main process exited,
>> > code=killed, status=9/KILL
>> > #2231 [  163.816183] systemd[1]: user at 0.service changed stop-sigkill ->
>> > failed
>> >
>> > I currently haven't found out a pattern for this. I have attempted to
>> quit
>> > some processes before initiating the shutdown/reboot, but the result is
>> > random.
>> >
>> > About 95% of shutdowns/reboots it will be a long one.
>> > 3% it will be noticeably longer than the quickest one, but shorter than
>> the
>> > above one.
>> > Remaining 2%, the shutdown will be very quick.
>>
>> Hmm, this is the systemd user instance for root which fails to shut
>> down. Can you reproduce this if you start/stop "user at 0.service" quickly
>> in a loop?
>>
>> Lennart
>>
>> --
>> Lennart Poettering - Red Hat, Inc.
>>
>
>
>
> --
> Toms Seisums
>
> - (+371) 26431391
>
> May the force be with you, always!
>



-- 
Toms Seisums

- (+371) 26431391

May the force be with you, always!
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.freedesktop.org/archives/systemd-devel/attachments/20131004/a03c9c2d/attachment-0001.html>


More information about the systemd-devel mailing list