tdf#109085: status
Kaganski Mike
mikekaganski at hotmail.com
Mon Aug 14 10:56:09 UTC 2017
Recently I attempted to fix tdf#109085 [1]. As the attempt wasn't
successful, I am posting this report so that others can offer
suggestions, or have the background if they decide to work on this.
The problem (as described in the issue) is caused by improper shutdown
of LO during Windows shutdown or logoff. Documents are not closed
properly (restore information is kept, even for files without changes),
and the lock file for the user profile is left behind.
The first manifestation of this is that LibreOffice shows the recovery
dialog on next start. This happens e.g. when the user saved all open
documents and then, without closing LibreOffice, initiated
logoff/shutdown. So, the recovery dialog is unexpected, and clearly
wrong.
The second problem is visible in environments where the LibreOffice
user profile may be moved across the network with the Windows user
profile, e.g., the Roaming profiles feature of an Active Directory
domain [2]. Normally, LibreOffice detects that the profile lock file was
left behind by mistake (by checking that the system name and user name
are the same), and simply ignores the existing lock. But in the case of
a roaming profile, the lock was created on some workstation1, where the
user initiated the logoff. The LibreOffice profile (with the residual
lock file) gets synchronised to the server, and then to another
workstation where the user logs on next time. So, when LibreOffice
starts there, it detects that the system name is different, and assumes
that the same profile is being used simultaneously from different
instances. It then emits a warning:
> "Either another instance of Program is accessing your personal
> settings or personal settings being locked.
> Simultaneous access can lead to inconsistencies in your personal
> settings. Before continuing, you should make sure user 'X' closes
> Program of host 'Y'"
Of course, that confuses the user.
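The decision described above can be sketched as follows. This is only an
illustration of the rule (same user and same host means a leftover lock);
the LockInfo structure, the LockVerdict enum and the checkLock function
are hypothetical names, not LibreOffice's actual code or lock file
format.

```cpp
#include <string>

// Hypothetical record of what a profile lock stores (assumption: the
// real lock file format is not reproduced here).
struct LockInfo {
    std::string user;
    std::string host;
};

enum class LockVerdict { NoLock, StaleIgnore, WarnUser };

// Decide what to do with an existing lock: a lock left by the same user
// on the same host is treated as a leftover and ignored; anything else
// looks like simultaneous access and triggers the warning.
LockVerdict checkLock(const LockInfo* existing,
                      const std::string& currentUser,
                      const std::string& currentHost)
{
    if (existing == nullptr)
        return LockVerdict::NoLock;
    if (existing->user == currentUser && existing->host == currentHost)
        return LockVerdict::StaleIgnore;
    return LockVerdict::WarnUser;
}
```

With a lock left on workstation1, the same user starting on workstation1
gets StaleIgnore, while starting on another workstation gets WarnUser,
which is the roaming-profile case.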
When closed normally, LibreOffice performs multiple cleanup steps both
before and after terminating the program's main message loop. The
peculiarity of the Windows shutdown/logoff sequence is that Windows does
not wait for programs to close themselves; it only waits for them to
process two specific window messages, WM_QUERYENDSESSION and
WM_ENDSESSION, sent to all of the program's top-level windows. After the
messages return a result, the system is free to forcefully terminate the
program at any later moment. So, the program must be ready for that by
the time it returns from the last WM_ENDSESSION handler.
In my research, I found that Windows performs the following steps when
it shuts down (tested with Win10):
1. It takes one of the programs with the highest shutdown priority (see
the SetProcessShutdownParameters function [3]).
1.1. It starts sending WM_QUERYENDSESSION messages to all its top-level
windows, supposedly in LIFO order. It seems to wait for one window to
return from the message before sending it to the next window.
1.2. If a window fails to process this message within 5 seconds, Windows
shows UI telling that an app either does not respond, or waits for user
input (depending on the status of the app's message queue), and the user
may either cancel the shutdown, or continue (terminating the app). In
modern Windows (Vista+), the UI is obtrusive: it covers the whole screen
and hides the applications. E.g., the user will not see the application
dialog asking whether the document should be saved, unless the user
cancels the shutdown in that UI.
1.3. If one of the messages returns FALSE, i.e., the program refused to
close (e.g., the user answered "Cancel" to the application's request to
close a modified file), Windows shows UI telling that some app prevents
the shutdown. Again, the UI is obtrusive, and offers to cancel or
continue (terminating the app).
1.4. When all WM_QUERYENDSESSION messages have been processed, Windows
starts to send WM_ENDSESSION messages to the windows of the same program
in the same order. The messages convey the final decision (to shut down
or not). If all WM_QUERYENDSESSION handlers returned TRUE, then the
final decision is to shut down, naturally. If one of them returned
FALSE, then the final decision depends on the user's choice ("cancel"
leads to no shutdown, "continue" leads to forced shutdown). Each window
is given another 5-second time span, and if it fails to return from the
handler, the UI is shown to the user again, reporting a hung program.
1.5. Windows ignores the return value from the WM_ENDSESSION handler.
Only the fact that the handler has completed matters.
2. When all WM_ENDSESSION messages have been processed, Windows
continues with the next application of the same shutdown priority, then
moves on to applications with lower priorities.
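To make the ordering in steps 1-2 easier to follow, here is a small
portable model of it. It is a sketch of my observations above, not of
any Windows API: the Window, App and UserChoice types and the
simulateShutdown function are hypothetical, the 5-second timeouts are
not modelled, and it assumes that the remaining windows of a program
still receive WM_QUERYENDSESSION after one of them has returned FALSE.

```cpp
#include <algorithm>
#include <functional>
#include <string>
#include <vector>

// One top-level window with its two handlers (hypothetical model types).
struct Window {
    std::string name;
    std::function<bool()> onQueryEndSession;  // TRUE = agrees to end
    std::function<void(bool)> onEndSession;   // argument: session is ending
};

struct App {
    std::string name;
    unsigned shutdownLevel;       // higher level is processed first
    std::vector<Window> windows;  // already in the order Windows uses
};

// What the user picks in the full-screen UI when some window vetoes.
enum class UserChoice { Cancel, Continue };

// Returns true if the session ends, false if the shutdown was cancelled.
bool simulateShutdown(std::vector<App> apps, UserChoice onVeto)
{
    // Steps 1 and 2: programs are taken from high to low shutdown level.
    std::stable_sort(apps.begin(), apps.end(),
                     [](const App& a, const App& b) {
                         return a.shutdownLevel > b.shutdownLevel;
                     });
    for (App& app : apps) {
        // Steps 1.1-1.3: query every window, one after another.
        bool allAgreed = true;
        for (Window& w : app.windows)
            if (!w.onQueryEndSession())
                allAgreed = false;
        // Step 1.4: the final decision is sent to the same windows.
        const bool ending = allAgreed || onVeto == UserChoice::Continue;
        for (Window& w : app.windows)
            w.onEndSession(ending);  // step 1.5: no return value is used
        if (!ending)
            return false;  // shutdown cancelled, later programs untouched
    }
    return true;
}
```

The point the model makes visible is that a program gets no further
notification after its last onEndSession call returns.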
Currently, LibreOffice handles these messages in SalFrameWndProc. This
is the handler for user-visible (document or start center) windows. Only
the first WM_QUERYENDSESSION handler to be called does the real work. It
emits SalEvent::Shutdown, which (in ImplWindowFrameProc) calls
GetpApp()->QueryExit() (followed by Application::Quit() on success).
QueryExit() tries to close all open frames (which may ask the user
whether to save changes), then starts the shutdown steps (its last task
is to terminate the main message loop). It returns false if the user
decided to cancel closing a document, and this is passed on as the
WM_QUERYENDSESSION handler's return value. So, at the end of the first
WM_QUERYENDSESSION handler, we have either denied the shutdown, or
already closed all documents (!), so at least there should be no
recovery dialogs on next launch. The subsequent WM_QUERYENDSESSION
handlers do nothing (naturally), as nothing more should be necessary.
The WM_ENDSESSION handler is only meant to reset the machinery in case
the shutdown was successfully interrupted (to be able to process it
again in future).
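The flow just described can be modelled roughly like this. It is a
simplified sketch, not the code from SalFrameWndProc: the ShutdownGate
class and its queryExit hook (standing in for GetpApp()->QueryExit())
are hypothetical, the message values are redefined locally so that the
sketch builds without <windows.h>, and having the later
WM_QUERYENDSESSION handlers simply return TRUE is my assumption.

```cpp
#include <cstdint>
#include <functional>
#include <utility>

// Values of WM_QUERYENDSESSION and WM_ENDSESSION from the Windows
// headers, redefined under other names to keep the sketch portable.
constexpr std::uint32_t kWmQueryEndSession = 0x0011;
constexpr std::uint32_t kWmEndSession = 0x0016;

class ShutdownGate {
public:
    // queryExit is a hypothetical stand-in for GetpApp()->QueryExit().
    explicit ShutdownGate(std::function<bool()> queryExit)
        : m_queryExit(std::move(queryExit)) {}

    // Returns what the window procedure would return for the message.
    // sessionEnding models the wParam of WM_ENDSESSION.
    long handle(std::uint32_t msg, bool sessionEnding = false)
    {
        switch (msg) {
        case kWmQueryEndSession:
            if (m_inQueryEnd)
                return 1;  // assumption: later windows just agree
            m_inQueryEnd = true;
            return m_queryExit() ? 1 : 0;  // only the first call works
        case kWmEndSession:
            if (!sessionEnding)
                m_inQueryEnd = false;  // shutdown interrupted: re-arm
            return 0;  // Windows ignores this value
        default:
            return 0;
        }
    }

private:
    std::function<bool()> m_queryExit;
    bool m_inQueryEnd = false;
};
```

Note that in this scheme all the real work happens inside the first
WM_QUERYENDSESSION call, and nothing is left for WM_ENDSESSION to finish.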
Everything looks OK (well, at least for documents, if not for the
profile lock), but real life shows that it just doesn't work.
One guess of mine was that Windows somehow detects that the window (to
which the current message was sent) was destroyed during the handler,
and that this leads to process termination (I cannot say there's much
sense in that idea). I tried to restructure the processing: each handler
only closed its own window, and the whole application shutdown was
delegated to the handler (in SalComWndProc) of LibreOffice's special
hidden service window. But that didn't solve the problem. Sending the
messages manually (from a utility like StefanTools' SendMessage [4])
always succeeds, but actual shutdown tests still fail. Also, I doubt
that Windows would treat window destruction that way. And I haven't
found any evidence that Windows can terminate an application in such
circumstances (even when processing takes >5s, it only shows the UI when
the shutdown is not forced).
I also considered the possibility that the sequence somehow throws (or
segfaults) in LibreOffice, thus aborting the correct shutdown sequence.
But there's no notification about that on screen (or from the
crashreporter), and sending the messages manually always succeeds, as I
mentioned. Nor had the system shut down any services or resources that
LibreOffice might depend on: this can be seen by starting LibreOffice
after the system shutdown has been interrupted by some program with a
lower shutdown priority.
My attempts are available on gerrit [5]. I have paused further attempts
for a while. My next try will be to give our guard process on Windows
(soffice.exe) a higher shutdown priority, and to send the shutdown
message to soffice.bin from there. Maybe that would allow working around
the situation (though I still don't think that's a proper solution). It
would also lead to unclear messages when soffice.exe waits for
soffice.bin while it shows a "Save?" dialog: currently, Windows tells
the user that a program is waiting for user input; with that change, it
would say that a program is hung. :(
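For reference, raising the shutdown priority of the guard process could
look roughly like the sketch below. SetProcessShutdownParameters is the
real Win32 function from [3]; the helper names guardLevelAbove and
raiseGuardShutdownPriority are hypothetical, as is the choice of "one
level above the default". The sketch assumes soffice.bin stays at the
documented default level 0x280, and that application levels end at
0x3FF. This is untested as a fix for the bug.

```cpp
#ifdef _WIN32
#include <windows.h>
#endif

// Documented default shutdown level of a process, and the top of the
// range that applications may use (higher levels are shut down first).
constexpr unsigned long kDefaultLevel = 0x280;
constexpr unsigned long kMaxAppLevel = 0x3FF;

// Pick a level for the guard process just above the worker's level,
// without leaving the application range.
constexpr unsigned long guardLevelAbove(unsigned long workerLevel)
{
    return workerLevel < kMaxAppLevel ? workerLevel + 1 : kMaxAppLevel;
}

// Would be called early in the guard process (soffice.exe in this
// idea). Returns true if the priority was changed.
bool raiseGuardShutdownPriority()
{
#ifdef _WIN32
    // Second argument: flags; 0 keeps the default retry behaviour.
    return SetProcessShutdownParameters(guardLevelAbove(kDefaultLevel), 0) != 0;
#else
    return false;  // not applicable outside Windows
#endif
}
```

With the guard at 0x281 and soffice.bin at 0x280, the guard would receive
its WM_QUERYENDSESSION first and could then ask soffice.bin to close
before Windows gets to it.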
[1] https://bugs.documentfoundation.org/show_bug.cgi?id=109085
[2] https://en.wikipedia.org/wiki/Roaming_user_profile
[3] https://msdn.microsoft.com/en-us/library/ms686227
[4] http://stefanstools.sourceforge.net/SendMessage.html
[5] https://gerrit.libreoffice.org/39884
--
Best regards,
Mike Kaganski