Help/advice asked with deadlocks when opening Visio objects in Writer
Michael Meeks
michael.meeks at collabora.com
Wed Jan 15 12:56:54 PST 2014
Hi Winfried,
On Wed, 2014-01-08 at 12:00 +0100, Winfried Donkers wrote:
> Background information:
> The company I work for uses MS Visio to create illustrations, which are
> embedded into Writer documents (and not saved separately as Visio document).
Right =) hopefully this is getting a bit better these days with the
Visio & EMF+ fixes we've been doing for 4.2.
> Since we started using LibreOffice versions later than 3.5 (I think), we
> have seemingly random problems with LibreOffice freezing when opening an
> embedded Visio object.
OK - do you have a stack trace? If you run 4.2 you can of course get
debugging symbols for it, which would at least make the problem
debuggable - a deadlock is a beast that gives a nice stack trace (if you
get all threads).
> The problem is very hard to reproduce (I have been trying for months, and
> have succeeded only once*), but it can occur multiple times on a single
> day for a single user.
If we can catch it there and get a good bug filed we can perhaps do
something about it.
> The only way out is to kill LibreOffice (or Visio if you're lucky) with
> loss of recent changes as result. For my colleagues it is an extremely
> annoying problem and it feeds strong anti-LibreOffice feelings.
Sounds like it would =)
> Today, I had a breakthrough: a colleague reported that he received an
> error message, "Algemene OLE fout". Normally, we don't get that, LibreOffice
> just freezes.
> This string and opengrok led to ERRCODE_SO_GENERALERROR belonging to the string,
> /core/sfx2/source/view/ipclient.cxx, which is the only file where ERRCODE_SO_GENERALERROR is used.
How do you get from there to here:
> This led to a TODO-comment in /core/embeddedobj/source/commonembedding/embobj.cxx, OCommonEmbeddedObject::doVerb( ... ):
> " TODO: a gross hack to avoid deadlocks [...] "
Of course - that code is a shambles - as is most of our 'threading'
code, which is mostly based on superstition and a feeling of safety bred
from races not happening that often in the real world =)
> I know that the SolarMutex issue is getting attention, and that area is
> far beyond my capabilities.
I rather suspect that the SolarMutex is not the problem in itself; but
the non-Solar mutex :-) Solar Mutex locking is relatively tractable and
comprehensible. The problem mostly comes when people try to be too
clever and use another mutex: which is a recipe for deadlocks.
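
To illustrate the point about a second mutex, here is a minimal C++ sketch of a lock-order inversion. It is an assumption-laden toy: `solar` and `other` are hypothetical std::mutex stand-ins (not LibreOffice's real SolarMutex or any mutex in embobj.cxx), and `demonstrateLockOrderInversion` is an invented name. It uses try_lock so that it reports the situation instead of actually hanging.

```cpp
#include <cassert>
#include <future>
#include <mutex>
#include <thread>

// Hypothetical illustration only: neither mutex is LibreOffice's real code.
struct InversionResult {
    bool mainWouldBlock;    // main holds 'solar', cannot get 'other'
    bool workerWouldBlock;  // worker holds 'other', cannot get 'solar'
};

InversionResult demonstrateLockOrderInversion()
{
    std::mutex solar;  // stands in for the one big lock
    std::mutex other;  // stands in for a "clever" private mutex

    std::promise<void> workerHoldsOther;
    std::future<void> workerReady = workerHoldsOther.get_future();
    std::promise<void> mainHasProbed;
    std::future<void> mainProbed = mainHasProbed.get_future();
    bool workerWouldBlock = false;

    // Main thread takes 'solar' first and keeps it for the whole demo.
    std::unique_lock<std::mutex> solarGuard(solar);

    std::thread worker([&] {
        // Worker takes the locks in the opposite order: 'other' first.
        std::unique_lock<std::mutex> otherGuard(other);
        workerHoldsOther.set_value();
        // With lock() instead of try_lock this would wait forever.
        std::unique_lock<std::mutex> probe(solar, std::try_to_lock);
        workerWouldBlock = !probe.owns_lock();
        mainProbed.wait();  // keep holding 'other' until main has probed
    });

    workerReady.wait();
    bool mainWouldBlock;
    {
        // Main now wants 'other' while still holding 'solar'.
        std::unique_lock<std::mutex> probe(other, std::try_to_lock);
        mainWouldBlock = !probe.owns_lock();
    }
    mainHasProbed.set_value();
    worker.join();
    assert(solarGuard.owns_lock());  // main held 'solar' throughout
    return {mainWouldBlock, workerWouldBlock};
}
```

If both threads had called lock() rather than try_lock, each would wait on the mutex the other holds, which is the classic two-mutex deadlock. With a single mutex taken in a single order this cannot arise.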
> But could there be a way to recognize these deadlocks and kill these
> deadlocks without killing the LibreOffice application ?
Assume that we can write a quick mutex (and bear in mind that we
pointlessly take bazillions of these and release them again - just for
the sheer joy of it ;-) often on each method call) that can detect a
deadlock. Then what do we do? :-) Of course, we could try to break the
lock, steal the mutex from the other thread, let one thread run on,
and hand the mutex back later when it was unlocked ;-) but - it seems
like a horrific way to try to deal with the underlying issue.
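
As a rough sketch of what "a mutex that can detect a deadlock" might look like in its simplest form, the C++ below wraps std::timed_mutex and treats a timed-out wait as a suspected deadlock. This is an assumption, not anything in LibreOffice: `SuspectingMutex`, `lockOrSuspect` and `secondThreadGivesUp` are invented names, and a timeout is only a heuristic (a slow but healthy thread looks the same as a deadlocked one). It deliberately does not attempt the lock stealing described above.

```cpp
#include <cassert>
#include <chrono>
#include <mutex>
#include <thread>

// Hypothetical helper: not part of LibreOffice or the standard library.
class SuspectingMutex {
public:
    void lock() { m_mutex.lock(); }

    // Returns true if the lock was acquired, false if the wait timed out,
    // which the caller may treat as a *suspected* deadlock.
    bool lockOrSuspect(std::chrono::milliseconds patience)
    {
        return m_mutex.try_lock_for(patience);
    }

    void unlock() { m_mutex.unlock(); }

private:
    std::timed_mutex m_mutex;
};

// Demo: the calling thread holds the mutex while a second thread waits
// for it with limited patience, then gives up instead of freezing.
bool secondThreadGivesUp()
{
    SuspectingMutex mutex;
    mutex.lock();

    bool acquired = true;
    std::thread waiter([&] {
        acquired = mutex.lockOrSuspect(std::chrono::milliseconds(50));
        if (acquired)
            mutex.unlock();
    });
    waiter.join();  // returns once the waiter has timed out

    mutex.unlock();
    return !acquired;
}
```

Even with such a wrapper the open question from the paragraph above remains: once the wait times out, the caller still has to decide what to do, and simply proceeding without the lock would be unsafe.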
> Possibly, with help from the experts, I might be able to create a
> temporary 'patch' ...
> I have not created a bug report for this, since I could find no way
> to reproduce the problem. Depending on your reaction(s) I will
> create the bug report.
As Michael says - if we get a trace [ of all threads! ] we should have
something to go on to fix this and (often) we can fix the problem easily
enough - often as a side-benefit introducing another threading hazard
elsewhere ;->
> *I ran version 4.1.4 and 4.2.0 concurrently, opened a Visio object in
> one of the two and both froze. Killing the one with the Visio object,
> made the other accessible again.
Interesting; well if you can reproduce this that's most interesting of
course.
Thanks for persisting! Looking forward to the trace,
ATB,
Michael.
--
michael.meeks at collabora.com <><, Pseudo Engineer, itinerant idiot