[Libreoffice-bugs] [Bug 134641] New: binaryurp bridge termination sporadically causes DisposedException in a different bridge
bugzilla-daemon at bugs.documentfoundation.org
Wed Jul 8 09:07:58 UTC 2020
https://bugs.documentfoundation.org/show_bug.cgi?id=134641
Bug ID: 134641
Summary: binaryurp bridge termination sporadically causes
DisposedException in a different bridge
Product: LibreOffice
Version: Inherited From OOo
Hardware: All
OS: All
Status: UNCONFIRMED
Severity: normal
Priority: medium
Component: sdk
Assignee: libreoffice-bugs at lists.freedesktop.org
Reporter: marc-oliver.straub at advantest.com
Termination of a binaryurp bridge (e.g. because the remote process crashes) can
cause a DisposedException in a different bridge. The expectation is that bridges
are not affected by the termination of other bridges.
We had 3 processes communicating with each other using binaryurp bridges:
Process A <-> process B
Process B <-> process C
Process A and process C don't talk to each other.
Process A requests process B to execute a method. As part of this method,
process B needs to call process C:
A: call doSomethingInProcessB(), waiting for result
B: execute doSomethingInProcessB(), will now call doSomethingInProcessC()
C: idling
Process A is now terminated (due to one of its threads crashing, a kill, ...).
Process B notices that the bridge to process A has terminated and calls
ThreadPool::dispose(nDisposeId). ThreadPool::dispose(..) walks through all
JobQueues, calling JobQueue::dispose(nDisposeId).
Since the doSomethingInProcessB()-call is still being processed, the associated
JobQueue contains the nDisposeId as topmost entry in the callstack.
JobQueue::dispose(..) finds the disposeId and sets it to 0. It then signals
m_cndWait so that the bridge can terminate (jobqueue.cxx:143).
Concurrently to this, the worker thread currently working on
doSomethingInProcessB() wants to call doSomethingInProcessC(). The IPC is sent
out and JobQueue::enter(..) is called to wait for the result.
JobQueue::enter(..) puts a different disposeId onto the callstack (since the
call uses a different bridge) and should block on m_cndWait.wait() to wait for
the result (jobqueue.cxx:73).
But m_cndWait has been signalled by JobQueue::dispose(), so JobQueue::enter(..)
doesn't block - but m_lstJobs is still empty (jobqueue.cxx:98). It resets the
m_cndWait and returns a nullptr, which is converted into a DisposedException by
Bridge::makeCall() (bridge.cxx:610) - even though the bridge to process C is
completely intact at this point in time.
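The sequence above can be replayed in a small single-threaded model. This is only a sketch under stated assumptions: StaleSignalModel, replyInFlight and replayScenario are hypothetical names, the condition is modelled as a flag that stays set until it is reset (as the description above implies), and "blocking" is modelled as the pending reply from process C arriving. It is not the cppu code.

```cpp
#include <cassert>
#include <deque>
#include <optional>
#include <vector>

// Hypothetical single-threaded model of the race; not the cppu implementation.
struct StaleSignalModel {
    bool signalled = false;            // stands in for m_cndWait (stays set until reset)
    std::deque<int> jobs;              // stands in for m_lstJobs (here: reply values)
    std::vector<int> callstack;        // stands in for m_lstCallstack; 0 = disposed
    std::optional<int> replyInFlight;  // reply from process C that has not arrived yet

    // Models JobQueue::dispose(): zero the matching id and signal the condition.
    void dispose(int disposeId) {
        for (int& id : callstack) {
            if (id == disposeId) {
                id = 0;
                signalled = true;
            }
        }
    }

    // Models m_cndWait.wait(): returns at once if already signalled,
    // otherwise "blocks" until the in-flight reply has been queued.
    void wait() {
        if (signalled) return;
        assert(replyInFlight.has_value());  // the model cannot block forever
        jobs.push_back(*replyInFlight);
        replyInFlight.reset();
        signalled = true;
    }

    // Models the nested JobQueue::enter() as described in the report:
    // wait once, reset, and return whatever is (or is not) in the queue.
    std::optional<int> enter(int disposeId) {
        callstack.push_back(disposeId);
        wait();
        signalled = false;
        std::optional<int> reply;
        if (!jobs.empty()) {
            reply = jobs.front();
            jobs.pop_front();
        }
        callstack.pop_back();
        return reply;  // std::nullopt plays the role of the nullptr result
    }
};

// Replays the call from B to C, optionally disposing the bridge to A first.
inline std::optional<int> replayScenario(bool bridgeToADies) {
    const int bridgeToA = 1;
    const int bridgeToC = 2;
    StaleSignalModel q;
    q.callstack.push_back(bridgeToA);  // doSomethingInProcessB() in progress
    q.replyInFlight = 42;              // C will answer, but has not yet
    if (bridgeToADies) q.dispose(bridgeToA);
    return q.enter(bridgeToC);
}
```

In this model, replayScenario(false) yields the reply, while replayScenario(true) yields no value even though the reply from C is still on its way, which corresponds to the spurious DisposedException.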
I'd suggest the following fixes:
* JobQueue::enter() should check for job == nullptr after resetting m_cndWait
in jobqueue.cxx:98. If so, it should continue waiting instead of returning
nullptr. This will avoid the DisposedException, and the call to
doSomethingInProcessC() will work correctly.
* JobQueue::enter() should check whether the topmost m_lstCallstack entry is 0
and m_lstJobs is empty after processing a request (jobqueue.cxx:109). This will
ensure that the bridge terminates correctly once doSomethingInProcessB() has
finished.
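A rough single-threaded model of the two suggested changes, not a patch against jobqueue.cxx. RetryingQueueModel, replyInFlight, outerCallDisposed and replayWithFixes are hypothetical names; the condition is assumed to behave like a flag that stays set until reset, and "blocking" is modelled as the pending reply from process C arriving.

```cpp
#include <cassert>
#include <deque>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical single-threaded model of the suggested behaviour.
struct RetryingQueueModel {
    bool signalled = false;            // stands in for m_cndWait
    std::deque<int> jobs;              // stands in for m_lstJobs (here: reply values)
    std::vector<int> callstack;        // stands in for m_lstCallstack; 0 = disposed
    std::optional<int> replyInFlight;  // reply from process C that has not arrived yet

    // Models JobQueue::dispose(): zero the matching id and signal the condition.
    void dispose(int disposeId) {
        for (int& id : callstack) {
            if (id == disposeId) {
                id = 0;
                signalled = true;
            }
        }
    }

    // Models m_cndWait.wait(): returns at once if already signalled,
    // otherwise "blocks" until the in-flight reply has been queued.
    void wait() {
        if (signalled) return;
        assert(replyInFlight.has_value());  // the model cannot block forever
        jobs.push_back(*replyInFlight);
        replyInFlight.reset();
        signalled = true;
    }

    // First suggestion: a wake-up without a job is not treated as a result
    // unless this frame's own dispose id has been zeroed.
    std::optional<int> enter(int disposeId) {
        callstack.push_back(disposeId);
        std::optional<int> reply;
        for (;;) {
            wait();
            signalled = false;
            if (!jobs.empty()) {
                reply = jobs.front();
                jobs.pop_front();
                break;
            }
            if (callstack.back() == 0) break;  // this frame really was disposed
            // Otherwise the signal belonged to another frame: keep waiting.
        }
        callstack.pop_back();
        return reply;
    }

    // Second suggestion: after a request has been processed, notice that the
    // frame now on top was disposed and that nothing is left to do.
    bool outerCallDisposed() const {
        return !callstack.empty() && callstack.back() == 0 && jobs.empty();
    }
};

// Replays the report's sequence; returns the nested reply and whether the
// outer call is afterwards recognised as disposed.
inline std::pair<std::optional<int>, bool> replayWithFixes() {
    const int bridgeToA = 1;
    const int bridgeToC = 2;
    RetryingQueueModel q;
    q.callstack.push_back(bridgeToA);  // doSomethingInProcessB() in progress
    q.replyInFlight = 42;              // C will answer, but has not yet
    q.dispose(bridgeToA);              // process A terminated
    std::optional<int> reply = q.enter(bridgeToC);
    return {reply, q.outerCallDisposed()};
}
```

In this model the nested call receives its reply despite the earlier signal, and the disposed outer frame is still detected afterwards, so the dispose notification is not lost when the condition is reset.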
--
You are receiving this mail because:
You are the assignee for the bug.