[Libreoffice-bugs] [Bug 134641] New: binaryurp bridge termination sporadically causes DisposedException in a different bridge

bugzilla-daemon at bugs.documentfoundation.org bugzilla-daemon at bugs.documentfoundation.org
Wed Jul 8 09:07:58 UTC 2020


https://bugs.documentfoundation.org/show_bug.cgi?id=134641

            Bug ID: 134641
           Summary: binaryurp bridge termination sporadically causes
                    DisposedException in a different bridge
           Product: LibreOffice
           Version: Inherited From OOo
          Hardware: All
                OS: All
            Status: UNCONFIRMED
          Severity: normal
          Priority: medium
         Component: sdk
          Assignee: libreoffice-bugs at lists.freedesktop.org
          Reporter: marc-oliver.straub at advantest.com

Termination of a binaryurp bridge (e.g. because the remote process crashes) can
cause a DisposedException in a different bridge. The expectation is that a bridge
is not affected by the termination of other bridges.

We had 3 processes communicating with each other using binaryurp bridges:
Process A <-> process B
Process B <-> process C
Process A and process C don't talk to each other.

Process A requests process B to execute a method. As part of this method,
process B needs to call process C:

A: call doSomethingInProcessB(), waiting for result
B: execute doSomethingInProcessB(), will now call doSomethingInProcessC()
C: idling

Process A is now terminated (due to one of its threads crashing, a kill, ...).
Process B notices that the bridge to process A has terminated and calls
ThreadPool::dispose(nDisposeId). ThreadPool::dispose(..) walks through all
JobQueues, calling JobQueue::dispose(nDisposeId).

Since the doSomethingInProcessB() call is still being processed, the associated
JobQueue contains nDisposeId as the topmost entry in its callstack.
JobQueue::dispose(..) finds the disposeId and sets it to 0. It then signals
m_cndWait so that the bridge can terminate (jobqueue.cxx:143).

Concurrently, the worker thread currently working on
doSomethingInProcessB() wants to call doSomethingInProcessC(). The IPC is sent
out and JobQueue::enter(..) is called to wait for the result.
JobQueue::enter(..) puts a different disposeId onto the callstack (since the
call uses a different bridge) and should block on m_cndWait.wait() to wait for
the result (jobqueue.cxx:73).

But m_cndWait has already been signalled by JobQueue::dispose(), so
JobQueue::enter(..) doesn't block although m_lstJobs is still empty
(jobqueue.cxx:98). It resets m_cndWait and returns a nullptr, which
Bridge::makeCall() converts into a DisposedException (bridge.cxx:610), even
though the bridge to process C is completely intact at this point in time.
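
The interleaving described above can be replayed deterministically in a small
model. This is only a sketch, not the cppu code: ModelQueue, Outcome, enterStep
and replayRace are hypothetical names, and the manual-reset condition behind
m_cndWait is reduced to a plain bool so that no real threads are needed.

```cpp
#include <cassert>
#include <deque>

// Hypothetical, simplified model of one JobQueue; not the real cppu class.
enum class Outcome { WouldBlock, GotReply, ReturnedNull };

struct ModelQueue {
    std::deque<const void*> callstack; // front = dispose id of the innermost call
    std::deque<int> jobs;              // pending replies (payload is irrelevant here)
    bool signalled = false;            // stands in for the manual-reset m_cndWait

    // As described in the report: zero every matching id and wake the waiter
    // if the innermost call belongs to the disposed bridge.
    void dispose(const void* id) {
        for (auto& entry : callstack)
            if (entry == id) entry = nullptr;
        if (!callstack.empty() && callstack.front() == nullptr)
            signalled = true;
    }

    // As described in the report: push the id, then "wait" once.
    Outcome enterStep(const void* id) {
        callstack.push_front(id);
        if (!signalled)
            return Outcome::WouldBlock;   // the real code would block here
        // Woken up. The innermost id is non-null, so the queue does not
        // consider this call disposed and tries to take a job.
        Outcome result = Outcome::ReturnedNull;
        if (!jobs.empty()) {
            jobs.pop_front();
            result = Outcome::GotReply;
        }
        if (jobs.empty())
            signalled = false;            // the condition is reset
        callstack.pop_front();
        return result;                    // ReturnedNull models the nullptr return
    }
};

// Replays the sequence from the report inside process B.
inline Outcome replayRace() {
    int bridgeA = 0, bridgeC = 0;         // their addresses serve as dispose ids
    ModelQueue q;
    q.callstack.push_front(&bridgeA);     // B is executing A's request
    q.dispose(&bridgeA);                  // A dies: id zeroed, condition signalled
    return q.enterStep(&bridgeC);         // B now wants to wait for C's reply
}
```

In this model replayRace() yields Outcome::ReturnedNull, which corresponds to
the nullptr that Bridge::makeCall() turns into a DisposedException.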

I'd suggest the following fixes:
* JobQueue::enter() should check for job == nullptr after resetting m_cndWait
in jobqueue.cxx:98. If so, it should continue waiting instead of returning
nullptr. This will avoid the DisposedException, and the call to
doSomethingInProcessC() will work correctly.

* JobQueue::enter() should check for m_lstCallstack == 0 and m_lstJob.empty()
after processing a request (jobqueue.cxx:109). This will ensure that the bridge
will correctly terminate once doSomethingInProcessB() has finished.
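
One possible reading of the two suggestions, again as a single-threaded model
rather than a patch against jobqueue.cxx. FixedQueue, beginWait, addReply, poll
and replayFixed are hypothetical names; the bool stands in for m_cndWait.

```cpp
#include <cassert>
#include <deque>

// Hypothetical model with both suggested changes applied; not the cppu code.
enum class Outcome { WouldBlock, GotReply, Disposed };

struct FixedQueue {
    std::deque<const void*> callstack; // front = dispose id of the innermost call
    std::deque<int> jobs;              // pending replies
    bool signalled = false;            // stands in for the manual-reset m_cndWait

    void beginWait(const void* id) { callstack.push_front(id); }
    void addReply(int reply) { jobs.push_back(reply); signalled = true; }

    void dispose(const void* id) {
        for (auto& entry : callstack)
            if (entry == id) entry = nullptr;
        if (!callstack.empty() && callstack.front() == nullptr)
            signalled = true;
    }

    // One wake-up attempt of the innermost waiting call.
    Outcome poll() {
        assert(!callstack.empty());
        if (!signalled)
            return Outcome::WouldBlock;
        if (callstack.front() == nullptr) {   // innermost call was disposed
            callstack.pop_front();
            return Outcome::Disposed;
        }
        if (jobs.empty()) {                   // suggestion 1: stale wake-up,
            signalled = false;                // reset and keep waiting
            return Outcome::WouldBlock;
        }
        jobs.pop_front();
        callstack.pop_front();
        // Suggestion 2: after handling a job, signal again if the enclosing
        // call's id was zeroed in the meantime, so its disposal is not lost.
        signalled = !jobs.empty()
                    || (!callstack.empty() && callstack.front() == nullptr);
        return Outcome::GotReply;
    }
};

struct Replay { Outcome afterStaleWake; Outcome afterReply; Outcome outer; };

inline Replay replayFixed() {
    int bridgeA = 0, bridgeC = 0;      // their addresses serve as dispose ids
    FixedQueue q;
    q.beginWait(&bridgeA);             // B is executing A's request
    q.dispose(&bridgeA);               // A dies while B is still busy
    q.beginWait(&bridgeC);             // B calls C and waits for the reply
    Replay r{};
    r.afterStaleWake = q.poll();       // stale signal is swallowed
    q.addReply(42);                    // C answers
    r.afterReply = q.poll();           // call to C completes normally
    r.outer = q.poll();                // disposal of A's bridge is now seen
    return r;
}
```

In this model the call to process C first keeps waiting, then receives its
reply, and only afterwards does the enclosing call observe that the bridge to
process A was disposed.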
