Re: GSoC 25: BASIC IDE - Object Browser – From Crash-Fest to Solid Rock [Week 11]

Devansh Varshney varshney.devansh614 at gmail.com
Sat Aug 16 11:05:57 UTC 2025


Hi everyone,

This week marks a critical turning point in our Object Browser
investigation. We've completed a comprehensive four-phase analysis that has
fundamentally changed our understanding of the crash patterns. While we've
made significant progress in stabilizing the IDE, we've also uncovered a new
crash scenario that requires immediate attention.

*Gerrit Patches*

*Patch 28 (Week 10): Thread-Safe Initialization System*
    https://gerrit.libreoffice.org/c/core/+/186822/28

*The Problem: Slaying the Initialization Hydra, a Multi-Headed Beast*

Our previous initialization system was fundamentally flawed. Multiple threads
could trigger initialization simultaneously, creating a race condition that
manifested as:

info:basctl:96852:430736002:basctl/source/basicide/idedataprovider.cxx:60: UnoHierarchyInitThread starting
info:basctl:96852:430736003:basctl/source/basicide/idedataprovider.cxx:60: UnoHierarchyInitThread starting
info:basctl:96852:430736014:basctl/source/basicide/idedataprovider.cxx:60: UnoHierarchyInitThread starting

*This chaotic initialization caused:*
- Severe performance degradation (6+ second startup times)
- Resource conflicts between competing threads
- IDE freezing during startup

*The Solution: A Coordinated State Machine*

We implemented a thread-safe initialization system using modern C++
concurrency primitives:

// New Architecture: Double-Checked Locking Pattern
enum class InitState { NotInitialized, Initializing, Initialized, Failed, Disposed };

void ObjectBrowser::Initialize()
{
    // Fast lock-free check first
    InitState currentState = m_eInitState.load();
    if (currentState == InitState::Initialized || currentState == InitState::Initializing)
        return;

    // Acquire lock for definitive check
    std::unique_lock<std::mutex> lock(m_InitMutex);
    currentState = m_eInitState.load();
    if (currentState == InitState::Initialized || currentState == InitState::Initializing)
        return;

    // Set state while holding lock, then release for long operation
    m_eInitState.store(InitState::Initializing);
    lock.unlock();

    // ... safe initialization ...
}

void IdeDataProvider::AsyncInitialize(...)
{
    // Atomic compare-and-swap ensures single initialization
    bool expected = false;
    if (!m_bInitializationInProgress.compare_exchange_strong(expected, true))
        return; // Only first thread succeeds
}
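
To make the guard above easier to try out in isolation, here is a minimal,
self-contained sketch of the same double-checked pattern. The Browser class,
the run counter and countInitRuns() are hypothetical stand-ins for
illustration and are not taken from the patch; only the shape of the state
check mirrors the snippet above.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

enum class InitState { NotInitialized, Initializing, Initialized, Failed, Disposed };

// Hypothetical stand-in for ObjectBrowser; only the init guard is modelled.
class Browser
{
public:
    void Initialize()
    {
        // Fast path: skip the lock once initialization has started or finished.
        InitState s = m_eInitState.load();
        if (s == InitState::Initialized || s == InitState::Initializing)
            return;

        {
            // Slow path: re-check under the lock so only one caller proceeds.
            std::lock_guard<std::mutex> lock(m_InitMutex);
            s = m_eInitState.load();
            if (s == InitState::Initialized || s == InitState::Initializing)
                return;
            m_eInitState.store(InitState::Initializing);
        }

        ++m_nRuns; // placeholder for the long-running work, done without the lock
        m_eInitState.store(InitState::Initialized);
    }

    int runs() const { return m_nRuns.load(); }
    InitState state() const { return m_eInitState.load(); }

private:
    std::atomic<InitState> m_eInitState{ InitState::NotInitialized };
    std::atomic<int> m_nRuns{ 0 };
    std::mutex m_InitMutex;
};

// Calls Initialize() from several threads and reports how often the work ran.
int countInitRuns(int nThreads)
{
    Browser browser;
    std::vector<std::thread> threads;
    for (int i = 0; i < nThreads; ++i)
        threads.emplace_back([&browser] { browser.Initialize(); });
    for (auto& t : threads)
        t.join();
    assert(browser.state() == InitState::Initialized);
    return browser.runs();
}
```

Calling countInitRuns(8) from a test should report a single run, no matter
how the threads interleave, because the state is flipped to Initializing
while the mutex is still held.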

*The Result: Order from Chaos*

After Patch 28 - Clean, Sequential Initialization:
info:basctl:79942:495124713:basctl/source/basicide/idedataprovider.cxx:60: UnoHierarchyInitThread starting
info:basctl:79942:495124973:basctl/source/basicide/idedataprovider.cxx:141: UnoHierarchyInitThread completed in 1162 ms

Performance transformation:
- 80% reduction in initialization time (6+ seconds → ~1.2 seconds)
- Single, controlled initialization thread
- Eliminated resource conflicts and race conditions

-------------------------------------------------------------------------------

*Patch 29 (Week 10-11): Deadlock Fix in Data Provider Callback*
    https://gerrit.libreoffice.org/c/core/+/186822/29

*Thread 0x1f10d90f (Main Thread):*
basctl::ObjectBrowser::RefreshUI(bool)
basctl::IdeDataProvider::GetTopLevelNodes()
std::__1::lock_guard<std::__1::mutex>::lock_guard

*Thread 0x1f1147f9 (Background Thread):*
basctl::IdeDataProvider::UnoHierarchyInitThread::run()
basctl::ScriptDocument::getLibraryNames()
basic::SfxLibraryContainer::getElementNames()
basic::SfxLibraryContainer::enterMethod()
comphelper::SolarMutex::acquire()

    *Problem: Subtle deadlock in IdeDataProvider::AsyncInitialize*
    Solution: Improved thread synchronization with atomic compare-and-swap:
        void basctl::IdeDataProvider::AsyncInitialize(...)
        {
            m_pThreadController = pController;
            bool expected = false;

            // Atomic compare-and-swap ensures only one thread starts initialization
            if (!m_bInitializationInProgress.compare_exchange_strong(expected, true))
            {
                // If already completed, call callback immediately
                if (m_bInitialized)
                    Application::PostUserEvent(rFinishCallback);
                return;
            }
            // Create and start initialization thread
        }
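
The control flow of that guard can be sketched outside of LibreOffice as
follows. This is only an illustration under stated assumptions: Provider,
finish() and countCallbacks() are hypothetical names, the worker thread is
replaced by an explicit finish() call, and the callback is invoked directly
instead of being posted to an event loop.

```cpp
#include <atomic>
#include <cassert>
#include <functional>

// Hypothetical stand-in for IdeDataProvider; the worker thread is replaced
// by an explicit finish() call so the control flow is easy to follow.
class Provider
{
public:
    // Returns true if this call won the race and should start the worker.
    bool AsyncInitialize(const std::function<void()>& rFinishCallback)
    {
        bool expected = false;
        if (!m_bInProgress.compare_exchange_strong(expected, true))
        {
            // Someone else already started. If the work has finished,
            // the late caller is notified right away.
            if (m_bInitialized.load())
                rFinishCallback();
            return false;
        }
        m_aCallback = rFinishCallback;
        return true;
    }

    // Stands in for the end of the worker thread.
    void finish()
    {
        m_bInitialized.store(true);
        if (m_aCallback)
            m_aCallback();
    }

private:
    std::atomic<bool> m_bInProgress{ false };
    std::atomic<bool> m_bInitialized{ false };
    std::function<void()> m_aCallback;
};

// Returns how many callbacks ran for three calls: first, during, after.
int countCallbacks()
{
    Provider provider;
    int nCalls = 0;
    auto cb = [&nCalls] { ++nCalls; };

    bool bFirst = provider.AsyncInitialize(cb);  // wins, callback is stored
    bool bSecond = provider.AsyncInitialize(cb); // loses, work not finished yet
    provider.finish();                           // runs the stored callback
    bool bThird = provider.AsyncInitialize(cb);  // loses, work finished: notified
    assert(bFirst && !bSecond && !bThird);
    return nCalls;
}
```

One thing the sketch makes visible: a caller that arrives while the work is
still in progress gets no callback at all here. Whether that matters for the
real code depends on parts of the patch that are not shown in this mail.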


-------------------------------------------------------------------------------

*Patch 30 (Week 11): Enhanced Disposal Order*
    https://gerrit.libreoffice.org/c/core/+/186822/30

    *Problem: TaskPaneList registration failures and disposal order*
    Solution: Comprehensive disposal sequence:
        void ObjectBrowser::dispose()
        {
            // 1: Atomic Guard to prevent re-entry
            bool expected = false;
            if (!m_bDisposed.compare_exchange_strong(expected, true))
                return;

            // 2: Check parent hierarchy validity
            if (!GetParent() || !GetParent()->GetSystemWindow())
            {
                // Minimal cleanup if parent is gone
                DockingWindow::dispose();
                return;
            }

            // 3: Remove pending events and hide
            EnableInput(false);
            Hide();
            Application::RemoveMouseAndKeyEvents(this);
            Application::Reschedule(true);

            // 4: Update state machine
            m_eInitState.store(InitState::Disposed);
            m_InitCV.notify_all();

            // 5: Unregister from TaskPaneList BEFORE widget disposal
            if (GetParent() && GetParent()->GetSystemWindow())
            {
                TaskPaneList* pTaskPaneList
                    = GetParent()->GetSystemWindow()->GetTaskPaneList();
                if (pTaskPaneList)
                    pTaskPaneList->RemoveWindow(this);
            }

            // 6: Comprehensive cleanup of all resources
            // ... widget disposal, thread cancellation, etc. ...
        }
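
The two properties this sequence relies on, a re-entry guard and
"unregister before destroying widgets", can be checked in a small standalone
sketch. PaneList and Panel below are hypothetical stand-ins (not the VCL
classes), and each disposal step only records its name so the order can be
inspected.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for TaskPaneList: a list of raw window pointers.
class PaneList
{
public:
    void AddWindow(const void* p) { m_aWindows.push_back(p); }
    void RemoveWindow(const void* p)
    {
        m_aWindows.erase(std::remove(m_aWindows.begin(), m_aWindows.end(), p),
                         m_aWindows.end());
    }
    bool Contains(const void* p) const
    {
        return std::find(m_aWindows.begin(), m_aWindows.end(), p) != m_aWindows.end();
    }

private:
    std::vector<const void*> m_aWindows;
};

// Hypothetical docking window; each step only records its name.
class Panel
{
public:
    explicit Panel(PaneList& rList) : m_rList(rList) { m_rList.AddWindow(this); }

    void dispose()
    {
        bool expected = false;
        if (!m_bDisposed.compare_exchange_strong(expected, true))
            return; // re-entry guard: later calls are no-ops

        m_aSteps.push_back("hide");
        m_rList.RemoveWindow(this); // unregister while the object is still intact
        m_aSteps.push_back("unregister");
        m_aSteps.push_back("destroy-widgets");
    }

    const std::vector<std::string>& steps() const { return m_aSteps; }

private:
    PaneList& m_rList;
    std::atomic<bool> m_bDisposed{ false };
    std::vector<std::string> m_aSteps;
};

// Disposes twice and checks that the steps ran once, in the expected order.
bool demoDisposeOrder()
{
    PaneList list;
    Panel panel(list);
    assert(list.Contains(&panel));
    panel.dispose();
    panel.dispose(); // second call must not repeat any step
    return !list.Contains(&panel) && panel.steps().size() == 3
           && panel.steps()[1] == "unregister";
}
```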



-------------------------------------------------------------------------------

*The Four Investigation Phases: A Systematic Approach*

*I. Investigation Phase 1: Initial Disposal Fixes*

*The Problem:*
We began with a classic use-after-free crash in the macOS event system:

Thread 0 Crashed::  Dispatch queue: com.apple.main-thread
0   libsystem_kernel.dylib         0x18be51388 __pthread_kill + 8
1   libvclplug_osxlo.dylib         0x1208e44d4 *-[SalFrameView mouseDown:] + 76*

*Root Cause Analysis:*
The crash occurred because mouse events were being delivered to a disposed
window object. macOS Cocoa maintains references to view objects even after
logical disposal, creating a race condition between VCL's disposal and
Cocoa's event delivery.

*Initial Fix Attempts:*
- Added atomic disposal flag (m_bDisposed) to prevent re-entry
- Added EnableInput(false) and Hide() calls
- Added Application::RemoveMouseAndKeyEvents(this)
- Added event handler disconnection

*Result:* Crash persisted with the same signature, indicating deeper issues.

*Diagram:* Initial Disposal Problem

    VCL Disposal           Cocoa Event System
    -----------           ----------------
    | dispose() |  ---->  | Event Queue |
    -----------           ----------------
          |                     |
          v                     v
    | Object freed |       | Events still |
    | (C++)        |       | referencing  |
                           | disposed obj |
                           ----------------

-------------------------------------------------------------------------------
*II. Investigation Phase 2: Deep Analysis and Pattern Recognition*

*The Breakthrough:*
Research into LibreOffice's VCL architecture revealed critical patterns:

1. Frame Tracking System: *AquaSalFrame* objects are registered when created
   and deregistered when destroyed.

2. Frame Validity Checking: The system uses AquaSalFrame::isAlive() checks
   throughout the codebase.

3. Standard Disposal Pattern: Other VCL components follow a specific
   disposal sequence.
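
As a way of thinking about points 1 and 2, here is a minimal sketch of a
liveness registry that is consulted before an event is delivered. It is an
analogy only: Frame, FrameRegistry and deliverClick() are hypothetical names
and do not reproduce how AquaSalFrame::isAlive() is implemented in VCL.

```cpp
#include <cassert>
#include <unordered_set>

// Hypothetical frame object; it just counts delivered clicks.
struct Frame
{
    int nClicks = 0;
};

// Hypothetical registry of frames that are currently valid event targets.
class FrameRegistry
{
public:
    void registerFrame(const Frame* p) { m_aLive.insert(p); }
    void deregisterFrame(const Frame* p) { m_aLive.erase(p); }
    bool isAlive(const Frame* p) const { return m_aLive.count(p) != 0; }

private:
    std::unordered_set<const Frame*> m_aLive;
};

// Delivers a click only if the target is still registered.
bool deliverClick(const FrameRegistry& rRegistry, Frame* pFrame)
{
    if (!rRegistry.isAlive(pFrame))
        return false; // not registered any more: drop the event
    ++pFrame->nClicks;
    return true;
}

// Registers a frame, clicks it, deregisters it, and clicks again.
bool demoRegistry()
{
    FrameRegistry registry;
    Frame frame; // kept alive for the whole demo so the sketch stays well-defined
    registry.registerFrame(&frame);
    bool bBefore = deliverClick(registry, &frame);
    registry.deregisterFrame(&frame); // in real code: before the object goes away
    bool bAfter = deliverClick(registry, &frame);
    assert(frame.nClicks == 1);
    return bBefore && !bAfter;
}
```

The check only helps if deregistration really happens on every disposal
path, which is exactly what goes wrong in the scenario described next.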

*Key Discovery:* Parent-Child Window Relationship Issue

The real problem wasn't just in the ObjectBrowser, but in the entire window
hierarchy:

1. Basic Macros dialog (parent) opens
2. Basic Macros dialog opens IDE (child)
3. IDE creates ObjectBrowser (grandchild)
4. User closes IDE - ObjectBrowser disposed
5. Critical Issue: ObjectBrowser not properly removed from VCL frame
   tracking
6. User closes Basic Macros dialog (parent)
7. CRASH: Basic Macros dialog tries to access dangling frame references

*Diagram: Window Hierarchy Problem*

    [Basic Macros Dialog] (Parent)
           |
           v
    [IDE Window] (Child)
           |
           v
    [ObjectBrowser] (Grandchild) <-- Disposed but not deregistered
           |
           v
    [VCL Frame Tracking] <-- Still has reference to disposed ObjectBrowser

+---------------------------+       +---------------------------+
|      Main UI Thread       |       |    Background Thread      |
|---------------------------|       |---------------------------|
| 1. Acquires SolarMutex    |       | 3. Needs to get macros    |
|    (for UI/doc access)    |       |                           |
|                           |       | 4. Tries to acquire       |
| 2. Waits for Background   | <---- |    SolarMutex (BLOCKED!)  |
|    Thread to finish.      | ----> |                           |
|      (BLOCKED!)           |       | 5. Waits for Main Thread  |
|                           |       |    to release lock.       |
+---------------------------+       +---------------------------+

*The Fix (Patch 29):* We re-architected the data loading. The background
thread was simplified to *only* perform the thread-safe task of building the
UNO API cache. The non-thread-safe work of querying Basic macros was moved to
the OnDataProviderInitialized handler, which executes safely on the main
thread. This successfully resolved the deadlock.

// -- PATCH 29: DEADLOCK FIX --
IMPL_LINK(ObjectBrowser, OnDataProviderInitialized, void*, /* p */, void)
{
    // This now runs safely on the Main Thread
    // 1. Create a list of top-level nodes...
    // 2. Add UNO APIs node (data is ready from background)...
    // 3. Add Application/Document Macros (safe to query now)...
    // 4. Set complete data and build indexes ONCE...
    // 5. Refresh UI...
}
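
The hand-off described above can be sketched without any LibreOffice code.
In this illustration EventQueue is a hypothetical stand-in for the
main-thread event loop (the role Application::PostUserEvent plays in the
patch), and loadTopLevelNodes() and the node names are made up for the
example.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for the main-thread event loop.
class EventQueue
{
public:
    void post(std::function<void()> fn)
    {
        {
            std::lock_guard<std::mutex> lock(m_aMutex);
            m_aEvents.push(std::move(fn));
        }
        m_aCond.notify_one();
    }

    // Blocks until one event is available, then runs it on the calling thread.
    void runOne()
    {
        std::function<void()> fn;
        {
            std::unique_lock<std::mutex> lock(m_aMutex);
            m_aCond.wait(lock, [this] { return !m_aEvents.empty(); });
            fn = std::move(m_aEvents.front());
            m_aEvents.pop();
        }
        fn(); // run outside the queue lock
    }

private:
    std::mutex m_aMutex;
    std::condition_variable m_aCond;
    std::queue<std::function<void()>> m_aEvents;
};

// Builds the node list: cache on the worker, everything else on the caller.
std::vector<std::string> loadTopLevelNodes()
{
    EventQueue mainLoop;
    std::vector<std::string> nodes;
    const std::thread::id mainId = std::this_thread::get_id();

    std::thread worker([&mainLoop, &nodes, mainId] {
        // Background: only work that needs no main-thread lock.
        std::string unoCache = "UNO APIs";
        mainLoop.post([&nodes, unoCache, mainId] {
            // Main thread: safe to query macros and touch the UI here.
            assert(std::this_thread::get_id() == mainId);
            nodes.push_back(unoCache);
            nodes.push_back("Application Macros");
        });
    });

    mainLoop.runOne(); // the main thread processes events instead of holding a lock
    worker.join();
    return nodes;
}
```

The worker never needs a lock that the main thread holds while waiting, so
the circular wait from the diagram cannot form in this arrangement.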

-------------------------------------------------------------------------------
*III. Investigation Phase 3: Mouse Drag Timer Crash*

*New Crash Pattern:*
The problem evolved from simple use-after-free to memory corruption:

Thread 0 Crashed::
...
6   libvclplug_osxlo.dylib    *-[SalFrameView mouseDraggedWithTimer:] + 100*
7   Foundation                __NSFireTimer + 104
...

*Exception Type:* EXC_BAD_ACCESS (SIGSEGV)
*Exception Codes:* KERN_INVALID_ADDRESS at 0x00001cb5987c0bde

*Root Cause:* Incomplete cleanup of macOS-specific resources, particularly
timer-based mouse drag operations.

*Evidence from Code:*
The SalFrameView creates NSTimers for mouse drag events:

-(void)mouseDragged: (NSEvent*)pEvent
{
    [self clearPendingMouseDraggedEvent];
    mpPendingMouseDraggedEvent = [pEvent retain];
    if ( !mpMouseDraggedTimer )
    {
        mpMouseDraggedTimer = [NSTimer scheduledTimerWithTimeInterval:0.025f
            target:self selector:@selector(mouseDraggedWithTimer:)
            userInfo:nil repeats:YES];
    }
}

These timers can fire after disposal, accessing freed memory.

*Diagram: Timer-Based Crash (My Understanding)*

    Mouse Drag Event
           |
           v
    [NSTimer Created] --> [SalFrameView]
           |                   |
           v                   v
    [0.025s Interval]    [ObjectBrowser Disposed]
           |                   |
           v                   v
    [Timer Fires] -------> [CRASH: Accessing freed memory]
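
A C++ analogy for the failure mode in this diagram, and for one way to make
a late timer tick harmless, is a callback that holds only a weak reference
to its target. This is a sketch under assumptions: View and makeDragTick()
are hypothetical, and the actual fix on the Cocoa side would more likely
involve invalidating the NSTimer (-[NSTimer invalidate]) during cleanup.

```cpp
#include <cassert>
#include <functional>
#include <memory>

// Hypothetical view object; it just counts processed drag ticks.
struct View
{
    int nDrags = 0;
};

// Builds a timer callback that holds only a weak reference to the view.
std::function<bool()> makeDragTick(const std::shared_ptr<View>& pView)
{
    std::weak_ptr<View> wView = pView;
    return [wView] {
        if (auto p = wView.lock())
        {
            ++p->nDrags; // target still alive: process the tick
            return true;
        }
        return false; // target already gone: do nothing
    };
}

// Fires the tick once while the view exists and once after it is released.
bool demoTimerGuard()
{
    auto pView = std::make_shared<View>();
    auto tick = makeDragTick(pView);
    bool bWhileAlive = tick();
    assert(pView->nDrags == 1);
    pView.reset(); // "disposal": the last strong reference goes away
    bool bAfterDispose = tick();
    return bWhileAlive && !bAfterDispose;
}
```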



-------------------------------------------------------------------------------
*IV. Investigation Phase 4: Delayed Disposal Experiment*

*The Hypothesis:*
Perhaps the issue was timing-related. If we delayed disposal, maybe the
Cocoa event system would have time to clean up.

*Implementation:*
We implemented a delayed disposal pattern:

void ObjectBrowser::dispose()
{
    // Immediate cleanup
    Hide();
    Application::PostUserEvent(LINK(this, ObjectBrowser, DelayedDisposeHandler));
}

IMPL_LINK(ObjectBrowser, DelayedDisposeHandler, void*, /*p*/, void)
{
    // Actual disposal happens later
    performActualDisposal();
}

*Expected Outcome:*
- Prevent mouse event crashes by allowing proper cleanup
- Maintain VCL frame tracking
- Allow clean IDE closure

*Actual Result: New Problems Introduced*
1. Mouse Event Crash Fixed: Original crashes no longer occurred
2. New Problem: UI Artifacts and Freezing
   - ObjectBrowser disappeared but left visual artifact
   - IDE became unresponsive, showing <NO_MODULE>
   - Document Recovery UI appeared
   - IDE reloaded instead of closing cleanly

*Root Cause:* VCL Synchronous vs. Asynchronous Conflict

VCL expects immediate disposal:
    Parent Layout -> Child Disposal -> Layout Update

Delayed disposal broke this contract:
    Parent Layout -> Child Scheduled for Disposal -> Layout Updates
    (but child still exists)

*Diagram: Delayed Disposal Conflict*

    VCL Expectation:
    [Layout] -> [Dispose Child] -> [Update Layout]

    Delayed Disposal Reality:
    [Layout] -> [Schedule Dispose] -> [Update Layout]
                                    |
                                    v
                            [Child still exists]
                                    |
                                    v
                            [<NO_MODULE> displayed]
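
The ordering problem can be reproduced in a few lines. The sketch below is a
toy model, not VCL: Parent, its child ids and childrenSeenByLayout() are
hypothetical, and a plain vector of callbacks stands in for the queue of
posted user events.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical parent window that lays out whatever children are attached.
struct Parent
{
    std::vector<int> aChildren; // ids of children still attached

    // Number of children the layout pass would have to place.
    int layout() const { return static_cast<int>(aChildren.size()); }

    void removeChild(int id)
    {
        aChildren.erase(std::remove(aChildren.begin(), aChildren.end(), id),
                        aChildren.end());
    }
};

// Returns the number of children seen by the layout pass that follows dispose.
int childrenSeenByLayout(bool bDeferDisposal)
{
    Parent parent;
    parent.aChildren = { 1 };
    std::vector<std::function<void()>> aPending; // stands in for posted events

    if (bDeferDisposal)
        aPending.push_back([&parent] { parent.removeChild(1); });
    else
        parent.removeChild(1);

    int nSeen = parent.layout(); // layout runs before queued events are processed
    for (auto& fn : aPending)
        fn();
    assert(parent.aChildren.empty()); // the child is gone eventually in both cases
    return nSeen;
}
```

With immediate disposal the layout pass sees no child; with deferred
disposal it still sees one, which matches the stale state we observed.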


-------------------------------------------------------------------------------

*New Discovery: Ghost Parent UI Crash*

*The Scenario:*
We identified two distinct workflows with different outcomes:

*Scenario A (Works Correctly):*
- Context: A document (Calc) is open
- Action: Tools > Macros > Edit
- Result: BASIC Macros dialog closes, IDE opens cleanly
- When closing IDE, everything shuts down properly

*Scenario B (Crashes):*
- Context: No document is open (global soffice context)
- Action: Tools > Macros > Edit
- Result: IDE opens, but BASIC Macros dialog remains open in background
- Clicking on this "ghost parent" window causes the mouseDown crash

*Diagram: Ghost Parent Problem*

    With Document Context:         Without Document Context:
    [Calc Document]               [No Document]
           |                             |
           v                             v
    [BASIC Macros]                 [BASIC Macros] <-- Ghost Parent
           |                             |
           v                             v
    [IDE Opens]                    [IDE Opens]
           |                             |
           v                             v
    [Parent Closes]                [Parent Stays]
                                          |
                                          v
                                  [Click on Parent]
                                          |
                                          v
                                       [CRASH]

*Root Cause Analysis:*
The issue is that the BASIC Macros dialog is being put into a "zombie" state
when no document is open. It's not properly closed when the IDE opens,
leaving it in the window hierarchy with an inconsistent state.

*// In sfx2/source/appl/appserv.cxx*

#if HAVE_FEATURE_SCRIPTING
        case SID_BASICIDE_APPEAR:
        { }

and
        case SID_MACROORGANIZER:
        {
            SAL_INFO("sfx.appl", "handling SID_MACROORGANIZER");
            const SfxItemSet* pArgs = rReq.GetArgs();
            sal_Int16 nTabId = 0;
            Reference <XFrame> xFrame;
            if (pArgs)
            {
                if (const SfxUInt16Item* pItem
                    = pArgs->GetItemIfSet(SID_MACROORGANIZER, false))
                    nTabId = pItem->GetValue();
                if (const SfxBoolItem* pItem = rReq.GetArg<SfxBoolItem>(FN_PARAM_2))
                {
                    // if set then default to showing the macros of the document
                    // associated with this frame
                    if (pItem->GetValue())
                        xFrame = GetRequestFrame(rReq);
                }
            }
            SfxApplication::MacroOrganizer(rReq.GetFrameWeld(), xFrame, nTabId);
            rReq.Done();
        }
        break;


*// In vcl/osx/salframeview.mm*
-(void)mouseDown: (NSEvent*)pEvent
{
    if ( mpMouseEventListener != nil &&
        [mpMouseEventListener respondsToSelector: @selector(mouseDown:)])
    {
        [mpMouseEventListener mouseDown: pEvent];
    }

    s_nLastButton = MOUSE_LEFT;
    [self sendMouseEventToFrame:pEvent button:MOUSE_LEFT
                      eventtype:SalEvent::MouseButtonDown];
}


*Current Status & Next Steps*

*Successfully Resolved:*
1. IDE Shutdown Crashes - Eliminated through proper disposal order (Patch 30)
2. Multiple Initialization Threads - Solved with performance gains (Patch 28)
3. Deadlock in Data Provider - Fixed with proper callback handling (Patch 29)

*Critical Issues Remaining:*
1. Ghost Parent UI Crash (NEW CRITICAL PRIORITY)
   - Occurs when opening IDE without a document context
   - Clicking on the lingering BASIC Macros dialog causes crash
   - Requires immediate investigation and fix

2. Mouse Event Crashes Post-Disposal (HIGH PRIORITY)

3. History Navigation Failures (MEDIUM PRIORITY)
   - Back/forward buttons become disabled after first use
   - History system doesn't preserve full UI state

4. Break this large single patch down into multiple chronological patches.

5. In the right pane, double-clicking an element should open the API page
   window.

6. Add a small delay to the search, improve the search results, and include
   macro results too.


-------------------------------------------------------------------------------

*Next Steps for Week 12:*
Priority 1: Ghost Parent UI Investigation
- Determine why BASIC Macros dialog doesn't close in no-document context
- Implement proper dialog closure when IDE opens from global context
- Test both document and no-document scenarios

Priority 2: Enhanced Event Handler Cleanup
- Review all event connections in ObjectBrowser
- Ensure complete disconnection in disposal method
- Add frame validity checks to SalFrameView mouse handlers

Priority 3: Patch Breakdown Strategy
- Break large patches into smaller, focused changes
- Enable incremental review and testing by community


-------------------------------------------------------------------------------

*Technical Evolution: Lessons Learned*

1. Disposal Order is Critical
   The sequence of operations in dispose() matters immensely.
   TaskPaneList removal must happen early.

2. Thread Safety Requires Multiple Layers
   Single boolean flags are insufficient.
   Need atomic operations, mutexes, and condition variables.

3. macOS Event System is Complex
   Timer-based events can outlive object disposal.
   Need comprehensive cleanup of all native resources.

4. Context Matters
   The same action can have different results depending on
   application state (document vs. no-document context).


-------------------------------------------------------------------------------

*Conclusion*

Our four-phase investigation has provided deep insights into the complex
interactions between VCL and macOS. While we've made significant progress in
stabilizing the Object Browser, the discovery of the ghost parent UI crash
shows that there are still fundamental issues to resolve.

The architectural improvements in thread safety and disposal management
provide a strong foundation, but we must now address the window lifecycle
management issues that cause the ghost parent problem.

Thanks to mentors for their invaluable guidance throughout this complex
investigation.

The key point is that, since around patch 26, the crash we are seeing is no
longer in the BASIC IDE itself. It now happens in the IDE's parent, the
BASIC Macros dialog, when I open the IDE from the main soffice
(LibreOfficeDev) UI.

Here is a video that shows exactly what is going on:

https://www.youtube.com/watch?v=gTwWkYQKLxk

I had been wondering about this ghost parent window, because it did not
stay open after the IDE opened when I started working on this. But I am now
confident that the IDE with the new Object Browser closes cleanly, so we can
break this patch up chronologically and let others in the community test
it :) After that, the remaining UI/UX polish and the major remaining part,
code suggestions, can be done quickly.



*Previous Updates:*

Week 1:
https://lists.freedesktop.org/archives/libreoffice/2025-May/093264.html

Weeks 2-3:
https://lists.freedesktop.org/archives/libreoffice/2025-June/093362.html

Week 4:
https://lists.freedesktop.org/archives/libreoffice/2025-June/093392.html

Week 5:
https://lists.freedesktop.org/archives/libreoffice/2025-June/093443.html

Week 6:
https://lists.freedesktop.org/archives/libreoffice/2025-July/093493.html

Week 7:
https://lists.freedesktop.org/archives/libreoffice/2025-July/093527.html

Week 8:
https://lists.freedesktop.org/archives/libreoffice/2025-July/093572.html

Week 9-10:
https://lists.freedesktop.org/archives/libreoffice/2025-August/093662.html

If there is any mistake, or something I missed in my understanding, do let
me know :)


-- 
*Regards,*
*Devansh*