GSoC25: Week 2 and Week 3 update BASIC IDE Auto completion
Devansh Varshney
varshney.devansh614 at gmail.com
Sat Jun 7 15:52:34 UTC 2025
Hi,
So, my last status update was the mail with the subject "GSoC25: BASIC IDE
Phase 1/Part 1 - SymbolInfo + Cache Interface (Regarding new files for
cache)", sent on May 14, 2025.
https://gerrit.libreoffice.org/c/core/+/185362
That was a rushed step, and after the first meeting, I had an "aha" moment.
I realised I should not have jumped directly into writing code and defining
structures and enums. Instead, I should have begun by figuring out what data
is available, where it resides, and how to access it.
*Core Mentor Directive from Meeting 1:*
Figure out how to get the data first. Create a table that maps "what is
there" to "where it's from".
They want a comprehensive inventory of all the elements needed for
auto-completion (library names, module names, procedure names, parameters,
variable types, UDTs, UNO objects, methods, properties, etc.), where this
information currently resides in the LibreOffice codebase or related
resources, and how it can be programmatically accessed.
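To make the shape of such a table concrete, here is a minimal sketch in Python. The column names, the TableRow class and the render helper are my own assumptions for illustration; the rows only restate candidates discussed in this mail and are not the contents of the attached spreadsheet.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TableRow:
    element: str  # "what is there"
    source: str   # "where it's from"
    access: str   # how it might be read programmatically


# Illustrative rows only, restating candidates named in this mail.
ROWS = [
    TableRow("UNO interfaces and methods", "IDL files / types.rdb",
             "static parse or regview"),
    TableRow("Live UNO object members", "running UNO object",
             "XIntrospectionAccess"),
    TableRow("BASIC built-in functions", "basic/source/runtime/stdobj.cxx",
             "iterate the runtime method table"),
    TableRow("User procedures and variables", "module source",
             "SbiParser symbol pool"),
]


def render(rows):
    """Return the table as aligned plain-text lines."""
    cells = [("Element", "Source", "Access")]
    cells += [(r.element, r.source, r.access) for r in rows]
    widths = [max(len(row[i]) for row in cells) for i in range(3)]
    return [
        " | ".join(cell.ljust(w) for cell, w in zip(row, widths)).rstrip()
        for row in cells
    ]


if __name__ == "__main__":
    print("\n".join(render(ROWS)))
```

Keeping the rows as plain data makes it easy to add a column later without touching the rendering code.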
*Mentor-Suggestions for Exploration:*
*Reflection Mechanism (Dynamic/Runtime):*
This refers to querying live objects at runtime to see what they can do.
For UNO, this means inspecting SbUnoObject. For BASIC, it might mean
querying a loaded StarBASIC or SbModule instance for its members after it's
compiled/loaded.
*Offline Availability (Static Cache):*
This means pre-processing information so it's ready when the IDE starts:
creating a pre-compiled database of information that ships with LibreOffice.
This is particularly relevant for the vast UNO API, which is relatively
stable between releases.
*Source for UNO:*
They suggested looking at SDK documentation, Doxygen, and hinted at the
tools used to build the UNO system (javamaker, cppumaker), which all point
back to the IDL files as the canonical source.
*Source for BASIC:*
This means analyzing the BASIC libraries themselves.
*"How to get the data first?" - Potential Sources Mentioned:*
*a. UNO SDK documentation to generate cache? IDL files? Doxygen output?*
*IDL Files (.idl):*
These are the canonical definitions of UNO interfaces, services, and types.
They are the ultimate source of truth for the UNO API structure. Tools like
idlc (IDL compiler) and cppumaker process these.
https://wiki.documentfoundation.org/Documentation/DevGuide/IDL_Documentation_Guidelines
https://docs.libreoffice.org/offapi.html
*Doxygen:*
LibreOffice generates Doxygen API documentation for its internal C++ APIs
and for the IDL files, so this output could be parsed. However, parsing
human-readable docs is probably less reliable than parsing the IDLs
directly.
https://wiki.documentfoundation.org/Development/Doxygen
https://api.libreoffice.org/docs/idl/ref/index.html
https://api.libreoffice.org/docs/cpp/ref/index.html
https://api.libreoffice.org/
*regview tool output / RDB files (.rdb):*
UNO components register their type information in RDB files (e.g.,
services.rdb, types.rdb). The regview tool can inspect these. This is a
structured, machine-readable format of the registered UNO type system. This
is a very strong candidate for building a static cache of UNO APIs.
https://wiki.documentfoundation.org/Documentation/SDKGuide/Coding_UNO_Components
https://www.openoffice.org/udk/common/man/tutorial/uno_registries.html
*b. "files? like python stub, javamaker, cppumaker?"*
Intermediate files generated by tools like cppumaker (which generates C++
headers from IDLs) or javamaker. Python stubs (.pyi) are a direct analogy
for providing type information for tooling.
https://wiki.documentfoundation.org/Documentation/DevGuide/Writing_UNO_Components
https://www.openoffice.org/framework/index.html
https://wiki.openoffice.org/wiki/VBA_Stub
https://www.openoffice.org/api/docs/common/ref/com/sun/star/ucb/XCachedDynamicResultSetStubFactory.html
https://www.openoffice.org/tools/ext_comp.html
*c. "UNO related stuff in currently opened file from Tools > Development Tools?"*
This refers to the UNO Object Inspector (reachable via Tools -> Development
Tools, then inspect an object). This tool dynamically introspects a live UNO
object using XIntrospectionAccess to show its methods, properties, and
interfaces. This is an example of runtime reflection. We can learn from how
it works. It shows what's possible to get dynamically.
https://www.openoffice.org/udk/cpp/man/tutorial/unointro.html
https://tomazvajngerl.blogspot.com/2021/03/built-in-xray-like-uno-object-inspector_24.html#:~:text=The%20object%20inspector%20shows%20various%20information%20about,is%20always%20shown%20for%20the%20current%20object.
*d. "ScriptForge"*
ScriptForge is a set of BASIC libraries providing easier access to common
UNO tasks. Its API is relatively stable and well-defined. The .xlb files
appear to define the structure of BASIC libraries, including module names.
The source code is in wizards/source/scriptforge/.
https://help.libreoffice.org/latest/en-US/text/sbasic/shared/03/lib_ScriptForge.html
https://forum.openoffice.org/en/forum/viewtopic.php?t=104896
https://help.libreoffice.org/latest/en-US/text/sbasic/shared/03/sf_intro.html
https://wiki.documentfoundation.org/Macros/ScriptForge
*e. "Object Catalog in LO"*
The current Object Catalog in the BASIC IDE (basctl/ObjectCatalog.cxx) is
very limited. It probably doesn't have the rich data we need. However, an
object catalog is conceptually what we're building, only a much more
powerful one.
https://help.libreoffice.org/latest/en-ZA/text/sbasic/shared/uno_objects.html?DbPAR=DRAW&System=UNIX
https://help.libreoffice.org/25.2/ro/search?P=object+catalog
https://wiki.documentfoundation.org/Documentation/DevGuide/LibreOffice_Basic
https://www.openoffice.org/udk/common/man/uno.html
*Mentor-Mentioned Tools/Sources to Investigate:*
*"Tools > Development Tools":*
This is a live example of runtime reflection in action. It shows what's
possible when we have an instance of a UNO object.
*"ScriptForge" libraries:*
These are standard BASIC libraries shipped with LO. They must be treated as
a source of BASIC symbols.
*"Object Catalog":*
The existing, limited sidebar. It confirms the IDE can query BasicManager
for user macros, but it's not a source for comprehensive data.
*Key Questions Raised by Mentors:*
*Cache Strategy:* Online (live reflection/parsing) vs. Offline
(pre-computed)?
*Scalability:* How to handle the vast UNO API and potentially large user
libraries?
*Parser:* New parser vs. adapting the existing SbiParser?
*Our Conclusion from Meeting One Analysis:*
The best path forward is to create the *"Master Information Table"* and use
it to guide a series of Proof-of-Concept (PoC) experiments. This directly
addresses the mentors' primary request.
I have attached the table to this mail. It includes an additional new entry
based on our second meeting.
-------------------------------------------------------------------------------
*Hossein's Email: ScriptForge Python Stubs*
Key File: wizards/source/scriptforge/python/scriptforge.pyi
https://gerrit.libreoffice.org/c/core/+/164867
*Purpose:*
This is a Python "stub file." It provides type hints and function signatures
for the ScriptForge Python library. Python IDEs use these .pyi files to
offer auto-completion and type checking for Python developers using
ScriptForge.
*Relevance to Us:*
This is a concrete example of externalizing API information for
tooling. We need to do something similar for BASIC, but our "Knowledge Base"
will be more dynamic and integrated.
*Process Inspiration:*
How is scriptforge.pyi generated or maintained? Understanding this might
give clues for how to process other static information sources. I am
looking into this in more detail.
https://help.libreoffice.org/latest/en-US/text/sbasic/python/python_programming.html
https://mypy.readthedocs.io/en/stable/stubs.html
https://github.com/Amourspirit/python-types-scriptforge
https://typing.python.org/en/latest/guides/writing_stubs.html
https://help.libreoffice.org/latest/nn/text/sbasic/shared/code-stubs.html
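To illustrate how a stub file externalizes API information for tooling, here is a minimal sketch that reads method signatures out of stub text with Python's standard ast module. The class SF_Example and its methods are hypothetical and are not taken from scriptforge.pyi; the sketch assumes Python 3.9 or later for ast.unparse.

```python
import ast

# Hypothetical stub text, written in the usual .pyi style.
STUB = '''
class SF_Example:
    def Greet(self, name: str, times: int = ...) -> str: ...
    def Count(self) -> int: ...
'''


def stub_signatures(stub_source):
    """Return {"Class.method": "(params) -> return"} for each stub method."""
    result = {}
    for node in ast.parse(stub_source).body:
        if not isinstance(node, ast.ClassDef):
            continue
        for item in node.body:
            if not isinstance(item, ast.FunctionDef):
                continue
            args = item.args
            # Defaults belong to the last len(defaults) positional arguments.
            first_default = len(args.args) - len(args.defaults)
            params = []
            for index, arg in enumerate(args.args):
                if arg.arg == "self":
                    continue
                text = arg.arg
                if arg.annotation is not None:
                    text += ": " + ast.unparse(arg.annotation)
                if index >= first_default:
                    text += " = ..."
                params.append(text)
            returns = ast.unparse(item.returns) if item.returns else "None"
            key = f"{node.name}.{item.name}"
            result[key] = f"({', '.join(params)}) -> {returns}"
    return result


if __name__ == "__main__":
    for name, signature in stub_signatures(STUB).items():
        print(name + signature)
```

An IDE reads the same information to offer completions; a BASIC knowledge base would need an equivalent machine-readable source.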
-------------------------------------------------------------------------------
*Core Mentor Directive from Meeting 2:*
Compare the different ways of getting information, focusing on speed and
what data is actually available. They want to see a dump file as concrete
evidence from these experiments.
*Comparison of Data Acquisition Methods:*
*Method 1: Doxygen/IDL-like approach:*
This confirms their interest in the static/offline method of parsing IDL
files. They want to know what info we can get this way.
*Method 2: Using Functions or Services:*
This refers to the runtime reflection approach. They want to know its speed
and what data it yields.
*Key Question:*
What are the pros/cons (speed, completeness) of each?
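One way to frame that comparison is a small harness that runs each acquisition method, times it, and reports which symbols each one yields. The sketch below uses stand-in collectors with placeholder names (a hard-coded list for the static cache, Python's dir() on a dummy object for runtime reflection); the real collectors would read IDL/RDB data and introspect UNO objects, which is not shown here.

```python
import time


def run_method(collect, repeat=3):
    """Run a collector a few times; return its symbol set and best time."""
    best = float("inf")
    names = set()
    for _ in range(repeat):
        start = time.perf_counter()
        names = set(collect())
        best = min(best, time.perf_counter() - start)
    return names, best


def compare(static_collect, runtime_collect):
    """Compare completeness and speed of two acquisition methods."""
    static_names, static_time = run_method(static_collect)
    runtime_names, runtime_time = run_method(runtime_collect)
    return {
        "both": sorted(static_names & runtime_names),
        "only_static": sorted(static_names - runtime_names),
        "only_runtime": sorted(runtime_names - static_names),
        "static_seconds": static_time,
        "runtime_seconds": runtime_time,
    }


# Stand-in for a pre-computed cache (placeholder names, not real API data).
def from_static_cache():
    return ["getName", "setName", "dispose"]


# Stand-in for a live object that would be introspected at runtime.
class LiveObject:
    def getName(self):
        return "x"

    def setName(self, value):
        pass

    def refresh(self):
        pass


def from_runtime_reflection():
    return [n for n in dir(LiveObject()) if not n.startswith("_")]


if __name__ == "__main__":
    for key, value in compare(from_static_cache, from_runtime_reflection).items():
        print(key, value)
```

The "only_static" and "only_runtime" lists answer the completeness question, and the two timings answer the speed question.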
*Concrete Output: The "Dump File"*
This is the most critical, actionable item. They want us to prove that we
can get the data. This "dump file" is the output of our PoCs. It's the
evidence.
*Specific Example to Investigate: MsgBox*
This is a perfect test case. It's a built-in BASIC function.
*The Task:*
Find where MsgBox is defined in the C++ source
(basic/source/runtime/stdobj.cxx, basic/source/runtime/methods.cxx
void SbRtl_MsgBox(StarBASIC *, SbxArray & rPar, bool)).
Analyze its C++ implementation to see how its parameters are defined. Then,
compare this to how it's documented in the Help files and how *MS VBA*
presents it. The goal is to show we can extract its full signature:
MsgBox(Prompt As String, [Buttons As Integer = 0], [Title As String]) As
Integer.
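Once the parameter data has been extracted, turning it into that user-friendly string is mechanical. Here is a minimal sketch, assuming a hypothetical Param record whose fields (name, type_name, optional, default) are my own choice and not an existing LibreOffice structure; the MsgBox values are the ones from the signature above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Param:
    name: str
    type_name: str
    optional: bool = False
    default: Optional[str] = None


def format_signature(name, params, return_type):
    """Build a BASIC-style signature string for display in the IDE."""
    parts = []
    for p in params:
        text = f"{p.name} As {p.type_name}"
        if p.default is not None:
            text += f" = {p.default}"
        if p.optional:
            text = f"[{text}]"  # square brackets mark optional parameters
        parts.append(text)
    return f"{name}({', '.join(parts)}) As {return_type}"


# Parameter data for MsgBox, as given in the target signature above.
MSGBOX_PARAMS = [
    Param("Prompt", "String"),
    Param("Buttons", "Integer", optional=True, default="0"),
    Param("Title", "String", optional=True),
]

if __name__ == "__main__":
    print(format_signature("MsgBox", MSGBOX_PARAMS, "Integer"))
```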
*Refining the Table:*
They suggested adding a new column. Considering the emphasis on MsgBox and
comparing it with MS VBA Help, a column like Help/VBA Analogy or Doc &
Expected Output might be what they meant. This column would contain the
"user-friendly" signature string, like the MsgBox example.
*How to get the dump files:*
These are the possible (conceptual) ways I have found to generate the dump
files. Before this, though, the main emphasis is on using the debugger to
understand what data is available where, and how it flows.
*1. UNO API Dumps (Offline/Static):*
*Dump 1 (IDL):*
Write a script (probably in Python) to parse a few .idl files and print
their structure (interfaces, methods, params, docs) to uno_idl_dump.txt.
*Dump 2 (regview):*
Run regview (if it works as expected) on types.rdb and save the output to
uno_regview_dump.txt. Compare this to Dump 1.
*Dump 3 (Runtime):*
Use the C++/debugger technique to introspect a live object like StarDesktop
and copy-paste the output of its members to uno_runtime_dump.txt.
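For Dump 1, a first pass could be as simple as the regex-based sketch below. It is only an approximation of IDL parsing: the sample interface XExample is hypothetical (not copied from offapi), and the patterns ignore attributes, exceptions, multi-word types such as "unsigned long", and documentation comments.

```python
import re

# Hypothetical IDL text for illustration; not copied from offapi.
SAMPLE_IDL = """
module example {
/** Hypothetical interface, for illustration only. */
interface XExample : com::sun::star::uno::XInterface
{
    boolean terminate();
    void addListener( [in] XListener Listener );
    string getName( [in] long Index, [out] boolean Found );
};
};
"""

INTERFACE_RE = re.compile(
    r"interface\s+(\w+)\s*(?::\s*([\w:]+))?\s*\{(.*?)\}\s*;", re.S)
METHOD_RE = re.compile(r"([\w:<>]+)\s+(\w+)\s*\(([^)]*)\)")
PARAM_RE = re.compile(r"\[\s*(in|out|inout)\s*\]\s*([\w:<>]+)\s+(\w+)")


def strip_comments(text):
    """Remove /* ... */ and // comments before matching."""
    text = re.sub(r"/\*.*?\*/", "", text, flags=re.S)
    return re.sub(r"//[^\n]*", "", text)


def parse_idl(text):
    """Return a list of interfaces with their methods and parameters."""
    interfaces = []
    for name, base, body in INTERFACE_RE.findall(strip_comments(text)):
        methods = [
            {"name": mname, "returns": ret, "params": PARAM_RE.findall(params)}
            for ret, mname, params in METHOD_RE.findall(body)
        ]
        interfaces.append({"interface": name, "base": base, "methods": methods})
    return interfaces


def dump(interfaces):
    """Format the parsed structure as plain-text dump lines."""
    lines = []
    for iface in interfaces:
        lines.append(f"interface {iface['interface']} : {iface['base']}")
        for m in iface["methods"]:
            params = ", ".join(f"[{d}] {t} {n}" for d, t, n in m["params"])
            lines.append(f"  {m['returns']} {m['name']}({params})")
    return lines


if __name__ == "__main__":
    print("\n".join(dump(parse_idl(SAMPLE_IDL))))
```

Writing the returned lines to uno_idl_dump.txt would give a file that can be compared line by line against the regview output.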
*2. BASIC Built-in Dumps (Offline/Static):*
*Dump 4 (Built-ins):*
Write a script or test to iterate over the SbRtl object's methods and print
their names to basic_builtins_dump.txt. Manually add the extra columns for
MsgBox and a few others by reading their C++ implementation, just like
filling in our Excel sheet.
*3. User Code Dumps (Dynamic/IDE-Time):*
*Dump 5 (Parser Symbols):*
Modify the SbiParser in a test branch to, instead of compiling, just dump
the contents of its symbol pool (SbiSymPool) after it parses a test .bas
file. Save this to user_code_symbols_dump.txt.
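To give an idea of what such a symbol dump might contain, here is a toy line scanner in Python. It is a stand-in written for illustration and is not SbiParser or SbiSymPool; it only recognises Sub/Function headers and the first variable of a Dim-style declaration, and it ignores procedure parameters. The sample BASIC code is made up.

```python
import re

# Made-up BASIC source used as test input.
SAMPLE_BAS = '''\
Option Explicit
Dim gCounter As Integer

Sub Main
    Dim sName As String
    MsgBox Greeting(sName)
End Sub

Function Greeting(who As String) As String
    Greeting = "Hello " & who
End Function
'''

PROC_RE = re.compile(
    r"^\s*(?:(?:Public|Private)\s+)?(Sub|Function)\s+(\w+)", re.I)
DIM_RE = re.compile(r"^\s*(?:Dim|Public|Private|Global)\s+(\w+)", re.I)
END_RE = re.compile(r"^\s*End\s+(?:Sub|Function)\b", re.I)


def scan_symbols(source):
    """Return (kind, name, scope) tuples found by a simple line scan."""
    symbols = []
    scope = "<module>"
    for line in source.splitlines():
        if END_RE.match(line):
            scope = "<module>"  # leaving the current procedure
            continue
        proc = PROC_RE.match(line)
        if proc:
            kind, name = proc.group(1).capitalize(), proc.group(2)
            symbols.append((kind, name, "<module>"))
            scope = name  # declarations that follow belong to this procedure
            continue
        var = DIM_RE.match(line)
        if var:
            symbols.append(("Variable", var.group(1), scope))
    return symbols


if __name__ == "__main__":
    for kind, name, scope in scan_symbols(SAMPLE_BAS):
        print(f"{kind:9} {name:10} scope={scope}")
```

The real dump from the parser's symbol pool should carry more detail (types, parameters), so the interesting part of the PoC is seeing what extra fields it exposes.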
Now regarding the file *BASIC_IDE_MODULE_FLOW.odt*:
While working on the Master Information Table sheet, I felt it would help
to have a module-wise breakdown of the files, to separate the sources and
identify where each kind of information is likely to be available.
-------------------------------------------------------------------------------
So for this week my task is to run some BASIC scripts and investigate what
goes where, and I will also re-validate my BASIC_IDE_MODULE_FLOW.odt in
case something is off with it. I will go into full debugging mode and get
the dump files to prove what we can get, and simultaneously look at what
MSO VBA offers and how it does it.
Maybe later, the file BASIC_IDE_MODULE_FLOW.odt could lay the foundation
for documenting the LibreOffice BASIC source code?
https://ask.libreoffice.org/t/where-can-i-find-documentation-for-libreoffice-basic/95965
https://learn.microsoft.com/en-us/office/vba/library-reference/concepts/getting-started-with-vba-in-office
https://learn.microsoft.com/en-us/office/vba/api/overview/
I would love to have feedback on this, and if there's anything similar or
important I should be looking into, please let me know.
--
*Regards,*
*Devansh*
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Master_Information_Table_Basic.ods
Type: application/vnd.oasis.opendocument.spreadsheet
Size: 21593 bytes
Desc: not available
URL: <https://lists.freedesktop.org/archives/libreoffice/attachments/20250607/58a96f87/attachment.ods>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: BASIC_IDE_MODULE_FLOW.odt
Type: application/vnd.oasis.opendocument.text
Size: 73413 bytes
Desc: not available
URL: <https://lists.freedesktop.org/archives/libreoffice/attachments/20250607/58a96f87/attachment.odt>