Efficient UNO component linkage & GC ...

Michael Meeks michael.meeks at collabora.com
Tue Oct 8 06:23:29 PDT 2013


Hi guys,

	I just had a chat with Roi about the efforts to reduce the
amount of code being compiled / linked into LibreOffice, and I've
jotted a few notes down here on a sketch design with some code pointers:


* The background

	We want to be able to configure which pieces (particularly
components) of LibreOffice are linked into our binaries. By compiling
our binaries with -ffunction-sections - and then defining the few
entry points - we can get the linker to link to all LibreOffice code,
and then to garbage-collect any unused methods - which results in
significant size savings. This is the approach used on Android / iOS.

	Currently it is done thus:

http://cgit.freedesktop.org/libreoffice/core/tree/ios/experimental/LibreOffice/LibreOffice/lo.mm#n82

	By listing every shared library's entry point that we will
require.

	For mobile LibreOffice and various other use cases, it seems
extremely likely that UIs will not require great chunks of the UI that
LibreOffice currently implements. Another interesting factor is that
(typically) the whole app will be linked into a single shared object
(much like the mergedlibs under Linux) - a Merged Library Domain.


* The problem

	To shrink the size of our builds, a current prototype uses a
lot of #ifdefs everywhere; however, this is fragile, opaque and hard
to maintain. Ideally we could select which components we want to link
into our output in a more pleasant way.

	Currently, selecting per shlib is not granular enough: e.g. we
have a single factory for svx that creates many components:

svx/source/unodraw/unoctabl.cxx

	From the missable 'FindTextToolbarController' to the
indispensable 'SvxShapeCollection'. In order to eliminate unused code
in a clean way, we have to split these methods.


* Proposed solution

	Part one - while there are some factories that (simply due to
the sheer volume of contents) presumably need to stay as string
lookups - most factories should be split up, and a new factory naming
system introduced.

	Part two - we should also cleanup the generation of lists of
components / library mappings that are currently duplicated across
iOS, Android etc.

	Part three - we should use that list to avoid using string
lookups for component instantiation inside the Merged Library Domain.


** Part one - splitting factories

	Clearly we need a new naming scheme for these; this should be
based on a mangled version of the implementation name; so:

  <implementation name="com.sun.star.comp.Draw.GraphicExporter">

	->

	com_sun_star_comp_Draw_GraphicExporter_getFactory();

	Despite everyone's loathing of the css. namespace this makes
part three much easier.
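
	As a rough illustration of the mangling rule above, here is a
minimal sketch. The helper name mangleImplementationName is
hypothetical (it is not an existing LibreOffice function); it assumes
the rule is simply "replace every '.' with '_' and append _getFactory",
as in the GraphicExporter example.

```cpp
#include <string>

// Hypothetical helper, not part of LibreOffice: turn a UNO implementation
// name into the proposed factory symbol name by replacing every '.' with
// '_' and appending "_getFactory".
std::string mangleImplementationName(const std::string& implName)
{
    std::string symbol(implName);
    for (char& c : symbol)
    {
        if (c == '.')
            c = '_';
    }
    return symbol + "_getFactory";
}
```

	The same rule could be applied in the component loader to build
the symbol name, and in whatever script generates the native-code
lists, so both sides agree on the name.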

	There are various ways to annotate that a .component file
eg. svx/util/svx.component refers to a shlib with a split factory. I
suggest we reduce the problem space by asserting that each shlib with
this property has its components entirely split up - yielding a
single attribute change.

	For these we alter the prefix="svx" to prefix="<name>" or
some equivalent magic, and tweak:

stoc/source/loader/dllcomponentloader.cxx:

Reference<XInterface> SAL_CALL DllComponentLoader::activate(
    const OUString & rImplName, const OUString &, const OUString & rLibName,
    const Reference< XRegistryKey > & xKey )

	To special case this prefix, and build a symbol name (in
aPrefix) and use a new variant of:

cppuhelper/source/shlib.cxx:Reference< XInterface > SAL_CALL loadSharedLibComponentFactory(

	To load and hook out the correctly named symbol there.

	We also define these factory functions to always be passed a
valid pServiceManager, and only ever to be asked to construct the
exact service name mangled into the symbol, hence:


SAL_DLLPUBLIC_EXPORT void * SAL_CALL com_sun_star_drawing_SvxUnoColorTable_getFactory ( void * pServiceManager )
{
    // Ideally this reference, and the branch below would be subsumed
    // by a helper, such that this method call was a single line, and
    // ideally could be macro-ized elegantly.

    uno::Reference< lang::XSingleServiceFactory > xFactory;

    xFactory = cppu::createSingleFactory(
                 reinterpret_cast< lang::XMultiServiceFactory * >( pServiceManager ),
                 SvxUnoColorTable::getImplementationName_Static(),
                 SvxUnoColorTable_createInstance,
                 SvxUnoColorTable::getSupportedServiceNames_Static() );
    if( xFactory.is())
    {
        xFactory->acquire();
        return xFactory.get();
    }
    return NULL;
}


** Part two - list generation cleanup


     For this, we define during configure time a list of services
which we want to be internal, and unconditionally resolved into our
binaries. This would be in the form of a flat list of implementations:

com.sun.star.comp.Svx.GraphicExportHelper
com.sun.star.comp.Svx.GraphicImportHelper
com.sun.star.comp.graphic.PrimitiveFactory2D
...

     Instead of having great lists of shlibs and their locations (a
thing that would get significantly worse with longer / fragmented
lists) eg.

    static lib_to_component_mapping map[] = {
        { "libsvxlo.a", svx_component_getFactory },
	...

	(where splitting svx would turn one line into 10) - we instead
auto-generate both these lines, and also the 'extern' statements
required to import the symbols required inside
ios/experimental/LibreOffice/LibreOffice/lo.mm,
android/experimental/desktop/native-code.cxx etc.
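
	A minimal sketch of such a generator might look like the
following. The function names (toFactorySymbol, generateNativeCode),
the exact layout of the emitted text and the NULL terminator entry are
all assumptions made for illustration; the real thing would more
likely live in the build system than in C++.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: mangle "a.b.C" into "a_b_C_getFactory".
static std::string toFactorySymbol(const std::string& implName)
{
    std::string symbol(implName);
    for (char& c : symbol)
    {
        if (c == '.')
            c = '_';
    }
    return symbol + "_getFactory";
}

// Hypothetical generator: given the flat list of implementation names
// chosen at configure time, emit the 'extern' declarations followed by
// the mapping table lines as one block of source text.
std::string generateNativeCode(const std::vector<std::string>& implNames)
{
    std::ostringstream out;

    // One declaration per factory symbol we expect the linker to resolve.
    for (const std::string& name : implNames)
        out << "extern \"C\" void * " << toFactorySymbol(name) << "( void * );\n";

    // The table mapping implementation name to factory function.
    out << "\nstatic uno_component_mapping map[] = {\n";
    for (const std::string& name : implNames)
        out << "    { \"" << name << "\", " << toFactorySymbol(name) << " },\n";
    out << "    { NULL, NULL }\n};\n"; // terminator entry is an assumption

    return out.str();
}
```

	Feeding it the three implementation names listed above would
produce three extern lines and a three-entry table, so adding a
component becomes a one-line change to the flat list.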

	It is unclear that we still require the 'lib' piece in this
picture - though that could be inferred from the aggregated .component
files. It would almost certainly be preferable to fix the:

include/osl/detail/component-mapping.h (lib_to_component_mapping)
cppuhelper/source/shlib.cxx

	To instead map implementation directly to symbol, something we
can't do programmatically since we need the internally bound function
pointer listed thus:

    static uno_component_mapping map[] = {
	{ "com.sun.star.comp.Svx.GraphicExportHelper",
	  com_sun_star_comp_Svx_GraphicExportHelper },
	...

	etc.
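
	To show how a lookup over such a table could work, here is a
self-contained sketch. The field names of uno_component_mapping, the
FactoryFunction typedef, the lookupFactory helper and the stand-in
factory bodies are all assumptions; the symbol names use the
_getFactory suffix from part one.

```cpp
#include <cstring>

// Assumed shape of the per-implementation factory functions from part one.
typedef void * (*FactoryFunction)(void * pServiceManager);

// Hypothetical layout: this mail names the struct but not its fields.
struct uno_component_mapping
{
    const char * implName;
    FactoryFunction factory;
};

// Stand-in factories so that the sketch links on its own; the real ones
// would be defined in svx etc. and return an acquired factory object.
static int exportStandIn = 0;
static int importStandIn = 0;
extern "C" void * com_sun_star_comp_Svx_GraphicExportHelper_getFactory(void *)
{
    return &exportStandIn;
}
extern "C" void * com_sun_star_comp_Svx_GraphicImportHelper_getFactory(void *)
{
    return &importStandIn;
}

// The table binds each implementation name to a function pointer at link
// time, which is why it cannot be built from strings at run time.
static const uno_component_mapping component_map[] = {
    { "com.sun.star.comp.Svx.GraphicExportHelper",
      com_sun_star_comp_Svx_GraphicExportHelper_getFactory },
    { "com.sun.star.comp.Svx.GraphicImportHelper",
      com_sun_star_comp_Svx_GraphicImportHelper_getFactory },
};

// Hypothetical lookup: linear scan, returns nullptr for unknown names.
FactoryFunction lookupFactory(const char * implName)
{
    for (const uno_component_mapping& entry : component_map)
    {
        if (std::strcmp(entry.implName, implName) == 0)
            return entry.factory;
    }
    return nullptr;
}
```

	Because every entry references its factory symbol directly, the
linker keeps exactly the listed factories (and what they reach) and is
free to garbage-collect the rest.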


** Part three - faster component instantiation

	The wonderful work Noel has been doing around more beautiful
ways to instantiate components should dovetail with this nicely. When
we have the list from part-two of which components are available
inside our Merged Library Domain, we should be able to directly
instantiate them ourselves, without using the UNO factory / activation
process; that should reduce code-size and improve performance, hence:


codemaker/source/cppumaker/cpputype.cxx

void ServiceType::dumpHxxFile(
    FileStream & o, codemaker::cppumaker::Includes & includes)
...

                  << codemaker::cpp::translateUnoToCppIdentifier(
                      "create", "method", codemaker::cpp::ITM_NONGLOBAL,
                      &cppName)
                  << ("(::com::sun::star::uno::Reference<"
                      " ::com::sun::star::uno::XComponentContext > const &"
                      " the_context) {\n");

	(this method should be split)

	should load and parse our list of internal implementations,
and produce a much simpler method eg. existing:

workdir/unxlngi6.pro/UnoApiHeadersTarget/offapi/comprehensive/com/sun/star/frame/FrameLoaderFactory.hpp:

	would turn into:

        try {
            the_instance = css::uno::Reference< css::frame::XLoaderFactory >(
		com_sun_star_frame_FrameLoaderFactory_create(the_context->getServiceManager()),
		::com::sun::star::uno::UNO_QUERY);
	} ... existing disastrous size-wise exception code ... {
	    // or better we could pass the fn' pointer and all the details
	    // we need into a 
	}

	Which should be much smaller and neater.


	Anyhow - that's my sketch / plan / rationale. Thoughts / tweaks
etc. appreciated.

	ATB,

		Michael.

-- 
 michael.meeks at collabora.com  <><, Pseudo Engineer, itinerant idiot


