[Libreoffice] minor idl fixes

Tomas Hlavaty tom at logand.com
Mon Dec 12 15:52:27 PST 2011


Hi Michael,

>> This would allow us to get rid of the RDB files (although I need to
>> familiarise myself with current use-cases to understand the impact of
>> such change, e.g. merging in custom plugins).
>
> 	So, there are rather a number of hidden criteria for RDB files:

> that they are tiny,

What does "tiny" mean?

Currently, rdb files are giant.

     49 2011-07-12 21:36 /var/lib/openoffice/basis3.2/program/services.rdb
2031616 2011-01-26 23:32 /usr/lib/openoffice/basis3.2/program/oovbaapi.rdb
6520832 2011-01-26 23:31 /usr/lib/openoffice/basis3.2/program/offapi.rdb
 262144 2011-01-27 00:17 /usr/lib/ure/share/misc/services.rdb
 851968 2011-01-26 23:29 /usr/lib/ure/share/misc/types.rdb

I'm not sure why.  If I simply concatenate all idl definitions for
udkapi and offapi into one preprocessed file, I get a smaller file that
is still a valid idl file containing all the information:

1695058 2011-12-12 23:55 allpp.idl

The makefile rules for the allpp.idl file go along these lines:

offapi.list:
	find $(LIBO)/offapi/ -name '*.idl' >$@
udkapi.list:
	find $(LIBO)/udkapi/ -name '*.idl' >$@
all.list: offapi.list udkapi.list
	cat udkapi.list offapi.list >$@
all.idl: all.list
	sed -e "s@/opt/libo/udkapi/@@g" -e "s@/opt/libo/offapi/@@g" -e "s@.*@#include <&>@g" $< >$@
allpp.idl: all.idl
	cpp -P -I$(LIBO)/offapi -I$(LIBO)/udkapi $< >$@
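
The path-rewriting step in the all.idl rule can be tried on its own.  The
following is only a minimal sketch: make_includes is a hypothetical
helper name, /opt/libo is assumed to be the checkout root, and the two
paths are merely sample input.

```shell
# Hypothetical helper (not part of any LibreOffice makefile): read absolute
# .idl paths on stdin, strip the udkapi/offapi prefix under the given root,
# and wrap each remaining relative path in an #include line.
make_includes() {
    root=$1   # assumed checkout root, e.g. /opt/libo
    sed -e "s@^$root/udkapi/@@" \
        -e "s@^$root/offapi/@@" \
        -e 's@.*@#include <&>@'
}

# Sample input: two paths as find would print them.
printf '%s\n' \
    /opt/libo/udkapi/com/sun/star/uno/XInterface.idl \
    /opt/libo/offapi/com/sun/star/frame/XModel.idl |
    make_includes /opt/libo
```

The result is one #include line per idl file, which cpp -P then expands
using the two -I directories.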

If I compress allpp.idl I get a rather small file containing complete
type information for udkapi and offapi:

 212524 2011-12-12 23:55 allpp.idl.gz

Is 200kB considered tiny?

And this is just the original idl files, concatenated.
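
To repeat the size comparison on another file, something like the
following should do.  It is a sketch only: sizes is a hypothetical
helper name, and gzip is run with its default compression level, which
is an assumption about how the .gz above was produced.

```shell
# Hypothetical helper: print "<raw bytes> <gzip bytes>" for one file,
# without leaving a .gz file behind.
sizes() {
    raw=$(wc -c <"$1")
    gz=$(gzip -c "$1" | wc -c)
    # Arithmetic expansion strips the padding some wc versions add.
    echo "$((raw)) $((gz))"
}

# Usage (the file name is the one used in this mail):
#   sizes allpp.idl
```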

> instant to parse (and/or don't require parsing) - since we get to do
> this quite a lot at startup (which is already not as performant as it
> could be ;-).

How long does reading the type information take at the moment?

What do we get to do a lot at startup?  I thought we simply load it and
that's it.

If the new format is a text format (I would prefer text format over
another binary one), there needs to be some parsing.  unoidl2 can parse
the allpp.idl file (containing all type information) and print the
syntax tree in about 200ms:

   $ rm allpp.ast 
   $ time make allpp.ast
   cat allpp.idl | ./unoidl2ast >allpp.ast

   real  0m0.247s
   user  0m0.170s
   sys   0m0.100s

I think that should be about the worst case achievable, since any
cleverer format than the original idl syntax should be faster to parse.
But maybe the difference won't be significant.

If 200ms is slow, we could split the allpp.idl file into something
smaller required at startup and the rest loaded lazily.

We could have a binary format, something like a mmap dump.  That would
be instant but rather ugly.

> The data needs to be in a small (read three or less) number of files -
> to avoid I/O seek latency on rotating media.

OK.

Are there any other requirements?  Like functionality related to
rdbmerge and how extensibility works?  Or is that not relevant anymore?

>> The other affected LO projects would likely be:
>
> 	Well  all of these other guys -should- work on top of the
> typedescription API (I would hope), so as long as that is in-place, life
> will be good I think.

I was under the impression that these projects somehow depend on the rdb
code, but if they depend on the typedescription api, then it is better
than I hoped (if that typedescription api is somehow separate from the
rdb file code).

> 	Sadly, the plain-C UNO bridge died a death some years back I think;
> though this was originally intended to be possible [ the base sal/
> library still has a C ABI/API ].

OK, I'll have a look at sal.

Thank you,

Tomas

