extraordinary stoc/ thrash ...

Stephan Bergmann sbergman at redhat.com
Tue Apr 3 23:57:16 PDT 2012


On 04/03/2012 11:13 PM, Tomas Hlavaty wrote:
> Some time ago we discussed something along the lines of modernizing rdb
> related code.
>
> I wrote an IDL parser and the java generator unoidl2java is getting
> almost complete.  I have a few small patches lined up but I'd like to
> get the java generator as far as I can so that I get a good picture of
> what it is all about.  After that I'd like to explore what would be the
> best format for a new rdb registry; maybe binary, maybe text, maybe
> preprocessed idl or preparsed somehow etc.

Great to hear.

>>> but writing a duplicate, much simpler xml parser at a higher level to
>>> read only the new XML .rdb files and whacking them straight into a
>>> hash or two might be a nice easy-hack to have around :-)
>
> I might be missing context here but haven't we discussed some time ago
> (probably with Michael) that speed and size are crucial for the
> registry?  XML might not look like a good idea then.

First of all, remember that we have two different scenarios in which we
want to replace the old binary rdb format with something else: one for
type information and one for service information.

Second, in this context (creating a service manager on top of the
available service information at LO startup), it looks like the most
expensive part is the nested XRegistry instances.  So that becomes the
natural place to tackle first.  (The decision to replace the old binary 
rdb format with an XML format for service information was mostly born 
out of convenience.  In itself, it does not look especially problematic, 
performance-wise.  While this is certainly open for debate and 
inspection, I just don't think it is relevant for the problem at hand.)
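
To make the quoted "whack them straight into a hash" idea concrete, here is a minimal sketch.  The XML layout, the element and attribute names, and the load_services helper are simplified assumptions for illustration, not the exact schema of the real XML .rdb files:

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical stand-in for an XML services file.
SAMPLE = """\
<components>
  <component loader="example.Loader" uri="libexample.so">
    <implementation name="example.FooImpl">
      <service name="example.Foo"/>
      <service name="example.Bar"/>
    </implementation>
  </component>
</components>
"""

def load_services(xml_text):
    """Parse the XML once and return {service name: (implementation, uri)}."""
    table = {}
    root = ET.fromstring(xml_text)
    for component in root.iter("component"):
        for impl in component.iter("implementation"):
            for service in impl.iter("service"):
                table[service.get("name")] = (impl.get("name"),
                                              component.get("uri"))
    return table

services = load_services(SAMPLE)
```

After the single parse, every lookup is a plain dictionary access, which fits the point that the parsing itself is unlikely to be the bottleneck.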

>> I think the problem is not the XML parsing, but the nested XRegistry
>> list.
>
> Does "nested XRegistry list" mean the registry structure mirroring the
> symbol hierarchy of uno packages/classes?  Why is that a problem?

No.  The nesting (or linear list, rather) represents the various files
from which service information is obtained.  (And one of the most
important ones, LO's program/services/services.rdb, tends to come only
near the very end of that list.)
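
A rough sketch of why the position in that list can matter.  It models each file as a plain dictionary, which is an assumption for illustration and not how XRegistry is implemented; the service names and the chained_lookup and merge helpers are hypothetical:

```python
def chained_lookup(registries, service):
    """Probe each registry in order; return (implementation, probes made)."""
    probes = 0
    for registry in registries:
        probes += 1
        if service in registry:
            return registry[service], probes
    return None, probes

def merge(registries):
    """Flatten the list once; earlier registries win on conflicts."""
    merged = {}
    for registry in registries:
        for name, impl in registry.items():
            merged.setdefault(name, impl)
    return merged

# Hypothetical data: the last entry stands in for services.rdb.
registries = [
    {"example.UserThing": "example.UserThingImpl"},
    {"example.Extension": "example.ExtensionImpl"},
    {"example.Desktop": "example.DesktopImpl"},
]

impl, probes = chained_lookup(registries, "example.Desktop")
merged = merge(registries)
```

In this model, a service that is registered only in the last file costs one probe per list entry on every chained lookup, while the merged table answers with a single dictionary access.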

>> I have a vague idea of placing yet another cppuhelper bootstrap
>> mechanism next to the existing ones, which will internally use
>> completely different (read: cheap) mechanisms to set up a component
>> context and associated service manager.  That way, I would avoid most
>> of the hassle of having to whack improvements into a rotten framework.
>>
>> I'll toy around with that in the next days/weeks.  Hang on...
>
> Does it mean this would allow for an easier switch to a different registry
> format?  Or have you already settled on some specific format which would
> be part of this other bootstrap mechanism?

The question of on-disk data format (for both type and service 
information) would be rather orthogonal to this work.  There's always a 
trade-off between simplicity (reading in all data from disk upfront, 
creating a complete in-memory model from it) and a potential performance 
gain (reading in data only when it becomes necessary; but that may be a 
fallacy: often enough, a large part of the data is necessary during 
bootstrap, anyway, and structuring the on-disk data for delayed access 
can have negative performance implications).  And that trade-off tends 
to influence the overall design somewhat.  But apart from that, these 
things should be pretty separate.
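
The trade-off can be sketched as follows.  This is a toy model under stated assumptions: JSON files stand in for the on-disk format, and EagerStore, LazyStore, and demo are hypothetical names, unrelated to any actual registry code:

```python
import json
import tempfile
from pathlib import Path

class EagerStore:
    """Reads one combined file upfront and builds the full in-memory model."""
    def __init__(self, combined_file):
        self.reads = 1
        self._data = json.loads(Path(combined_file).read_text(encoding="utf-8"))

    def get(self, key):
        return self._data[key]

class LazyStore:
    """Reads one small file per key, only when that key is first requested."""
    def __init__(self, directory):
        self.reads = 0
        self._dir = Path(directory)
        self._cache = {}

    def get(self, key):
        if key not in self._cache:
            self.reads += 1
            text = (self._dir / (key + ".json")).read_text(encoding="utf-8")
            self._cache[key] = json.loads(text)
        return self._cache[key]

def demo():
    """Return (eager reads, lazy reads) when bootstrap needs 8 of 10 entries."""
    entries = {"svc%d" % i: "impl%d" % i for i in range(10)}
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "all.json").write_text(json.dumps(entries), encoding="utf-8")
        for key, value in entries.items():
            (root / (key + ".json")).write_text(json.dumps(value),
                                                encoding="utf-8")
        eager = EagerStore(root / "all.json")
        lazy = LazyStore(root)
        for key in list(entries)[:8]:
            assert eager.get(key) == lazy.get(key)
        return eager.reads, lazy.reads
```

When most entries are needed during bootstrap anyway, the lazy variant ends up doing many small reads where the eager one does a single large read, which is the sense in which delayed access may turn out to be a fallacy.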

Stephan

