extraordinary stoc/ thrash ...
Stephan Bergmann
sbergman at redhat.com
Tue Apr 3 23:57:16 PDT 2012
On 04/03/2012 11:13 PM, Tomas Hlavaty wrote:
> Some time ago we discussed something along the lines of modernizing rdb
> related code.
>
> I wrote an IDL parser and the java generator unoidl2java is getting
> almost complete. I have a few small patches lined up but I'd like to
> get the java generator as far as I can so that I get a good picture of
> what it is all about. After that I'd like to explore what would be the
> best format for a new rdb registry; maybe binary, maybe text, maybe
> preprocessed idl or preparsed somehow etc.
Great to hear.
>>> but writing a duplicate, much simpler xml parser at a higher level to
>>> read only the new XML .rdb files and whacking them straight into a
>>> hash or two might be a nice easy-hack to have around :-)
>
> I might be missing context here but haven't we discussed some time ago
> (probably with Michael) that speed and size are crucial for the
> registry? XML might not look like a good idea then.
First of all, remember that we have two different scenarios where we
want to replace the old binary rdb format with something else, once for
type information and once for service information.
Second, in this context here (creating a service manager on top of the
available service information at LO startup), it looks like the most
expensive part is the nested XRegistry instances. So that becomes the
natural place to tackle first. (The decision to replace the old binary
rdb format with an XML format for service information was mostly born
out of convenience. In itself, it does not look especially problematic,
performance-wise. While this is certainly open for debate and
inspection, I just don't think it is relevant for the problem at hand.)
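
To make the quoted "whack them straight into a hash or two" idea concrete, here is a minimal sketch in Python. It is not the actual LO code: the namespace, the element and attribute names, and the sample component are assumptions about what an XML services .rdb file looks like, and load_services is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Assumed namespace and element layout of an XML services .rdb file;
# the sample content below is made up for illustration.
NS = "{http://openoffice.org/2010/uno-components}"

SAMPLE = """\
<components xmlns="http://openoffice.org/2010/uno-components">
  <component loader="com.sun.star.loader.SharedLibrary" uri="libexample.so">
    <implementation name="org.example.FooImpl">
      <service name="org.example.Foo"/>
      <service name="org.example.Bar"/>
    </implementation>
  </component>
</components>
"""


def load_services(xml_text):
    """Parse one XML document and fill two plain dicts in a single pass."""
    implementations = {}  # implementation name -> (loader, uri)
    services = {}         # service name -> list of implementation names
    root = ET.fromstring(xml_text)
    for component in root.iter(NS + "component"):
        loader = component.get("loader")
        uri = component.get("uri")
        for impl in component.iter(NS + "implementation"):
            impl_name = impl.get("name")
            implementations[impl_name] = (loader, uri)
            for service in impl.iter(NS + "service"):
                services.setdefault(service.get("name"), []).append(impl_name)
    return implementations, services


implementations, services = load_services(SAMPLE)
```

After the single parse, every later lookup is a dict access, which is why the XML parsing itself is unlikely to dominate.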
>> I think the problem is not the XML parsing, but the nested XRegistry
>> list.
>
> Does "nested XRegistry list" mean the registry structure mirroring the
> symbol hierarchy of uno packages/classes? Why is that a problem?
No. The nesting (or linear list, rather) represents the various files
from which service information is obtained. (And one of the most
important ones, LO's program/services/services.rdb, tends to come only
near the very end of that list.)
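
A rough sketch of why the position in that list matters, in Python. The dicts stand in for the per-file service data, the service names are hypothetical, and "earlier entries win" is an assumed precedence rule, not something taken from the actual XRegistry code.

```python
# Each dict stands in for the service data read from one file.
# Service names are hypothetical; only the last path is from the thread.
registries = [
    {"org.example.ExtensionService": "extension-1"},
    {"org.example.UserService": "user-layer"},
    {"org.example.Desktop": "program/services/services.rdb"},
]


def lookup_linear(registries, name):
    """Probe each registry in order; return (value, number of probes)."""
    for probes, registry in enumerate(registries, start=1):
        if name in registry:
            return registry[name], probes
    return None, len(registries)


def merge(registries):
    """Flatten the list into one dict (assumption: earlier entries win)."""
    merged = {}
    for registry in registries:
        for name, value in registry.items():
            merged.setdefault(name, value)
    return merged


value, probes = lookup_linear(registries, "org.example.Desktop")
merged = merge(registries)
```

With the linear list, a name that lives in the last file costs one probe per file on every lookup; after a one-time merge it costs a single dict access.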
>> I have a vague idea of placing yet another cppuhelper bootstrap
>> mechanism next to the existing ones, which will internally use
>> completely different (read: cheap) mechanisms to set up a component
>> context and associated service manager. That way, I would avoid most
>> of the hassle of having to whack improvements into a rotten framework.
>>
>> I'll toy around with that in the next days/weeks. Hang on...
>
> Does it mean this would allow for easier switch to a different registry
> format? Or have you already settled on some specific format which would
> be part of this other bootstrap mechanism?
The question of on-disk data format (for both type and service
information) would be rather orthogonal to this work. There's always a
trade-off between simplicity (reading in all data from disk upfront,
creating a complete in-memory model from it) and a potential performance
gain (reading in data only when it becomes necessary; but that may be a
fallacy: often enough, a large part of the data is necessary during
bootstrap, anyway, and structuring the on-disk data for delayed access
can have negative performance implications). And that trade-off tends
to influence the overall design somewhat. But apart from that, these
things should be pretty separate.
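
The trade-off can be sketched in a few lines of Python. Everything here is hypothetical (FakeDisk, EagerStore, LazyStore and the record names); the read counter is only a stand-in for real I/O cost.

```python
class FakeDisk:
    """Stand-in for the on-disk data; counts how often it is accessed."""

    def __init__(self, records):
        self._records = records
        self.reads = 0

    def read_all(self):
        self.reads += 1
        return dict(self._records)

    def read_one(self, key):
        self.reads += 1
        return self._records[key]


class EagerStore:
    """Read everything upfront and keep a complete in-memory model."""

    def __init__(self, disk):
        self._data = disk.read_all()

    def get(self, key):
        return self._data[key]


class LazyStore:
    """Read a record only when it is first requested, then cache it."""

    def __init__(self, disk):
        self._disk = disk
        self._cache = {}

    def get(self, key):
        if key not in self._cache:
            self._cache[key] = self._disk.read_one(key)
        return self._cache[key]


records = {"a": 1, "b": 2, "c": 3, "d": 4}
needed_at_bootstrap = ["a", "b", "c"]  # assumption: most data is needed early

eager_disk, lazy_disk = FakeDisk(records), FakeDisk(records)
eager, lazy = EagerStore(eager_disk), LazyStore(lazy_disk)
eager_values = [eager.get(k) for k in needed_at_bootstrap]
lazy_values = [lazy.get(k) for k in needed_at_bootstrap]
```

In this toy setup the eager store touches the disk once, while the lazy store touches it once per needed record, which illustrates how delayed access can lose when bootstrap needs most of the data anyway.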
Stephan