optimising OUString for space
Noel Grandin
noel at peralex.com
Mon Oct 1 04:02:11 PDT 2012
On 2012-10-01 12:38, Michael Meeks wrote:
> We could do some magic there; of course - space is a bit of an issue -
> we already pointlessly bloat bazillions of ascii strings into UCS-2
> (nominally UTF-16) representations and nail a ref-count and length on
> the beginning. If you turn on the lifecycle diagnostics in
> sal/rtl/source/strimp.hxx with the #ifdef and re-build sal, you can
> start to see the scale of the problem when you launch libreoffice ;-)
Changing subject because I'm changing the topic.
That was something I was thinking about the other day - given that the
bulk of our strings are pure 7-bit ASCII, it might be a worthwhile
optimisation to store a bit that says "this string is 7-bit ASCII", and
then store the string as a sequence of bytes.
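
To make the idea concrete, here is a minimal sketch. It is not the real
rtl_uString/OUString layout; the class name CompactString and all of its
members are hypothetical, invented for illustration only. It assumes the
flag is kept as a separate bool, whereas a real implementation would
probably steal a spare bit from the length or ref-count field.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical illustration only: NOT the actual OUString implementation.
// Stores one byte per character when every code unit is 7-bit ASCII,
// and falls back to two bytes per UTF-16 code unit otherwise.
class CompactString {
public:
    explicit CompactString(const std::u16string& s)
        : ascii_(allAscii(s)), length_(s.size())
    {
        if (ascii_) {
            // Narrow each code unit to a single byte.
            bytes_.reserve(s.size());
            for (char16_t c : s)
                bytes_.push_back(static_cast<std::uint8_t>(c));
        } else {
            // Keep the full 16-bit code units, low byte first.
            bytes_.resize(s.size() * 2);
            for (std::size_t i = 0; i < s.size(); ++i) {
                bytes_[2 * i]     = static_cast<std::uint8_t>(s[i] & 0xFF);
                bytes_[2 * i + 1] = static_cast<std::uint8_t>(s[i] >> 8);
            }
        }
    }

    bool isAscii() const { return ascii_; }
    std::size_t length() const { return length_; }          // in code units
    std::size_t payloadBytes() const { return bytes_.size(); }

    // Callers always see 16-bit code units, whatever the storage is.
    char16_t at(std::size_t i) const {
        assert(i < length_);
        if (ascii_)
            return bytes_[i];
        return static_cast<char16_t>(bytes_[2 * i] | (bytes_[2 * i + 1] << 8));
    }

private:
    static bool allAscii(const std::u16string& s) {
        for (char16_t c : s)
            if (c >= 0x80)
                return false;
        return true;
    }

    bool ascii_;                      // the "this string is 7-bit ASCII" bit
    std::size_t length_;              // number of UTF-16 code units
    std::vector<std::uint8_t> bytes_; // payload in either representation
};
```

With this sketch, CompactString(u"hello") holds a 5-byte payload instead
of 10, while a string containing any non-ASCII character keeps the full
2 bytes per code unit. The cost is a branch on every character access.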
The latest Java VM does this trick internally - it pretends that String
is stored with an array of 16-bit values, but actually it stores them as
UTF-8.
Even in an app running in a language other than US-English, strings are
used for so many internal things that >90% of the strings are 7-bit ASCII.