Efficient string concatenation

Lubos Lunak l.lunak at suse.cz
Sun Dec 2 14:56:35 PST 2012


 Hello,

 the feature freeze is just about to arrive, so it's high time to do 
something that breaks everything and makes things interesting, huh :) ? I've 
written code to make some O(U)String operators more efficient, and unless 
somebody sees a serious problem with it, I'll commit it.

 The work is based on threads [1] and [2], and on occasionally seeing in the 
commits that people doing string cleanups sometimes change ugly code to only 
slightly less ugly code. With the new feature enabled, any string 
concatenation/creation is simply written like this (well, ok, the number() part 
is not done yet, but shouldn't be difficult to add):

OUString s = foo + bar + "baz" + OUString::number( many ) + "whatever";

All the other alternatives, like an explicit OUStringBuffer and repeated 
append(), should now be worse in all possible aspects. In fact, this should 
result in just one OUString allocation, one copy of each piece of data and at 
most one length computation, so it could possibly beat even strcpy+strcat, 
while at the same time looking good.
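
To illustrate the idea (this is not the code from the attached patches), here 
is a minimal expression-template sketch built on std::string. All names in it 
(Str, Concat, length, addData, demo) are hypothetical stand-ins, and the real 
O(U)String code may well differ. operator+ only records references to its 
operands; the result is sized and filled once, when the expression is 
converted to Str.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-ins for illustration; not the rtl/sal classes.
template <typename L, typename R> struct Concat;

struct Str {
    std::string data;
    explicit Str(const char* s) : data(s) {}
    // Implicit conversion: materializes a whole concat expression at once.
    template <typename L, typename R>
    Str(const Concat<L, R>& c);
};

// operator+ returns this lightweight node instead of a new string.
template <typename L, typename R>
struct Concat {
    const L& left;
    const R& right;
};

// Leaf helpers: how long is an operand, and how to copy it into a buffer.
inline std::size_t length(const Str& s) { return s.data.size(); }
inline char* addData(char* out, const Str& s) {
    std::memcpy(out, s.data.data(), s.data.size());
    return out + s.data.size();
}
// String literals: the length comes from the array type, no strlen() needed.
template <std::size_t N>
std::size_t length(const char (&)[N]) { return N - 1; }
template <std::size_t N>
char* addData(char* out, const char (&lit)[N]) {
    std::memcpy(out, lit, N - 1);
    return out + (N - 1);
}
// Inner nodes: recurse into both sides.
template <typename L, typename R>
std::size_t length(const Concat<L, R>& c) {
    return length(c.left) + length(c.right);
}
template <typename L, typename R>
char* addData(char* out, const Concat<L, R>& c) {
    return addData(addData(out, c.left), c.right);
}

template <typename L, typename R>
Str::Str(const Concat<L, R>& c) {
    // One allocation for the final size. Note that resize() also zero-fills,
    // which a real implementation would avoid by allocating raw storage.
    data.resize(length(c));
    addData(&data[0], c);  // each operand is copied exactly once
}

inline Concat<Str, Str> operator+(const Str& l, const Str& r) { return {l, r}; }
template <std::size_t N>
Concat<Str, const char[N]> operator+(const Str& l, const char (&r)[N]) {
    return {l, r};
}
template <typename L, typename R>
Concat<Concat<L, R>, Str> operator+(const Concat<L, R>& l, const Str& r) {
    return {l, r};
}
template <typename L, typename R, std::size_t N>
Concat<Concat<L, R>, const char[N]> operator+(const Concat<L, R>& l,
                                              const char (&r)[N]) {
    return {l, r};
}

// Usage: the whole chain is evaluated when it is converted to Str.
inline Str demo() {
    Str foo("foo"), bar("bar");
    Str s = foo + bar + "baz" + "whatever";
    assert(s.data.size() == 3 + 3 + 3 + 8);
    return s;
}
```

The Concat nodes hold references to temporaries, so in a design like this 
the expression has to be converted within the same statement; storing it in 
an 'auto' variable would leave dangling references.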

 I successfully built with gcc (not the ancient Apple one though, I intend to 
pass there again), clang and msvc2010 and passed 'make check'. The resulting 
binary size is about the same (funnily enough it seems that gcc's -Os stops 
it from fully inlining, preventing it from optimizing out more stuff and 
making the code smaller).

 Even though this is in sal/, the intention is to keep this code LO-internal, 
so there won't be any BIC problems; 3rd party apps will keep getting the 
original code. All O(U)String code is inline functions, so there shouldn't be 
any trouble there.

 So as you can see, this would be perfect, if it weren't for some small 
gotchas:
- since operator+ now returns a different object, this is not entirely source 
compatible, and explicit conversions to O(U)String may need to be added 
(e.g. '( a + "b" ).getStr()' -> 'OUString( a + "b" ).getStr()' ). If some of 
those cases would be too annoying, I can try harder to avoid them, but some 
are unavoidable (the ?: operator being one of them, and somewhat vexing). 
However, patch 0005, which fixes all such issues in LO, is pretty small, so 
this does not currently seem to be an issue (although that may be because the 
idea of writing simple string-handling code may be catching on slowly).
- as it is template-based, error messages can get somewhat longer, but IMO 
it's nothing horrific. Compilers with decent error reporting are 
recommended :). Alternatively, temporarily adding 
"#define RTL_DISABLE_FAST_STRING" at the top of the source file should help 
too.
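
The source-compatibility gotcha can be reproduced with a small stand-alone 
sketch. Again, Str, Concat, text and demo are hypothetical names, not the 
real rtl classes, and the string is built naively here because the sketch is 
only about the types involved. The two commented-out lines are the kind of 
code that stops compiling once operator+ returns an expression object.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-ins for illustration; not the rtl/sal classes.
template <typename L, typename R> struct Concat;

struct Str {
    std::string data;
    explicit Str(const char* s) : data(s) {}
    // Implicit conversion from any concat expression.
    template <typename L, typename R>
    Str(const Concat<L, R>& c);
    const char* getStr() const { return data.c_str(); }
};

// What operator+ returns; note that it has no getStr().
template <typename L, typename R>
struct Concat {
    const L& left;
    const R& right;
};

// Naive materialization, good enough for a sketch about types.
inline std::string text(const Str& s) { return s.data; }
inline std::string text(const char* s) { return s; }
template <typename L, typename R>
std::string text(const Concat<L, R>& c) {
    return text(c.left) + text(c.right);
}

template <typename L, typename R>
Str::Str(const Concat<L, R>& c) : data(text(c)) {}

inline Concat<Str, Str> operator+(const Str& l, const Str& r) { return {l, r}; }
template <std::size_t N>
Concat<Str, const char[N]> operator+(const Str& l, const char (&r)[N]) {
    return {l, r};
}

inline std::string demo(bool flag) {
    Str a("a"), b("b");

    // const char* p = (a + "b").getStr();  // error: Concat has no getStr()
    Str joined(a + "b");                    // convert explicitly first

    // Str s = flag ? a + "x" : a + b;      // error: the branches have two
    //                                      // unrelated Concat types
    Str s = flag ? Str(a + "x") : Str(a + b);

    return joined.data + "|" + s.data;
}
```

With the ?: operator the compiler needs a common type for both branches 
before any conversion to Str is considered, which is why wrapping each 
branch explicitly is needed in this sketch.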

 Still, I think it works pretty well.

[1] 
http://lists.freedesktop.org/archives/libreoffice/2011-November/021156.html
[2] 
http://lists.freedesktop.org/archives/libreoffice/2011-December/022323.html

-- 
 Lubos Lunak
 l.lunak at suse.cz
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-support-for-fast-O-U-String-concatenation-using-oper.patch
Type: text/x-diff
Size: 31519 bytes
Desc: not available
URL: <http://lists.freedesktop.org/archives/libreoffice/attachments/20121202/f3efadcd/attachment-0005.patch>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0002-comphelper-string-ConstAsciiString-support-for-fast-.patch
Type: text/x-diff
Size: 1514 bytes
Desc: not available
URL: <http://lists.freedesktop.org/archives/libreoffice/attachments/20121202/f3efadcd/attachment-0006.patch>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0003-tools-String-support-for-fast-operator.patch
Type: text/x-diff
Size: 2714 bytes
Desc: not available
URL: <http://lists.freedesktop.org/archives/libreoffice/attachments/20121202/f3efadcd/attachment-0007.patch>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0004-make-sure-uno-Any-works-with-fast-operator.patch
Type: text/x-diff
Size: 3113 bytes
Desc: not available
URL: <http://lists.freedesktop.org/archives/libreoffice/attachments/20121202/f3efadcd/attachment-0008.patch>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0005-fixes-for-where-fast-string-operator-is-not-perfectl.patch
Type: text/x-diff
Size: 44359 bytes
Desc: not available
URL: <http://lists.freedesktop.org/archives/libreoffice/attachments/20121202/f3efadcd/attachment-0009.patch>

