Efficient string concatenation
Lubos Lunak
l.lunak at suse.cz
Sun Dec 2 14:56:35 PST 2012
Hello,
the feature freeze is just about to arrive, so it's high time to do
something that breaks everything and makes things interesting, huh :) ? I've
written code to make some O(U)String operators more efficient, and unless
somebody sees a serious problem with it, I'll commit it.
The work is based on threads [1] and [2], and on occasionally seeing in the
commits that people doing string cleanups sometimes change ugly code to only
slightly less ugly code. With the new feature enabled, any string
concatenation/creation is simply done as (well, ok, the number() part is not
done yet, but shouldn't be difficult to add):
OUString s = foo + bar + "baz" + OUString::number( many ) + "whatever";
All the other alternatives, like an explicit OUStringBuffer and repeated
append(), should now be worse in all possible aspects. In fact, this should
result in just one OUString allocation, one copy of each piece of data and at
most one length computation, so it could possibly beat even strcpy+strcat,
while at the same time looking good.
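
A minimal sketch of the general technique (expression templates) that can give this single-allocation behaviour. This is not the code from the attached patches: Text, Concat, Piece, IsOurs and demo() are hypothetical names, and std::string stands in for the real buffer. Note that std::string zero-fills the buffer before the copy, which a real implementation would presumably avoid, and that built() counts constructed Text objects rather than actual heap allocations.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <type_traits>

// All names here are hypothetical stand-ins, not the sal/ classes.
template <typename T> struct Piece;  // how to measure and copy one operand

// Unevaluated "left + right". It only holds references, so it must be
// converted within the same full expression that created it.
template <typename L, typename R>
struct Concat {
    const L& left;
    const R& right;
};

class Text {
public:
    Text(const char* s) : data_(s) { ++built(); }

    // Evaluates a whole concatenation: one length pass, one buffer, and
    // each operand copied once, straight into its final position.
    template <typename L, typename R>
    Text(const Concat<L, R>& c) : data_(Piece<Concat<L, R>>::length(c), '\0') {
        ++built();
        if (!data_.empty()) Piece<Concat<L, R>>::copy(&data_[0], c);
    }

    const std::string& str() const { return data_; }
    static int& built() { static int n = 0; return n; }  // Text objects built

private:
    std::string data_;
};

template <> struct Piece<Text> {
    static std::size_t length(const Text& t) { return t.str().size(); }
    static char* copy(char* out, const Text& t) {
        std::memcpy(out, t.str().data(), t.str().size());
        return out + t.str().size();
    }
};

// String literal: the length comes from the array type, no strlen needed.
template <std::size_t N> struct Piece<char[N]> {
    static std::size_t length(const char (&)[N]) { return N - 1; }
    static char* copy(char* out, const char (&s)[N]) {
        std::memcpy(out, s, N - 1);
        return out + (N - 1);
    }
};

// A nested concatenation measures and copies its two halves in order.
template <typename L, typename R> struct Piece<Concat<L, R>> {
    static std::size_t length(const Concat<L, R>& c) {
        return Piece<L>::length(c.left) + Piece<R>::length(c.right);
    }
    static char* copy(char* out, const Concat<L, R>& c) {
        return Piece<R>::copy(Piece<L>::copy(out, c.left), c.right);
    }
};

template <typename T> struct IsOurs : std::false_type {};
template <> struct IsOurs<Text> : std::true_type {};
template <typename L, typename R>
struct IsOurs<Concat<L, R>> : std::true_type {};

// operator+ copies nothing; it just records its two operands.
template <typename L, typename R>
typename std::enable_if<IsOurs<L>::value || IsOurs<R>::value,
                        Concat<L, R>>::type
operator+(const L& l, const R& r) {
    return Concat<L, R>{l, r};
}

// Usage: four operands are joined into a single new Text.
inline Text demo() {
    Text foo("foo"), bar("bar");
    return foo + bar + "baz" + "whatever";
}
```

In this sketch the chain of operator+ calls only builds nested Concat objects; the data is touched when the result is finally converted to Text, which is why no intermediate strings are created.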
I successfully built with gcc (not the ancient Apple one though, I intend to
have another go at that), clang and msvc2010 and passed 'make check'. The resulting
binary size is about the same (funnily enough it seems that gcc's -Os stops
it from fully inlining, preventing it from optimizing out more stuff and
making the code smaller).
Even though this is in sal/, the intention is to keep this code LO-internal,
so there won't be any BIC problems; 3rd party apps will keep getting the
original code. All O(U)String code is inline functions, so there shouldn't be
any trouble there.
So as you can see, this would be perfect, if it weren't for some small
gotchas:
- since operator+ now returns a different object, this is not entirely source
compatible, and explicit conversions to O(U)String may need to be added
(e.g. '( a + "b" ).getStr()' -> 'OUString( a + "b" ).getStr()' ). If some of
those cases turn out to be too annoying, I can try harder to avoid them, but
some are unavoidable ( ?: operator being one of them and somewhat vexing).
However, patch 0005, which fixes all such issues in LO, is pretty small, so
this does not currently seem to be an issue (although that may be because the
idea of writing simple string-handling code may be catching on slowly).
- as it is template-based, error messages can get somewhat longer, but IMO
it's nothing horrific. Compilers with decent error reporting are
recommended :). Alternatively, a temporary "#define RTL_DISABLE_FAST_STRING"
at the top of the source file should help too.
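
The source-compatibility gotcha in the first item can be reproduced with a small toy, sketched here under stated assumptions: Text, TextPlusLiteral, viaGetStr() and pick() are hypothetical names, not the sal/ classes, and the conversion below concatenates naively since only the typing issue matters here. The lines that would fail to compile are left as comments.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>

// Toy stand-ins (hypothetical names, not the sal/ classes).
struct Text {
    std::string data;
    Text(const char* s) : data(s) {}
    explicit Text(const std::string& s) : data(s) {}
    const char* getStr() const { return data.c_str(); }
};

// Proxy for "Text + string literal"; the literal's size is part of the type.
template <std::size_t N>
struct TextPlusLiteral {
    const Text& left;
    const char (&right)[N];
    // The implicit conversion performs the actual concatenation.
    operator Text() const { return Text(left.data + right); }
};

template <std::size_t N>
TextPlusLiteral<N> operator+(const Text& l, const char (&r)[N]) {
    return TextPlusLiteral<N>{l, r};
}

// Gotcha 1: the proxy has no getStr(), so convert explicitly first.
inline std::string viaGetStr(const Text& a) {
    // return (a + "b").getStr();    // would not compile: no such member
    return Text(a + "b").getStr();   // explicit conversion, then the call
}

// Gotcha 2: the two branches of ?: have unrelated proxy types.
inline Text pick(bool longer, const Text& a) {
    // return longer ? a + "yy" : a + "x";   // would not compile
    static_assert(!std::is_same<decltype(a + "yy"), decltype(a + "x")>::value,
                  "each literal length yields a distinct proxy type");
    return longer ? Text(a + "yy") : Text(a + "x");
}
```

Wrapping each branch in an explicit Text(...) gives both operands of ?: the same type, which is the kind of change the first item describes.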
Still, I think it works pretty well.
[1] http://lists.freedesktop.org/archives/libreoffice/2011-November/021156.html
[2] http://lists.freedesktop.org/archives/libreoffice/2011-December/022323.html
--
Lubos Lunak
l.lunak at suse.cz
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-support-for-fast-O-U-String-concatenation-using-oper.patch
Type: text/x-diff
Size: 31519 bytes
Desc: not available
URL: <http://lists.freedesktop.org/archives/libreoffice/attachments/20121202/f3efadcd/attachment-0005.patch>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0002-comphelper-string-ConstAsciiString-support-for-fast-.patch
Type: text/x-diff
Size: 1514 bytes
Desc: not available
URL: <http://lists.freedesktop.org/archives/libreoffice/attachments/20121202/f3efadcd/attachment-0006.patch>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0003-tools-String-support-for-fast-operator.patch
Type: text/x-diff
Size: 2714 bytes
Desc: not available
URL: <http://lists.freedesktop.org/archives/libreoffice/attachments/20121202/f3efadcd/attachment-0007.patch>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0004-make-sure-uno-Any-works-with-fast-operator.patch
Type: text/x-diff
Size: 3113 bytes
Desc: not available
URL: <http://lists.freedesktop.org/archives/libreoffice/attachments/20121202/f3efadcd/attachment-0008.patch>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0005-fixes-for-where-fast-string-operator-is-not-perfectl.patch
Type: text/x-diff
Size: 44359 bytes
Desc: not available
URL: <http://lists.freedesktop.org/archives/libreoffice/attachments/20121202/f3efadcd/attachment-0009.patch>