Efficient string concatenation

Lubos Lunak l.lunak at suse.cz
Mon Dec 3 06:12:29 PST 2012


> > On Sun, Dec 2, 2012 at 4:56 PM, Lubos Lunak <l.lunak at suse.cz> wrote:
> >>  The work is based on threads [1] and [2] and occasionally seeing in
> >> the commits that people doing string cleanups sometimes change ugly code
> >> to only slightly less ugly code. With the new feature enabled, any
> >> string concatenation/creation is simply done as (well, ok, the number()
> >> part is not done yet, but shouldn't be difficult to add):
> >>
> >> OUString s = foo + bar + "baz" + OUString::number( many ) + "whatever";
> >>
> >> All the other alternatives, like explicit OUStringBuffer and repeated
> >> append(), should now be worse in every respect.
>
> What is the recommended way to deal with
> for(xxxx)
> {
>       sString += foo;
> }
>
> OUStringBuffer is still the way to go in that case right ?

 Yes. I was referring above to the practice of chaining append() calls in 
one expression or in several consecutive statements. If the string really 
has to be built step by step, then it still needs to be done this way; I 
don't see any good way around it (except for folding OUStringBuffer 
functionality into OUString one day).
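
 To illustrate the step-by-step case, here is a minimal sketch written with 
std::string as a stand-in for OUStringBuffer, so that it compiles outside of 
LibreOffice. The helper name joinAll() is made up for this example and is not 
part of any patch.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: builds one string from many pieces inside a loop.
// std::string plays the role of OUStringBuffer here.
std::string joinAll(const std::vector<std::string>& parts)
{
    // First pass: compute the final length so the buffer is sized once.
    std::size_t total = 0;
    for (const std::string& p : parts)
        total += p.size();

    std::string buffer;
    buffer.reserve(total);      // reserve room for the whole result up front

    // Second pass: append piece by piece; no reallocation is needed now.
    for (const std::string& p : parts)
        buffer += p;

    assert(buffer.size() == total);
    return buffer;
}
```

 The point is only that a loop cannot be folded into a single concatenation 
expression, so a mutable buffer remains the natural tool for it.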

 But thinking about it, I should add support for this fast concatenation to 
operator+=/append().

On Monday 03 of December 2012, Norbert Thiebaud wrote:
> On Sun, Dec 2, 2012 at 4:56 PM, Lubos Lunak <l.lunak at suse.cz> wrote:
> +        char* end = c.addData( buffer->buffer );
> +        buffer->length = end - buffer->buffer;
> +        pData = buffer;
> +        nCapacity = l + 16;
> ^^^^ how does that work ? you allocate l-bytes but declare a capacity
> of l + 16 ????

 OUStringBuffer always allocates 16 extra characters initially. See 
e.g. the default ctor.
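
 As a rough illustration of that convention (capacity is the initial length 
plus 16), here is a small self-contained sketch. SketchBuffer, kExtraCapacity 
and appendWithoutGrowing() are names invented for this example; it is not the 
real OUStringBuffer code, and it assumes the allocation and the declared 
capacity are kept in step.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Assumed headroom, mirroring the "16 extra characters" described above.
constexpr std::size_t kExtraCapacity = 16;

// Hypothetical miniature buffer, not the real OUStringBuffer.
class SketchBuffer
{
public:
    SketchBuffer(const char* text, std::size_t length)
        : storage_(length + kExtraCapacity + 1, '\0')   // +1 for the terminator
        , length_(length)
    {
        assert(text != nullptr || length == 0);
        if (length != 0)
            std::memcpy(storage_.data(), text, length);
    }

    std::size_t length() const { return length_; }

    // Number of characters that fit without reallocating.
    std::size_t capacity() const { return storage_.size() - 1; }

    // Appends one character; returns false if that would require growing.
    bool appendWithoutGrowing(char c)
    {
        if (length_ == capacity())
            return false;
        storage_[length_++] = c;
        return true;
    }

private:
    std::vector<char> storage_;
    std::size_t length_;
};
```

 With this layout, a buffer created from a 5-character string reports a 
capacity of 21 and accepts 16 more characters before it would have to grow.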

On Monday 03 of December 2012, Norbert Thiebaud wrote:
> +    OStringBuffer( const OStringConcat< T1, T2 >& c )
> +    {
> +        const int l = c.length();
> +        rtl_String* buffer = NULL;
> +        rtl_string_new_WithLength( &buffer, l );
> +        char* end = c.addData( buffer->buffer );
> ^^^
> here the buffer is not 0-terminated...

 It is 0-terminated because of the 0-memset in the _new_WithLength() function. 
But given your comment below that the memset should not be necessary, it makes 
sense to add the terminator explicitly.
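
 A minimal sketch of the idea, for illustration only: a concat object reports 
its total length(), addData() copies the pieces and returns the end pointer, 
and the terminator is then written explicitly instead of relying on a 
zero-filled allocation. Piece, Concat, concat() and materialize() are 
hypothetical names for this example and differ from the real OStringConcat 
code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>
#include <string>

// Hypothetical leaf: a pointer plus a length.
struct Piece
{
    const char* data;
    std::size_t size;

    std::size_t length() const { return size; }

    // Copies the characters to 'out' and returns the position after them.
    char* addData(char* out) const
    {
        std::memcpy(out, data, size);
        return out + size;
    }
};

// Hypothetical node representing "left + right" without building a string yet.
template <typename T1, typename T2>
struct Concat
{
    T1 left;
    T2 right;

    std::size_t length() const { return left.length() + right.length(); }
    char* addData(char* out) const { return right.addData(left.addData(out)); }
};

template <typename T1, typename T2>
Concat<T1, T2> concat(const T1& l, const T2& r)
{
    return Concat<T1, T2>{l, r};
}

// Allocates once, fills the buffer, and terminates it explicitly.
template <typename T1, typename T2>
std::string materialize(const Concat<T1, T2>& c)
{
    const std::size_t l = c.length();
    std::unique_ptr<char[]> raw(new char[l + 1]);   // not zero-filled
    char* end = c.addData(raw.get());
    assert(end == raw.get() + l);
    *end = '\0';                                    // explicit terminator
    return std::string(raw.get());                  // relies on the terminator
}
```

 Because the terminator is written after addData(), the result stays correct 
even if the allocation function stops clearing the buffer.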

> vaguely related... since we are talking about performance... why
> *_new_WithLength() in strtmpl.cxx is doing a memset on the whole newly
> allocated buffer...

 Good question. But the string stuff does more things that are not very smart.

-- 
 Lubos Lunak
 l.lunak at suse.cz

