optimising OUString for space

Michael Stahl mstahl at redhat.com
Mon Oct 1 05:06:37 PDT 2012


On 01/10/12 13:55, Noel Grandin wrote:
> 
> On 2012-10-01 13:47, Michael Stahl wrote:
>> ... which brings me to another point: in a hypothetical future when we 
>> could efficiently create a UTF8String from a string literal in C++ 
>> without copying the darn thing, what should hypothetical operations to 
>> mutate the string's buffer do?
> 
> We need external iterators that store the byte index of the current 
> code-point as part of their state.
> Update operations would then take that iterator as a parameter, so that 
> they'd know where to start mutating from.
> Then it becomes a case of carefully shuffling the correct number of 
> bytes around, bearing in mind that a unicode code-point may be anything 
> from one to four bytes long.
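
A minimal sketch of the external-iterator idea quoted above, assuming the string is held as valid UTF-8 in a std::string. The names utf8SeqLen, Utf8Pos, next and replaceAt are hypothetical and not part of any existing sal/rtl API.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Length in bytes of the UTF-8 sequence that starts with lead byte b.
// Assumes b is a valid lead byte, not a continuation byte.
inline std::size_t utf8SeqLen(unsigned char b)
{
    if (b < 0x80) return 1;            // 0xxxxxxx
    if ((b & 0xE0) == 0xC0) return 2;  // 110xxxxx
    if ((b & 0xF0) == 0xE0) return 3;  // 1110xxxx
    assert((b & 0xF8) == 0xF0);        // 11110xxx
    return 4;
}

// External iterator: its only state is the byte index of the current code point.
struct Utf8Pos
{
    std::size_t byteIndex = 0;
};

// Advance the iterator by one code point.
inline void next(const std::string& s, Utf8Pos& pos)
{
    assert(pos.byteIndex < s.size());
    pos.byteIndex += utf8SeqLen(static_cast<unsigned char>(s[pos.byteIndex]));
}

// Replace the code point at pos with the already-encoded bytes in `encoded`.
// std::string::replace shifts the tail when the old and new sequences
// differ in length.
inline void replaceAt(std::string& s, const Utf8Pos& pos, const std::string& encoded)
{
    assert(pos.byteIndex < s.size());
    const std::size_t oldLen =
        utf8SeqLen(static_cast<unsigned char>(s[pos.byteIndex]));
    s.replace(pos.byteIndex, oldLen, encoded);
}
```

Because the iterator carries the byte index, an update never has to rescan from the start of the string to find where the code point begins.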

that's also a problem, but i was thinking of another one: a string
literal lives in a write-protected memory mapping, so any write to its
buffer will cause a segfault.
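
One possible way around this, sketched under the assumption that the string records whether it still borrows the literal: copy the bytes to the heap on the first mutation. LitString and its members are hypothetical and do not reflect the real OUString or rtl_uString layout.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>

class LitString
{
    const char* m_data;               // points at the literal or at m_owned
    std::size_t m_len;
    std::unique_ptr<char[]> m_owned;  // empty while the literal is borrowed

public:
    // Borrow a string literal without copying; N counts the trailing NUL.
    template <std::size_t N>
    explicit LitString(const char (&lit)[N]) : m_data(lit), m_len(N - 1) {}

    // Copying is disabled to keep the sketch short.
    LitString(const LitString&) = delete;
    LitString& operator=(const LitString&) = delete;

    bool isBorrowed() const { return !m_owned; }
    const char* data() const { return m_data; }
    std::size_t size() const { return m_len; }

    // Mutators must never write through m_data while it points at the
    // literal, so the first write copies the bytes to a private buffer.
    void setChar(std::size_t i, char c)
    {
        assert(i < m_len);
        if (isBorrowed())
        {
            m_owned.reset(new char[m_len + 1]);
            std::memcpy(m_owned.get(), m_data, m_len + 1);
            m_data = m_owned.get();
        }
        m_owned[i] = c;
    }
};
```

Construction stays copy-free, and the cost of the copy is only paid by strings that are actually mutated.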

> Or are you talking about memory management?
> The current OUString class allocates a new character buffer for every 
> mutation, I assume we'd keep that strategy.

you mean if i have some string and then append "a", "b" and "c" to it,
it will re-allocate 3 times?  that is too expensive.  there needs to be
some protocol to ensure exclusive ownership of the buffer (which the
OUStringBuffer has automatically), and then whenever the buffer is out
of capacity, double the allocation.
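
A minimal sketch of that growth policy, assuming a byte buffer whose exclusive ownership is modelled with std::unique_ptr. GrowBuffer, the initial capacity of 16 and the reallocation counter are hypothetical choices for illustration; the real OUStringBuffer policy may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>
#include <utility>

class GrowBuffer
{
    std::unique_ptr<char[]> m_buf;  // sole owner of the allocation
    std::size_t m_len = 0;
    std::size_t m_cap = 0;
    std::size_t m_reallocs = 0;     // counted for illustration only

public:
    void append(const char* s)
    {
        const std::size_t n = std::strlen(s);
        if (n == 0) return;
        if (m_len + n > m_cap)
        {
            // Out of capacity: double until the new data fits.
            std::size_t newCap = m_cap ? m_cap : 16;
            while (newCap < m_len + n) newCap *= 2;
            std::unique_ptr<char[]> bigger(new char[newCap]);
            if (m_len) std::memcpy(bigger.get(), m_buf.get(), m_len);
            m_buf = std::move(bigger);
            m_cap = newCap;
            ++m_reallocs;
        }
        // The buffer is exclusively owned, so appending in place is safe.
        std::memcpy(m_buf.get() + m_len, s, n);
        m_len += n;
    }

    const char* data() const { return m_buf.get(); }
    std::size_t size() const { return m_len; }
    std::size_t capacity() const { return m_cap; }
    std::size_t reallocations() const { return m_reallocs; }
};
```

With this policy, appending "a", "b" and "c" to an empty buffer allocates once instead of three times, and the number of reallocations grows only logarithmically with the final length.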



