Michael Meeks michael.meeks at suse.com
Wed Feb 22 06:02:56 PST 2012

On Wed, 2012-02-22 at 13:42 +0100, Stephan Bergmann wrote:
> So, if we ever wanted to extend the new facilities to also 
> support UTF-8 string literals, but would want to keep the performance 
> benefit for the ASCII-only case, we could not offer the same simple syntax

	Sure. On the other hand, we have:

git grep RTL_.*ASCII | wc -l

	of this sort of thing that we know are ASCII-only, and relatively fewer
UTF-8 strings (none that I can think of offhand).

> And of course it would also work to syntactically optimize the ASCII 
> case (as we would do now) and add the indirection only for the UTF-8 
> case (at the expense of some ugly asymmetry).

	I guess so. Of course, I like the idea of making UTF-8 a first-class
citizen in rtl::OUString-land - it would be nice not to worry so much
about odd-ball character encodings, and in most places simply to assume
that every char * is UTF-8.

	Of course, it would be even more wonderful if, with some
template-magic, we could generate static rtl_uString structures that
would end up in the .rodata section, and get heap-allocated only when
copied, with the UTF-8 -> UCS-2 conversion done during a (reasonably
fast) compile ;-) but that's a pipe-dream I suspect.
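
	For what it's worth, a rough sketch of that idea, assuming a compiler
with C++14-style relaxed constexpr (not something we can rely on today).
StaticUString, fromUtf8 and kCafe are made-up names for illustration; this
is not the real rtl_uString layout (no refcount), it produces UTF-16 rather
than strict UCS-2, and it does not reject overlong or surrogate encodings:

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for rtl_uString: a length plus an inline buffer.
// N code units always suffice, because UTF-16 never needs more code units
// than the UTF-8 form has bytes (and N also counts the trailing NUL).
template <std::size_t N>
struct StaticUString {
    std::int32_t length = 0;  // UTF-16 code units used, or -1 on bad input
    char16_t buffer[N] = {};  // zero-filled, so the result is NUL-terminated
};

// Convert a UTF-8 string literal to UTF-16 during constant evaluation.
template <std::size_t N>
constexpr StaticUString<N> fromUtf8(const char (&s)[N]) {
    StaticUString<N> out{};
    std::size_t i = 0;  // read position in s
    std::size_t o = 0;  // write position in out.buffer
    while (i + 1 < N) { // stop before the literal's trailing NUL
        const unsigned char c = static_cast<unsigned char>(s[i]);
        std::uint32_t cp = 0;   // decoded code point
        std::size_t extra = 0;  // number of continuation bytes expected
        if (c < 0x80)              { cp = c;        extra = 0; }
        else if ((c >> 5) == 0x06) { cp = c & 0x1F; extra = 1; }
        else if ((c >> 4) == 0x0E) { cp = c & 0x0F; extra = 2; }
        else if ((c >> 3) == 0x1E) { cp = c & 0x07; extra = 3; }
        else { out.length = -1; return out; }  // invalid lead byte
        if (i + extra + 1 >= N) { out.length = -1; return out; }  // truncated
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char cc = static_cast<unsigned char>(s[i + k]);
            if ((cc & 0xC0) != 0x80) { out.length = -1; return out; }
            cp = (cp << 6) | (cc & 0x3F);
        }
        i += extra + 1;
        if (cp >= 0x10000) {
            // Outside the BMP: emit a surrogate pair.
            cp -= 0x10000;
            out.buffer[o++] = static_cast<char16_t>(0xD800 + (cp >> 10));
            out.buffer[o++] = static_cast<char16_t>(0xDC00 + (cp & 0x3FF));
        } else {
            out.buffer[o++] = static_cast<char16_t>(cp);
        }
    }
    out.length = static_cast<std::int32_t>(o);
    return out;
}

// A namespace-scope constexpr object is constant-initialized; toolchains
// typically place it in read-only data, but that is up to the toolchain.
constexpr auto kCafe = fromUtf8("caf\xC3\xA9");  // "café" as UTF-8 bytes
static_assert(kCafe.length == 4, "four UTF-16 code units");
static_assert(kCafe.buffer[3] == 0x00E9, "last unit is U+00E9");
```

	The "heap allocated only when copied" half is not shown; that would
need the string class itself to recognise such static instances.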

	Perhaps better to slowly move entirely to UTF-8 strings anyway, which
brings us back to your proposal ;-) it is hard to see a benefit of UCS-2



michael.meeks at suse.com  <><, Pseudo Engineer, itinerant idiot
