Stephan Bergmann sbergman at redhat.com
Wed Feb 22 04:42:54 PST 2012

On 02/22/2012 11:25 AM, Michael Meeks wrote:
> 	Great ! :-) incidentally, I had one minor point around the ASCII vs.
> UTF-8 side; the rtl_string2UString (cf. sal/rtl/source/string.cxx) does
> a typically slower UTF-8 length counting loop; I suggest that we could
> do better performance wise (and we do create a biggish scad of these
> strings) by sticking with ascii, and doing a single, simple copy/expand
> of the string. Perhaps in a new rtl_uString_newFromAsciiL method.

Thinking about it again, the restriction to ASCII could become a 
hindrance in the longer run.  C++11 has provision for UTF-8 string 
literals (u8"..."), but they still have type char const[], so are not 
distinguishable from traditional plain "..." literals via function 
overloading.  So, if we ever wanted to extend the new facilities to also 
support UTF-8 string literals, but would want to keep the performance 
benefit for the ASCII-only case, we could not offer the same simple syntax 
for both.  One solution might be to go via an indirection

   template<std::size_t N> struct A { char const s[N]; };
   template<std::size_t N> struct U { char const s[N]; };

that encodes the knowledge whether the given string literal is ASCII or 
UTF-8, and have rtl::OUString ctors overloaded on those.  Of course, 
this would bring back ugly warts into client code.

And of course it would also work to syntactically optimize the ASCII 
case (as we would do now) and add the indirection only for the UTF-8 
case (at the expense of some ugly asymmetry).
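
To make the indirection a bit more concrete, here is a minimal sketch of 
how overloading on such wrappers could look.  Everything in it is 
hypothetical: Str merely stands in for rtl::OUString, the names AsciiLit, 
Utf8Lit, ascii() and utf8() are made up, the wrappers hold a reference to 
the literal instead of a copy, and the UTF-8 decoder assumes well-formed 
input.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical tag wrappers recording which encoding a literal uses.
template<std::size_t N> struct AsciiLit { char const (&s)[N]; };
template<std::size_t N> struct Utf8Lit  { char const (&s)[N]; };

// Hypothetical helpers so that N is deduced from the literal.
template<std::size_t N> AsciiLit<N> ascii(char const (&s)[N]) { return AsciiLit<N>{s}; }
template<std::size_t N> Utf8Lit<N>  utf8(char const (&s)[N])  { return Utf8Lit<N>{s}; }

// Stand-in for rtl::OUString (not the real class).
class Str {
public:
    // ASCII case: a single, simple copy/expand of each byte.
    template<std::size_t N> Str(AsciiLit<N> a) {
        data_.reserve(N - 1);
        for (std::size_t i = 0; i + 1 < N; ++i) {
            unsigned char c = static_cast<unsigned char>(a.s[i]);
            assert(c < 0x80);  // precondition: ASCII only
            data_.push_back(static_cast<char16_t>(c));
        }
    }

    // UTF-8 case: the slower decoding loop.
    template<std::size_t N> Str(Utf8Lit<N> u) { decode(u.s, N - 1); }

    std::u16string const& data() const { return data_; }

private:
    // Assumes well-formed UTF-8; no error handling.
    void decode(char const* p, std::size_t n) {
        for (std::size_t i = 0; i < n; ) {
            unsigned char c = static_cast<unsigned char>(p[i]);
            std::size_t len = c < 0x80 ? 1 : c < 0xE0 ? 2 : c < 0xF0 ? 3 : 4;
            char32_t cp = len == 1 ? c : c & (0xFF >> (len + 1));
            for (std::size_t k = 1; k < len; ++k)
                cp = (cp << 6) | (static_cast<unsigned char>(p[i + k]) & 0x3F);
            if (cp < 0x10000) {
                data_.push_back(static_cast<char16_t>(cp));
            } else {
                // Encode as a UTF-16 surrogate pair.
                cp -= 0x10000;
                data_.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
                data_.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
            }
            i += len;
        }
    }

    std::u16string data_;
};
```

Client code would then have to write Str(ascii("foo")) or 
Str(utf8("caf\xC3\xA9")), which is exactly the kind of wart mentioned 
above.  In the asymmetric variant, Str would additionally get a ctor taking 
a plain char const (&)[N] that is treated as ASCII, and only the utf8() 
wrapper would remain.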

Just some thoughts,
