RTL_CONSTASCII_USTRINGPARAM: cleanup wanted?
Stephan Bergmann
sbergman at redhat.com
Wed Feb 22 04:42:54 PST 2012
On 02/22/2012 11:25 AM, Michael Meeks wrote:
> Great ! :-) incidentally, I had one minor point around the ASCII vs.
> UTF-8 side; the rtl_string2UString (cf. sal/rtl/source/string.cxx) does
> a typically slower UTF-8 length counting loop; I suggest that we could
> do better performance wise (and we do create a biggish scad of these
> strings) by sticking with ascii, and doing a single, simple copy/expand
> of the string. Perhaps in a new rtl_uString_newFromAsciiL method.
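For illustration, a minimal sketch of the difference being described. This is not the actual sal code: the helper names widenAscii and utf16UnitsForUtf8 are hypothetical, std::u16string stands in for rtl_uString, and the input is assumed to be well-formed. The ASCII path is one copy/expand loop with a known output length, while a UTF-8 path needs a counting pass of roughly this kind before it can allocate.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for the proposed ASCII-only path: the output length
// equals the input length, so one allocation and one widening loop suffice.
inline std::u16string widenAscii(char const * s, std::size_t len) {
    std::u16string out(len, u'\0');
    for (std::size_t i = 0; i != len; ++i) {
        out[i] = static_cast<char16_t>(static_cast<unsigned char>(s[i]));
    }
    return out;
}

// Hypothetical counting pass a UTF-8 conversion needs first (assumes valid
// UTF-8): every non-continuation byte starts one code point, and a 4-byte
// sequence needs a second UTF-16 unit for the surrogate pair.
inline std::size_t utf16UnitsForUtf8(char const * s, std::size_t len) {
    std::size_t units = 0;
    for (std::size_t i = 0; i != len; ++i) {
        unsigned char b = static_cast<unsigned char>(s[i]);
        if ((b & 0xC0) != 0x80) {
            ++units;
        }
        if (b >= 0xF0) {
            ++units;
        }
    }
    return units;
}
```

The per-byte branching in the counting loop, plus the second decoding pass that would follow it, is the extra cost the ASCII-only path avoids.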
Thinking about it again, the restriction to ASCII could become a
hindrance in the longer run. C++11 has provision for UTF-8 string
literals (u8"..."), but they still have type char const[], so are not
distinguishable from traditional plain "..." literals via function
overloading. So, if we ever wanted to extend the new facilities to also
support UTF-8 string literals while keeping the performance benefit for
the ASCII-only case, we could not offer the same simple syntax
  rtl::OUString("foo");
  rtl::OUString(u8"I\u2764C++");
for both. One solution might be to go via an indirection
  template<std::size_t N> struct A { char const s[N]; };
  template<std::size_t N> struct U { char const s[N]; };
that encodes the knowledge whether the given string literal is ASCII or
UTF-8, and have rtl::OUString ctors overloaded on those. Of course,
this would bring back ugly warts into client code
  rtl::OUString(rtl::A("foo"));
  rtl::OUString(rtl::U(u8"I\u2764C++"));
And of course it would also work to syntactically optimize the ASCII
case (as we would do now) and add the indirection only for the UTF-8
case (at the expense of some ugly asymmetry).
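To make the indirection concrete, here is a rough sketch under several assumptions; it is not a proposal for the real rtl::OUString interface. The namespace sketch, the String class, and the wrapper types AsciiLiteral and Utf8Literal are hypothetical. A() and U() are written as helper functions that deduce N, so the call syntax matches the examples above; the wrappers hold a reference to the literal rather than a copy. The UTF-8 decoder does no validation, and the UTF-8 literal in the usage note is spelled as raw bytes because C++20 changes the type of u8"..." literals to char8_t.

```cpp
#include <cstddef>
#include <string>

namespace sketch {

// Hypothetical tag wrappers: same payload, different type, so constructors
// can be overloaded on the encoding of the literal.
template<std::size_t N> struct AsciiLiteral { char const (&s)[N]; };
template<std::size_t N> struct Utf8Literal { char const (&s)[N]; };

// Helper functions deduce N from the literal (N includes the trailing NUL).
template<std::size_t N> AsciiLiteral<N> A(char const (&s)[N]) {
    return AsciiLiteral<N>{s};
}
template<std::size_t N> Utf8Literal<N> U(char const (&s)[N]) {
    return Utf8Literal<N>{s};
}

// Hypothetical stand-in for rtl::OUString, storing UTF-16 code units.
class String {
public:
    // Fast path: ASCII widens one byte to one code unit.
    template<std::size_t N> explicit String(AsciiLiteral<N> lit) {
        data_.reserve(N - 1);
        for (std::size_t i = 0; i + 1 < N; ++i) {
            data_.push_back(static_cast<char16_t>(
                static_cast<unsigned char>(lit.s[i])));
        }
    }

    // Slow path: decode UTF-8 (assumed well-formed, no validation).
    template<std::size_t N> explicit String(Utf8Literal<N> lit) {
        for (std::size_t i = 0; i + 1 < N;) {
            unsigned char b = static_cast<unsigned char>(lit.s[i]);
            std::size_t len = b < 0x80 ? 1 : b < 0xE0 ? 2 : b < 0xF0 ? 3 : 4;
            char32_t cp = len == 1
                ? static_cast<char32_t>(b)
                : static_cast<char32_t>(b & (0xFF >> (len + 1)));
            for (std::size_t k = 1; k < len && i + k + 1 < N; ++k) {
                cp = (cp << 6)
                    | (static_cast<unsigned char>(lit.s[i + k]) & 0x3F);
            }
            i += len;
            if (cp < 0x10000) {
                data_.push_back(static_cast<char16_t>(cp));
            } else {
                // Encode as a surrogate pair.
                cp -= 0x10000;
                data_.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
                data_.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
            }
        }
    }

    std::u16string const & utf16() const { return data_; }

private:
    std::u16string data_;
};

} // namespace sketch
```

With this, sketch::String(sketch::A("foo")) takes the cheap widening path, and sketch::String(sketch::U("I\xE2\x9D\xA4" "C++")) takes the decoding path. The asymmetric variant would add a plain char const (&)[N] constructor that behaves like the ASCII overload and keep only U() as the explicit marker.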
Just some thoughts,
Stephan