RTL_CONSTASCII_USTRINGPARAM: cleanup wanted?

Lubos Lunak l.lunak at suse.cz
Wed Feb 22 05:56:12 PST 2012


On Wednesday 22 of February 2012, Stephan Bergmann wrote:
> On 02/22/2012 11:25 AM, Michael Meeks wrote:
> > 	Great ! :-) incidentally, I had one minor point around the ASCII vs.
> > UTF-8 side; the rtl_string2UString (cf. sal/rtl/source/string.cxx) does
> > a typically slower UTF-8 length counting loop; I suggest that we could
> > do better performance wise (and we do create a biggish scad of these
> > strings) by sticking with ascii, and doing a single, simple copy/expand
> > of the string. Perhaps in a new rtl_uString_newFromAsciiL method.

 Actually rtl_string2UString() is reasonably optimized for the case when the 
data is ASCII or UTF-8-that-in-fact-is-ASCII, so the one loop analysing the 
contents is the only overhead. Makes me wonder if avoiding that one loop is 
really worth it. I'll go with 'no' for the time being, until somebody shows 
me otherwise.
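
 To illustrate the kind of fast path meant here, a minimal sketch follows. It
is not the actual code from sal/rtl/source/string.cxx; the names
isPlainAscii() and asciiToUtf16() are made up for this example, and
std::u16string merely stands in for the real rtl_uString buffer. One loop
scans the bytes, and if all of them are below 0x80 the data is widened
byte by byte.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of sal): the single analysing loop.
// Returns true when every byte is below 0x80, i.e. the data is plain ASCII.
inline bool isPlainAscii(const char* data, std::size_t len)
{
    for (std::size_t i = 0; i < len; ++i)
        if (static_cast<unsigned char>(data[i]) >= 0x80)
            return false;
    return true;
}

// Hypothetical conversion with an ASCII fast path. Returns false when the
// input contains non-ASCII bytes; a real converter would fall back to full
// UTF-8 decoding at that point instead of giving up.
inline bool asciiToUtf16(const char* data, std::size_t len, std::u16string& out)
{
    if (!isPlainAscii(data, len))
        return false;
    out.clear();
    out.reserve(len);
    for (std::size_t i = 0; i < len; ++i)
        out.push_back(static_cast<char16_t>(static_cast<unsigned char>(data[i])));
    return true;
}
```

 In this sketch the scan is the only extra work on top of the copy, which is
the overhead being weighed above.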

> Thinking about it again, the restriction to ASCII could become a
> hindrance in the longer run.  C++11 has provision for UTF-8 string
> literals (u8"..."), but they still have type char const[], so are not
> distinguishable from traditional plain "..." literals via function
> overloading.  So, if we ever wanted to extend the new facilities to also
> support UTF-8 string literals, but would want to keep the performance
> benefit for the ASCII-only case, we could not offer the same simple syntax
>
>    rtl::OUString("foo");
>    rtl::OUString(u8"I\u2764C++");
>
> for both.

 We could have OUString::fromUtf8( utf8literal ), which I consider acceptable, 
especially given that IMO we are unlikely to have a large number of UTF-8 
literals anyway. But I think it's better to always go for UTF-8 and optimize 
only if we find out it's worth it.
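
 A rough sketch of what the split API could look like, assuming a stand-in
class named Text (hypothetical, not OUString) backed by std::u16string. The
literal constructor assumes ASCII and just widens each char, while
fromUtf8() is the explicit entry point that decodes. The decoder is
deliberately minimal and does not reject overlong encodings or surrogate
code points.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

class Text // hypothetical stand-in for a UTF-16 string class
{
public:
    // Literal constructor: assumes the literal is ASCII-only and widens
    // each char without checking; the trailing '\0' is skipped.
    template <std::size_t N>
    Text(const char (&literal)[N])
    {
        for (std::size_t i = 0; i + 1 < N; ++i)
            units_.push_back(
                static_cast<char16_t>(static_cast<unsigned char>(literal[i])));
    }

    // Explicit entry point for UTF-8 data.
    static Text fromUtf8(const std::string& utf8);

    const std::u16string& units() const { return units_; }

private:
    Text() {}
    std::u16string units_;
};

inline Text Text::fromUtf8(const std::string& utf8)
{
    Text result;
    std::size_t i = 0;
    while (i < utf8.size())
    {
        unsigned char lead = static_cast<unsigned char>(utf8[i]);
        std::size_t extra; // number of continuation bytes
        char32_t cp;       // code point being assembled
        if (lead < 0x80)                { extra = 0; cp = lead; }
        else if ((lead & 0xE0) == 0xC0) { extra = 1; cp = lead & 0x1F; }
        else if ((lead & 0xF0) == 0xE0) { extra = 2; cp = lead & 0x0F; }
        else if ((lead & 0xF8) == 0xF0) { extra = 3; cp = lead & 0x07; }
        else throw std::invalid_argument("invalid UTF-8 lead byte");
        if (i + extra >= utf8.size())
            throw std::invalid_argument("truncated UTF-8 sequence");
        for (std::size_t k = 1; k <= extra; ++k)
        {
            unsigned char c = static_cast<unsigned char>(utf8[i + k]);
            if ((c & 0xC0) != 0x80)
                throw std::invalid_argument("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (c & 0x3F);
        }
        i += extra + 1;
        if (cp > 0x10FFFF)
            throw std::invalid_argument("code point out of range");
        if (cp >= 0x10000)
        {
            // Encode as a UTF-16 surrogate pair.
            cp -= 0x10000;
            result.units_.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            result.units_.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
        else
            result.units_.push_back(static_cast<char16_t>(cp));
    }
    return result;
}
```

 Usage would then be Text("foo") for the common case and
Text::fromUtf8("I\xE2\x9D\xA4" "C++") for the rare UTF-8 literal (the bytes
are spelled out here instead of using u8"..." so the example does not depend
on the literal's type).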

 I thought there could be a way to test string literal contents at 
compile time, but string literals are not considered compile-time constants, 
simply because the standard says so, and therefore templates can't take them 
as arguments. I eventually found a way to do it, based on 
http://www.macieira.org/blog/2011/07/initialising-an-array-with-cx0x-using-constexpr-and-variadic-templates/ 
(see attachment), but it turns out to be unusable in practice. Maybe later.
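
 For reference, a much simpler sketch than the attachment (and not taken from
it): a C++11 constexpr function can inspect a literal's contents when the
literal is written directly at the point of use. The name isAsciiLiteral()
is made up for this example.

```cpp
#include <cstddef>

// Hypothetical helper: true when every char of the literal (including the
// terminating '\0') is below 0x80. Written as a single return statement so
// that it is a valid C++11 constexpr function.
template <std::size_t N>
constexpr bool isAsciiLiteral(const char (&s)[N], std::size_t i = 0)
{
    return i == N
        ? true
        : (static_cast<unsigned char>(s[i]) < 0x80 && isAsciiLiteral(s, i + 1));
}

// Evaluated by the compiler, because the literal appears in the expression.
static_assert(isAsciiLiteral("foo"), "plain literal is ASCII");
static_assert(!isAsciiLiteral("\xE2\x9D\xA4"), "UTF-8 bytes are detected");
```

 The catch is that this only works where the literal itself is part of the
constant expression. Once the literal has been passed to a constructor as a
parameter, the parameter is no longer a constant expression, so the
constructor cannot use the result to pick a different code path at compile
time.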

-- 
 Lubos Lunak
 l.lunak at suse.cz
-------------- next part --------------
A non-text attachment was scrubbed...
Name: compile_time_analysis.cpp
Type: text/x-c++src
Size: 1962 bytes
Desc: not available
URL: <http://lists.freedesktop.org/archives/libreoffice/attachments/20120222/2f455679/attachment.cpp>

