Obsoleting RTL_CONSTASCII_USTRINGPARAM
Lubos Lunak
l.lunak at suse.cz
Thu Feb 9 09:16:46 PST 2012
On Thursday 09 of February 2012, Lubos Lunak wrote:
> On Thursday 09 of February 2012, Michael Meeks wrote:
> > Personally, I'd -love- someone to rename ~all of these
> > RTL_USTR() or RTL_STR() or somesuch - we have enough pointlessly hard to
> > read and indent coding around the place. Hey - we could even have an:
> >
> > RTL_USTRING("foo")
> >
> > that hid all the:
> >
> > rtl::OUString(RTL_CONSTASCII_USTRINGPARAM("foo"))
> >
> > madness from sight ;-)
>
> It'd be probably almost the same amount of work to get rid of the macro
> completely and add a special ctor to handle string literals (can you
> arrange a HackWeek for me soon ;) ? ).
Actually, no need for a week, this one is reasonably simple. Therefore I
challenge RTL_CONSTASCII_USTRINGPARAM to a fight to the death. Its death,
that is.
Attached is a patch that removes the need for RTL_CONSTASCII_USTRINGPARAM
(henceafter called 'macro', because it annoys me to write this even in its
death warrant) when using the OUString ctor. Assuming this code compiles also
with MSVC[*], the result should be at least as good as when using the macro,
except that it is not needed to use the macro.
Technically, the trick is to use "const char (&)[ N ]' , which ensures that
the parameter passed really will be a const char array, which either will be
a string literal, or a real const char array, which in turn is unlikely to be
anything else than a string literal. If OUString had also "const char*" ctor,
then compiler would actually prefer that one to instantiating the template,
but luckily that's not the case.
Note that the patch only removes the need of the macro for ctor calls, but
there can be further changes that allow string literals for other methods.
That would allow deprecating the C-style "overloads" fooAsciiL() and using
proper overloads.
There are few questions about this:
- Technical flaws: As can be seen in the unittest, the only problem is
having "const char [][ N ]" array, where the length for all literals will be
considered to be N-1 and therefore possibly wrong. This is however not used,
and SAL_N_ELEMENTS() (and thus the macro) has the same flaw, so I don't
consider this to be a problem in practice.
- OString is not included in the patch, as that one does have "const char*"
ctor. Since it is inline, it actually does not matter for binary
compatibility, so I can change this, but I didn't want to do this now before
I get some feedback. It should be later changed as well.
- The String class in tools is obsolete AFAIK, and so I do not see the need to
bother with that one.
- I added an explicit check that when converting from an 8-bit string there is
no \0 inside the string (i.e. that the length really matches), but the check
actually triggers a couple of times. One case is editeng's ImpEditEngine
ctor, where I'd expect the problem is real and somebody thought "\0xff"
was "\xff", but I don't know for sure. Other case is sal's
test_oustring_endswith.cxx , which intentionally creates OStrings with \0's
in them and even operates on them, which suggests that OString in fact does
not stop at \0's when handling strings. I'm kind of baffled by this, I
personally consider it a flaw that a string class does this, but the unittest
is very explicit about this. Does somebody know?
- The biggest question probably : I have not made the ctor explicit. That
means that the string literal to OUString conversion is automatic.
I however do not see anything wrong with automatic string literal to OUString
conversions. UI strings, AFAICS, are not included in .cxx files themselves,
but come from elsewhere, so sources contain only technical strings. Therefore
it is a very fair assumption that all string literals are ASCII, since
anything outside of ASCII would presumably need translation anyway, and the
few legal cases (math symbols not in ASCII maybe) can simply be explicit
about it (there will be a runtime warning if this is not the case, and since
it is a string literal, it should be noticed by the developer). Since
non-const data will not trigger this automatic conversion, the room for any
problems caused by the automatic conversion should be minimal and far
outweighted by the convenience. I mean, this way it can be possible to just
do things like
foo( "test", 10, "bar" ) or str = "<tag>"
instead of
foo( OUString( "test" ), 10, OUString( "bar" )) or str = OUString( "<tag>" )
Moreover in fact the explicit OUString doesn't really buy us anything, as
nobody says foo's arguments or 'str' really are OUString, so this is about as
good as Hungarian notation.
So while I can be talked into adding the explicit, I can't find any good
reason for doing that.
Comments?
[*] Could somebody please try if the patch compiles with MSVC? Just building
sal/ including the unittest should be enough.
--
Lubos Lunak
l.lunak at suse.cz
-------------- next part --------------
A non-text attachment was scrubbed...
Name: constascii_deprecate.patch
Type: text/x-diff
Size: 8303 bytes
Desc: not available
URL: <http://lists.freedesktop.org/archives/libreoffice/attachments/20120209/b85e17e0/attachment-0001.patch>
More information about the LibreOffice
mailing list