Obsoleting RTL_CONSTASCII_USTRINGPARAM

Stephan Bergmann sbergman at redhat.com
Fri Feb 10 01:34:39 PST 2012


On 02/09/2012 06:16 PM, Lubos Lunak wrote:
>   Attached is a patch that removes the need for RTL_CONSTASCII_USTRINGPARAM
> (henceafter called 'macro', because it annoys me to write this even in its
> death warrant) when using the OUString ctor. Assuming this code compiles also
> with MSVC[*], the result should be at least as good as when using the macro,
> except that it is not needed to use the macro.

We tried this back in the day, like maybe ca. 2000, and it did not work 
back then.  Can't remember the details---hopefully it was just old MSVC 
that failed to get it and the MSVCs we rely on today have become better.

> - OString is not included in the patch, as that one does have "const char*"
> ctor. Since it is inline, it actually does not matter for binary
> compatibility, so I can change this, but I didn't want to do this now before
> I get some feedback. It should be later changed as well.

I would assume there's a large number of implicit and explicit uses of 
OString(char const *) with non-literal arguments.

> - I added an explicit check that when converting from an 8-bit string there is
> no \0 inside the string (i.e. that the length really matches), but the check
> actually triggers a couple of times. One case is editeng's ImpEditEngine
> ctor, where I'd expect the problem is real and somebody thought "\0xff"
> was "\xff", but I don't know for sure. Other case is sal's
> test_oustring_endswith.cxx , which intentionally creates OStrings with \0's
> in them and even operates on them, which suggests that OString in fact does
> not stop at \0's when handling strings. I'm kind of baffled by this, I
> personally consider it a flaw that a string class does this, but the unittest
> is very explicit about this. Does somebody know?

I'm kind of baffled by string classes that treat NUL specially.  :)  I 
consider the K&R approach of misusing one of the legitimate member 
values as sentinel as a bad mistake.

>   I however do not see anything wrong with automatic string literal to OUString
> conversions. UI strings, AFAICS, are not included in .cxx files themselves,
> but come from elsewhere, so sources contain only technical strings. Therefore
> it is a very fair assumption that all string literals are ASCII, since
> anything outside of ASCII would presumably need translation anyway, and the
> few legal cases (math symbols not in ASCII maybe) can simply be explicit
> about it (there will be a runtime warning if this is not the case, and since
> it is a string literal, it should be noticed by the developer).

We could change the ctor from using _ASCII_US to _UTF8.  That way, the 
occasional non-ASCII literal could be included as \xXX\xXX\xXX.  This is 
IMO orthogonal to the question of explicit-ness.

>   Moreover in fact the explicit OUString doesn't really buy us anything, as
> nobody says foo's arguments or 'str' really are OUString, so this is about as
> good as Hungarian notation.

Don't understand the above paragraph.

>   So while I can be talked into adding the explicit, I can't find any good
> reason for doing that.

Apart from the argument given by Caolán (which suggests we'd probably 
need to go with explicit, anyway), my gut feeling is that it would be 
better to keep things explicit here.  There's already too many function 
overloads and implicit conversions around wrt to strings to not get the 
bad feeling that each additional one will subtly change some glorious 
code somewhere.

> +    /**
> +     * This overload exists only to avoid creating instances directly from (non-const) char[],
> +     * which would otherwise be picked up by the optimized const char[] constructor.
> +     * Since the non-const array cannot be guaranteed to contain only ASCII characters,
> +     * this needs to be prevented.
> +     *
> +     * It is an error to try to call this overload.
> +     *
> +     * @internal
> +     */
> +    template< int N >

nit: having to pick among int, std::size_t, sal_Int32 for the type of N, 
I would probably use std::size_t

> +    OUString( char (&value)[ N ] )
> +    {
> +#ifndef RTL_STRING_UNITTEST
> +        error_non_const_char_to_OUString_conversion_is_not_automatic( value );
> +#else
> +        (void) value; // unused
> +        pData = 0; // for the unittest create an empty string
> +        rtl_uString_new( &pData );
> +#endif
> +    }

I would turn this into just a declaration, with missing definition, for 
the non-RTL_STRING_UNITTEST case.  (Unless we make the other ctor 
explicit, in which case I would think this one can go away completely.)

Stephan


More information about the LibreOffice mailing list