Renaming sal_Unicode to a less misleading name?

Mon Feb 15 16:57:25 UTC 2016

On 02/15/2016 04:37 PM, Norbert Thiebaud wrote:
> On Mon, Feb 15, 2016 at 2:30 AM, Stephan Bergmann <sbergman at redhat.com> wrote:
>> On 02/13/2016 04:21 PM, Norbert Thiebaud wrote:
>>> On Sat, Feb 13, 2016 at 6:17 AM, Khaled Hosny <khaledhosny at eglug.org>
>>> wrote:
>>>>
>>>> I count only ~7000 usages across the code base, so that is not such a
>>>> huge task.
>>>
>>> Internally it is doable, externally that is more of a problem, since
>>> sal_Unicode is part of the stable external API.
>>> The best you can do is to have an internal 'alias' for it.
>>
>>
>> Or the worst, considering that you then confusingly have two names for the
>> same concept.
>
> are you confused by the uint<n>_t typedefs?
> they all 'alias' an existing type.

I meant that you then have both sal_Unicode (which is part of the stable
URE interface, so not easily removed) and sal_whatnot, and need to
explain why there are two of them, etc.

And I think that's just not worth the hassle.  It's not too uncommon to
have a somewhat poorly named type to represent UTF-16 code units (e.g.,
char in Java), so why not just throw up our hands in retrospective
despair and shrug it off?
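
To illustrate why a name suggesting "a Unicode character" is misleading for
a UTF-16 code unit, here is a minimal sketch.  The helper countCodePoints is
hypothetical (it is not part of sal or any LibreOffice API) and assumes
well-formed UTF-16 input:

```cpp
#include <cassert>  // for the assert() calls in the usage shown below
#include <cstddef>
#include <string>

// Hypothetical helper, for illustration only: counts Unicode code points
// in a UTF-16 string, assuming the input is well-formed.
std::size_t countCodePoints(const std::u16string& s) {
    std::size_t n = 0;
    for (char16_t c : s) {
        // A low (trailing) surrogate continues the previous code point,
        // so only count units outside the range 0xDC00..0xDFFF.
        if (c < 0xDC00 || c > 0xDFFF) {
            ++n;
        }
    }
    return n;
}
```

For a string such as u"a\U0001F600", size() reports three code units while
countCodePoints reports two code points, e.g.
assert(countCodePoints(u"a\U0001F600") == 2).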

>> Trying to encode semantic differences (like between sal_utf16be and sal_utf16le) requires discipline
> indeed
>
>> and when it starts to go sour it's probably worse than not trying to make the distinction in the type system in the first place
>
> yeah, I mentioned the le/be variant to be 'complete', I will certainly
> concede that it would likely be overkill.
> still, having sal_utf16, sal_utf32 and even sal_utf8 would not hurt,
> especially compared to sal_Unicode, sal_Int32, sal_Char respectively

Instead of introducing yet more typedefs, we will increasingly move to
the C++11 types char16_t and char32_t (that move has already started:
sal_Unicode is now a typedef for char16_t in LIBO_INTERNAL code on
non-WNT platforms).
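
A minimal sketch of what such a typedef means in practice.  The name
example_Unicode is a hypothetical stand-in, not the real sal_Unicode
definition (which is platform- and configuration-dependent); the point is
only that a typedef introduces no new type:

```cpp
#include <cassert>  // for the assert() call in the usage shown below
#include <string>
#include <type_traits>

// Assumption: a stand-in for the internal, non-Windows configuration
// described above; this is not the actual LibreOffice header content.
typedef char16_t example_Unicode;

// The alias and char16_t are one and the same type.
static_assert(std::is_same<example_Unicode, char16_t>::value,
              "a typedef does not create a distinct type");

// Because the alias is char16_t itself, u"..." literals and
// std::u16string can be used with it without any casts.
inline std::u16string makeGreeting() {
    const example_Unicode text[] = u"hi";
    return std::u16string(text);
}
```

With this, assert(makeGreeting() == u"hi") holds, and code written against
char16_t interoperates directly with code written against the alias.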