o3tl::make_unsigned

Wed Jan 29 14:20:17 UTC 2020

On Wednesday 29 of January 2020, Stephan Bergmann wrote:
> On 29/01/2020 11:41, Luboš Luňák wrote:
> > On Wednesday 29 of January 2020, Stephan Bergmann wrote:
> >>     if (o3tl::make_unsigned(e1) < e2) ...    // (C)
> >>
> >> instead of (B), avoiding an explicit cast and making the intent clear.
> >> (o3tl::make_unsigned is defined "header-only", so can be used everywhere
> >> LIBO_INTERNAL_ONLY is defined.)
> >>
> >> The caveat is that e1 must be known to be non-negative (and
> >> o3tl::make_unsigned asserts that).
> >
> >   That is quite a caveat, to assume that such comparisons of such two
> > values never include one of them being negative. I think the problem is
> > already in
>
> There is no assumptions here?  o3tl::make_unsigned can be used iff its
> argument is known to be non-negative.

 Which is the assumption. Using o3tl::make_signed would not require such an 
assumption, or it would be even less likely false.

> (Which is quite often the case, 
> e.g., because there is a preceding
>
>    if (e1 >= 0) ...
>
> or because it is the result of OUString::getLength etc.)

 That still requires reading the code, and people may skip that and simply use 
the function because "that's how it's done" and then miss corner cases.

>
> > casting to unsigned, which silently can make valid comparisons change
> > result, your wrapper will just detect and abort on that. Technically it's
> > the
>
> As you note in the following sentence, there are cases where there is no
> real alternative to promoting the comparison to an unsigned type.

 Where? A signed type can always hold all values of an unsigned type, if large 
enough, but it's not true the other way around, so it's the signed types that 
should be preferred.

> > unsigned value that should be cast to signed, of higher resolution if
> > needed (which it often is not, given that these often come from things
> > like containers operating on size_t that never grow big enough to hold
> > values that do not fit plain int).

 This IOW says that if you cast the unsigned to signed, you're less likely to 
change the value than if you cast signed to unsigned. A negative value in 
these comparisons may be unlikely, but unsigned values with the highest bit 
used are extremely unlikely.

 For reference, both Java's and C#'s List classes use int for size and index 
types, and use long for file size and position types. Apparently it does the 
job.

 Reading my first mail I probably wasn't clear on what it is that I want, so 
let me clarify: What I'm saying is that there should be o3tl::make_signed() 
instead of make_unsigned(). That'll both move us towards the better types, 
and it'll be safer - it's less likely to assert on unsigned type having the 
highest bit set than on a signed type being negative, and it doesn't need 
analyzing the code the way >=0 does.

> >   The real fix to these problems is simply to use int, unless there's a
> > good reason to use something else. If the idea is to fix the codebase,
> > the idea may be just as well to fix it in a way that doesn't just paper
> > over. So while I like the idea of fixing this, I disagree with the
> > direction taken, both short-term and long-term.
>
> Often, the unsigned types come from APIs we do not control.

 That's unfortunate, but just because we can't be perfect that doesn't mean we 
can't be at least moving in that direction, instead of moving in the wrong 
direction that cannot even get close to being perfect. You can't write code 
without signed types, but you can without unsigned[*], and there will always 
be more signed variables, and converting unsigned->signed is safer => we 
should prefer signed whenever possible.

[*] Barring the bitfield etc. stuff that's not relevant here.

-- 
 Luboš Luňák
 l.lunak at collabora.com