restrictness of strtoi(3bsd) and strtol(3)

Amol Surati suratiamol at gmail.com
Sun Dec 3 15:38:22 UTC 2023


Hello Alex,

On Sun, 3 Dec 2023 at 17:05, Alejandro Colomar <alx at kernel.org> wrote:
>
> Hello Amol,
>
> On Sun, Dec 03, 2023 at 04:29:07PM +0530, Amol Surati wrote:
> > The section "7. Library" at [1] has some information about the 'restrict'
> > keyword.
> >
> > I think the restrict keywords compel the programmer to keep the string
> > (or that portion of the string that strtol actually accesses) and the
> > pointer to a string in non-overlapping memory regions. Calling
> > strtol(p, &p, 0) should be well-defined in such cases.
>
> That would justify the restrict on char **restrict, but it doesn't
> justify the const char *restrict.
> I think a more appropriate prototype would be
>
>         long
>         strtol(const char *nptr, char **restrict endptr, int base);
>
> The above means that endptr points to memory that is not pointed to by
> anything else in this call.
Referring to the points you make later, removing the restrict-qualifier from
nptr then explicitly permits *endptr and nptr to alias, as neither of the two
pointers is restrict-qualified any longer.

>
> But any of the following is somewhere between confusing and a lie:
>
>         long
>         strtol(const char *nptr,
>                char *restrict *restrict endptr,
>                int base);
>
>         long
>         strtol(const char *restrict nptr,
>                char **restrict endptr,
>                int base);
>
>         long
>         strtol(const char *restrict nptr,
>                char *restrict *restrict endptr,
>                int base);
>
> These 3 from above all mean the same thing: nptr, endptr, and *endptr
> each point to a different memory region.  That's of course a lie,
> because nptr and *endptr may alias.  The formal definition by ISO C,
> which is in terms of accesses, seems to be compatible with these uses of
> restrict, because as long as the function doesn't access the memory, it
> doesn't matter if they overlap.  However, that definition of restrict is
> useless IMO, and still doesn't justify why the compiler isn't
> complaining at call site, where it can't know that strtol(3) won't
> use **endptr.
I think I understand. Since strtol is an external function, the compiler, when
compiling strtol(p, &p, 0), has enough information, in the form of the strtol
prototype and the call to it, to warn that nptr and *endptr may alias in a way
that triggers undefined behaviour.

Based on how I understood the latest draft n3096.pdf, it is the write to a
char through *endptr (along with a read of that char through nptr) that
triggers the violation of the 'restrict' clause. The read and write need not
be in a particular order. No major compiler warns, though, as evidenced by
the example at https://godbolt.org/z/a4xza5xna
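
For reference, here is a minimal sketch of the calling pattern in question.
It is not the code behind the godbolt link; sum_longs is a hypothetical
helper made up for illustration. It passes the same pointer as nptr and,
by address, as endptr, which is the strtol(p, &p, 0) idiom discussed above.

```c
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not from any libc): sum the whitespace-separated
 * integers in s, advancing with the strtol(p, &p, 0) idiom.
 */
long
sum_longs(const char *s)
{
	/* strtol hands back a char *, so the const is dropped here. */
	char *p = (char *)s;
	long total = 0;

	for (;;) {
		const char *start = p;
		long v;

		errno = 0;
		/* nptr and *endptr hold the same address on entry. */
		v = strtol(p, &p, 0);
		if (p == start)		/* nothing was converted: stop */
			break;
		if (errno == ERANGE)	/* value out of range: stop */
			break;
		total += v;
	}
	return total;
}
```

With base 0, sum_longs("10 0x10 010") adds decimal 10, hexadecimal 0x10 and
octal 010. The call only reads the string and only writes the pointer p, so
it stays within what the access-based definition allows.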
------
What sort of optimizations can a strtol implementation hope to achieve?
A couple of libcs discard the restrict qualifier when calling their handlers
for strtol. The situation with strtol doesn't seem to be similar to that with
memcpy-memmove.

It seems that, as long as strtol does not assign a value to **endptr, it
continues to adhere to the standard.
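
To illustrate, here is a rough sketch of a strtol-like function that keeps
to that rule. toy_strtol is a hypothetical name, and the function is heavily
simplified (base 10 only, no overflow or errno handling); it is not how any
particular libc implements strtol. The characters are only ever read, and
the single store goes to the pointer object *endptr, never to **endptr.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical, simplified parser: base 10 only, no overflow checks.
 * It carries the same restrict qualifiers as the standard prototype.
 */
long
toy_strtol(const char *restrict nptr, char **restrict endptr)
{
	const char *s = nptr;
	long val = 0;
	int neg = 0;
	int any = 0;

	/* Reads only: skip leading whitespace and an optional sign. */
	while (isspace((unsigned char)*s))
		s++;
	if (*s == '+' || *s == '-')
		neg = (*s++ == '-');

	/* Reads only: accumulate decimal digits. */
	while (isdigit((unsigned char)*s)) {
		val = val * 10 + (*s - '0');
		s++;
		any = 1;
	}

	/*
	 * The only store: the pointer object *endptr is updated.
	 * No character is written through **endptr.
	 */
	if (endptr != NULL)
		*endptr = (char *)(any ? s : nptr);

	return neg ? -val : val;
}
```

Called as toy_strtol(p, &p), the string is never modified, and the variable
p is modified only through endptr, so no object is both modified and
accessed through two different restrict-qualified pointers.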

The historical docs point towards a decision to stamp the prototype with
restrict under the assumption that (1) the string and the pointer to string
are in disjoint memory locations, and (2) the implementations would
use endptr for nothing other than maintaining a position in the given
string.

-Amol
>
> Cheers,
> Alex
>
> --
> <https://www.alejandro-colomar.es/>

