[BUG] etnaviv: broken timeouts

Mon Nov 6 17:05:10 UTC 2017

On Mon, Nov 6, 2017 at 5:18 PM, Russell King - ARM Linux
<linux at armlinux.org.uk> wrote:
> On Mon, Nov 06, 2017 at 04:42:56PM +0100, Arnd Bergmann wrote:
>>
>> > Note that the man page for clock_gettime() doesn't define what
>> > happens to CLOCK_MONOTONIC when it gets to the wrap point.  Neither
>> > does the open group spec issue 7.
>>
>> I'm pretty sure it's not really concerned an issue in practice:
>> CLOCK_REALTIME is guaranteed to wrap before CLOCK_MONOTONIC,
>> and we already rely on that not wrapping.
>
> I don't think you understood my point there properly.
>
> The starting point of CLOCK_MONOTONIC is undefined, so we can't really
> say "it'll always be zero or positive".
>
> With the userspace snippet I gave, if CLOCK_MONOTONIC returns a negative
> tv_sec (eg, because it starts off as a negative number), the
> drm_etnaviv_timespec gets a large 32-bit positive tv_sec which will be
> way in the future, far longer than is expected, even for a short timeout.

I guess I have to reverse my logic above to accommodate for this: we
can't allow any of the clocks to wrap around in practice, because that
would break all kinds of code in unexpected ways. By making
CLOCK_MONOTONIC start at the same value as CLOCK_TAI
or CLOCK_BOOTTIME or anything inbetween, we can be sure that it
doesn't wrap before CLOCK_REALTIME does, and we know when
that happens depending on the architecture specific time_t definition,
which in turn needs to be fixed before it wraps.

> This makes use of timespec64 end up with a timeout way into the future
> and timeouts are far longer (years) than they were intended (maybe
> milliseconds).
>
> So, what should a correct userspace snippet for this look like today
> given that we need to take a 32-bit signed timespec and calculate a
> timestamp for the expiry time and place it into a 64-bit unsigned
> timespec?

I would still argue that the code you posted is safe on Linux because
we won't arbitrarily change the time base for CLOCK_MONOTONIC
to anything that can wrap around in practice, based purely on
the idea that doing so would break not just etnaviv but all kinds of
other code.

In the code you have

foo(struct drm_etnaviv_timespec *ts, unsigned timeout)
{
         struct timespec now;

         clock_gettime(CLOCK_MONOTONIC, &now);

         ts->tv_sec = now.tv_sec + timeout / 1000;
         ts->tv_nsec = now.tv_nsec + (timeout % 1000) * 1000 * 1000;
         if (ts->tv_nsec > 1000 * 1000 * 1000) {
                 ts->tv_nsec -= 1000 * 1000 * 1000;
                 ts->tv_sec += 1;
         }
 }

we add at most 49 days worth of milliseconds, so that's good
based on my assumption above that CLOCK_MONOTONIC
can't overflow before 2038 (or 2085 if you assume it starts
at boot time).

Even if we were to add arbitrary 32-bit seconds (signed
or unsigned), it can still still work fine as long as we don't
degrade to 'timespec' in the kernel ;-)

       Arnd