[BUG] etnaviv: broken timeouts
Lucas Stach
l.stach at pengutronix.de
Mon Nov 6 13:40:05 UTC 2017
Hi Arnd,
Am Montag, den 06.11.2017, 13:42 +0100 schrieb Arnd Bergmann:
[...]
> As far as I can tell, the driver does not suffer from an overflow, since
>
> all the timeouts that get passed from user space are in
> CLOCK_MONOTONIC time, which never overflows a 32-bit seconds
> value unless you run the kernel uninterrupted for 136 years without
> rebooting.
>
> There is also no problem with the user space ABI, since the
> drm_etnaviv_wait_fence structure uses fixed-length fields
> rather than 'time_t' or 'struct timespec' that will get redefined by
> glibc to use 64-bit timestamps in the next few years. (thanks
> a lot to whoever was paying attention while defining the ioctls!)
>
> However, there are a few other problems with the code in question:
>
> - If I understand the earlier comments right, it makes the
> assumption that 'jiffies' and 'CLOCK_MONOTONIC_RAW' have
> the same time base. Generally speaking, 'jiffies' is an implementation
> detail of the kernel, and user space interfaces should not rely on that.
> If user space actually uses CLOCK_MONOTONIC, it would be
> subject to drift, for CLOCK_REALTIME we would never expire,
> and there is the problem of the INITIAL_JIFFIES constant that
> I think puts CLOCK_MONOTONIC_RAW 5 minutes ahead of
> jiffies.
> I think we should replace references to the global 'jiffies' variable
> in etnaviv with ktime_get_*() calls to make it safer. I don't actually
> recall if we apply the INITIAL_JIFFIES to CLOCK_MONTONIC,
> if not, then the timeouts are always 5 minutes off...
>
> - The etnaviv_timeout_to_jiffies() function doesn't work if someone
> passes a timeout that is more than LONG_MAX jiffies in the
> future, e.g. if someone was to pass tv_sec=S64_MAX from
> user space, there is a 50% chance that it returns 0, while the
> other 50% of time we get a random 32-bit number of jiffies
> to wait for, depending on how long the system has been running.
> The expected result would be to wait indefinitely.
>
> - Using 'timespec' in drivers is generally a bad idea because of the
> overflow that Russell mentioned. We have already converted
> hundreds of drivers to use either 'timespec64' or ktime_t instead,
> which is generally a safer option. I hope to get rid of all uses of
> timespec in the kernel at some point, and I see that as a prerequisite
> for shipping products that want to live beyond 2038.
>
> - less of a problem but still relevant here, my recommendation is
> to define new user space interfaces for timestamps and timeouts
> only in terms of 64-bit CLOCK_MONOTONIC nanoseconds,
> anything else mainly adds complexity but doesn't help much
> otherwise.
>
> I have prototyped a patch to address all of the above, will send
> that later today after doing a bit of build testing. I hope you can
> give it some testing, and clarify what the timebase is that
> user space expects to put into the timeout (MONOTONIC,
> MONOTONIC_RAW, or REALTIME).
Thanks. I don't think we spelled it out anywhere, but the timeouts are
supposed to be based on CLOCK_MONOTONIC.
Regards,
Lucas
More information about the etnaviv
mailing list