[BUG] etnaviv: broken timeouts

Lucas Stach l.stach at pengutronix.de
Mon Nov 6 13:40:05 UTC 2017


Hi Arnd,

Am Montag, den 06.11.2017, 13:42 +0100 schrieb Arnd Bergmann:
[...]
> As far as I can tell, the driver does not suffer from an overflow, since
> 
> all the timeouts that get passed from user space are in
> CLOCK_MONOTONIC time, which never overflows a 32-bit seconds
> value unless you run the kernel uninterrupted for 136 years without
> rebooting.
> 
> There is also no problem with the user space ABI, since the
> drm_etnaviv_wait_fence structure uses fixed-length fields
> rather than 'time_t' or 'struct timespec' that will get redefined by
> glibc to use 64-bit timestamps in the next few years. (thanks
> a lot to whoever was paying attention while defining the ioctls!)
> 
> However, there are a few other problems with the code in question:
> 
> - If I understand the earlier comments right, it makes the
>   assumption that 'jiffies' and  'CLOCK_MONOTONIC_RAW'  have
>   the same time base. Generally speaking, 'jiffies' is an implementation
>   detail of the kernel, and user space interfaces should not rely on that.
>   If user space actually uses CLOCK_MONOTONIC, it would be
>   subject to drift, for CLOCK_REALTIME we would never expire,
>   and there is the problem of the INITIAL_JIFFIES constant that
>   I think puts CLOCK_MONOTONIC_RAW 5 minutes ahead of
>   jiffies.
>   I think we should replace references to the global 'jiffies' variable
>   in etnaviv with ktime_get_*() calls to make it safer.  I don't actually
>   recall if we apply the INITIAL_JIFFIES to CLOCK_MONTONIC,
>   if not, then the timeouts are always 5 minutes off...
> 
> - The etnaviv_timeout_to_jiffies() function doesn't work if someone
>   passes a timeout that is more than LONG_MAX jiffies in the
>   future, e.g. if someone was to pass tv_sec=S64_MAX from
>   user space, there is a 50% chance that it returns 0, while the
>   other 50% of time we get a random 32-bit number of jiffies
>   to wait for, depending on how long the system has been running.
>   The expected result would be to wait indefinitely.
> 
> - Using 'timespec' in drivers is generally a bad idea because of the
>   overflow that Russell mentioned. We have already converted
>   hundreds of drivers to use either 'timespec64' or ktime_t instead,
>   which is generally a safer option. I hope to get rid of all uses of
>   timespec in the kernel at some point, and I see that as a prerequisite
>   for shipping products that want to live beyond 2038.
> 
> - less of a problem but still relevant here, my recommendation is
>   to define new user space interfaces for timestamps and timeouts
>   only in terms of 64-bit CLOCK_MONOTONIC nanoseconds,
>   anything else mainly adds complexity but doesn't help much
>   otherwise.
> 
> I have prototyped a patch to address all of the above, will send
> that later today after doing a bit of build testing. I hope you can
> give it some testing, and clarify what the timebase is that
> user space expects to put into the timeout (MONOTONIC,
> MONOTONIC_RAW, or REALTIME).

Thanks. I don't think we spelled it out anywhere, but the timeouts are
supposed to be based on CLOCK_MONOTONIC.

Regards,
Lucas


More information about the etnaviv mailing list