[BUG] etnaviv: broken timeouts

Mon Nov 6 10:35:01 UTC 2017

+CC Arnd, who seems to care about timeout and year 2038 issues.

Am Freitag, den 03.11.2017, 18:41 +0000 schrieb Russell King - ARM
Linux:
> On Fri, Nov 03, 2017 at 07:07:44PM +0100, Christian Gmeiner wrote:
> > Hi Russell
> > 
> > 
> > > On Wed, Oct 25, 2017 at 10:19:07AM +0100, Russell King - ARM
> > > Linux wrote:
> > > > On Mon, Oct 23, 2017 at 04:21:57PM +0100, Russell King - ARM
> > > > Linux wrote:
> > > > > On Mon, Oct 23, 2017 at 03:57:39PM +0100, Russell King - ARM
> > > > > Linux wrote:
> > > > > > On Thu, Aug 31, 2017 at 02:49:27PM +0200, Lucas Stach
> > > > > > wrote:
> > > > > > > Hi Russell,
> > > > > > > 
> > > > > > > Am Donnerstag, den 31.08.2017, 12:18 +0100 schrieb
> > > > > > > Russell King - ARM
> > > > > > > Linux:
> > > > > > > > I've just stumbled on a bug related to the way we
> > > > > > > > handle fence
> > > > > > > > timeouts.
> > > > > > > > 
> > > > > > > > For DRM_ETNAVIV_WAIT_FENCE, we have:
> > > > > > > > 
> > > > > > > > struct drm_etnaviv_wait_fence {
> > > > > > > >         __u32 pipe;           /* in */
> > > > > > > >         __u32 fence;          /* in */
> > > > > > > >         __u32 flags;          /* in, mask of
> > > > > > > > ETNA_WAIT_x */
> > > > > > > >         __u32 pad;
> > > > > > > >         struct drm_etnaviv_timespec timeout;   /* in */
> > > > > > > > };
> > > > > > > > 
> > > > > > > > where timeout is:
> > > > > > > > 
> > > > > > > > struct drm_etnaviv_timespec {
> > > > > > > >         __s64 tv_sec;          /* seconds */
> > > > > > > >         __s64 tv_nsec;         /* nanoseconds */
> > > > > > > > };
> > > > > > > > 
> > > > > > > > The timeout is with respect to the monotonic clock.  If
> > > > > > > > the timeout is
> > > > > > > > specified far enough in the future, eg:
> > > > > > > > 
> > > > > > > > 9088652.2192296410 now 4793684.242296410
> > > > > > > > 
> > > > > > > > then rather than waiting, the function returns almost
> > > > > > > > immediately with
> > > > > > > > ETIMEDOUT.  The requested timeout is equivalent to
> > > > > > > > (uint32_t)~0
> > > > > > > > milliseconds.
> > > > > > > > 
> > > > > > > > In the kernel, we take the drm_etnaviv_timespec, and
> > > > > > > > stick it into a
> > > > > > > > struct timespec via the TS() macro.  This gets passed
> > > > > > > > to
> > > > > > > > etnaviv_gpu_wait_fence_interruptible(), which uses
> > > > > > > > etnaviv_timeout_to_jiffies() to convert to jiffies.  I
> > > > > > > > suspect that
> > > > > > > > the conversion to jiffies in timespec_to_jiffies()
> > > > > > > > results in a
> > > > > > > > jiffy value that time_after() believes to be before the
> > > > > > > > current time,
> > > > > > > > resulting in ultimately a zero jiffy timeout.
> > > > > > > > 
> > > > > > > > Merely stracing the X server, or adding a fprintf() is
> > > > > > > > enough to
> > > > > > > > avoid the problem.
> > > > > > > > 
> > > > > > > > If you hit this problem, you'll see "fence finish
> > > > > > > > failed" in the Xorg
> > > > > > > > log.
> > > > > > > > 
> > > > > > > > I think doing the time_after() dance after converting
> > > > > > > > to jiffies is
> > > > > > > > wrong: if we're going to have an API that accepts
> > > > > > > > absolute time, then
> > > > > > > > we should handle times that are beyond the ability for
> > > > > > > > us to schedule
> > > > > > > > the wait correctly.
> > > > > > > > 
> > > > > > > > It looks like other APIs that take a timespec or
> > > > > > > > timeval (eg, ppoll(),
> > > > > > > > select(), pselect()) convert the timespec to a ktime
> > > > > > > > value, which
> > > > > > > > limits to KTIME_MAX (see time*_to_ktime() and
> > > > > > > > ktime_set()), which is
> > > > > > > > a much nicer behaviour than that which the etnaviv DRM
> > > > > > > > driver is
> > > > > > > > currently giving us.
> > > > > > > > 
> > > > > > > 
> > > > > > > Are you going to provide a patch for this, or should I
> > > > > > > take a look at
> > > > > > > fixing this?
> > > > > > 
> > > > > > Okay, the good news is that it's finally triggered again,
> > > > > > and I now have
> > > > > > debugging in both the kernel and X server to work out
> > > > > > what's going on.
> > > > > > 
> > > > > > There's a whole host of things wrong here.
> > > > > > 
> > > > > > Here's the debug from one event:
> > > > > > 
> > > > > > X: timed out: to=8868123.581486415 start=4573156.286486415
> > > > > > end=4573156.286602102
> > > > > > 
> > > > > > What this is telling me is that CLOCK_MONOTONIC was
> > > > > > reporting a time of
> > > > > > 4573156.286486415s, which we used as the basis for
> > > > > > calculating the
> > > > > > timeout.  The requested timeout was VIV_WAIT_INDEFINITE
> > > > > > (32-bit ~0 ms).
> > > > > > Hence, the calculated timeout was
> > > > > > 8868123.581486415s.  However, the
> > > > > > kernel reported timeout, and CLOCK_MONOTONIC reported
> > > > > > 4573156.286602102s
> > > > > > immediately after - barely 200us later.
> > > > > > 
> > > > > > From the kernel side, we have:
> > > > > > 
> > > > > > [4573105.970932] timed out: timeout=8868123.581486415
> > > > > > jiffies=1143214071
> > > > > >  remaining=0 fence=1056827139:1056827138
> > > > > > 
> > > > > > The timeout target value corresponds.  This is a 250Hz
> > > > > > kernel, so the
> > > > > > jiffies value corresponds with 1143214071/250s, or
> > > > > > 4572856.284s.
> > > > > > 
> > > > > > The way etnaviv_timeout_to_jiffies() is currently written,
> > > > > > it first
> > > > > > tries to convert the timeout to a number of jiffies.  The
> > > > > > first step
> > > > > > in that is to convert the timeout to a timespec64, and then
> > > > > > to jiffies
> > > > > > using timespec64_to_jiffies().
> > > > > > 
> > > > > > The maximum value that timespec64_to_jiffies() will convert
> > > > > > is 4294966s
> > > > > > on my kernel, gathered by reading the disassembly:
> > > > > > 
> > > > > > 00000268 <timespec64_to_jiffies>:
> > > > > > ...
> > > > > >      274:       e1c020d0        ldrd    r2, [r0]
> > > > > >      278:       e3a01000        mov     r1, #0
> > > > > > ...
> > > > > >      280:       e3080936        movw    r0, #35126      ;
> > > > > > 0x8936
> > > > > >      284:       e3400041        movt    r0, #65 ; 0x41
> > > > > >      288:       e1510003        cmp     r1, r3
> > > > > >      28c:       01500002        cmpeq   r0, r2
> > > > > >      290:       2a000002        bcs     2a0
> > > > > > <timespec64_to_jiffies+0x38>
> > > > > >      294:       e30f0fb6        movw    r0, #65462      ;
> > > > > > 0xffb6
> > > > > >      298:       e3430fff        movt    r0, #16383      ;
> > > > > > 0x3fff
> > > > > >      29c:       e89dadf0        ldm     sp, {r4, r5, r6,
> > > > > > r7, r8, sl, fp, sp, pc}
> > > > > > 
> > > > > > and this returns a maximum of 0x3fffffb6 jiffies, or
> > > > > > 1073741750.
> > > > > > Since the requested timeout is past the maximum number of
> > > > > > seconds,
> > > > > > we return this as "timeout_jiffies".
> > > > > > 
> > > > > > start_jiffies is 1143214071, and we can see immediately
> > > > > > that
> > > > > > time_after(1143214071, 1073741750) is going to be true.
> > > > > > 
> > > > > > This causes remaining_jiffies to be zero, as reported in
> > > > > > the debug
> > > > > > printk as "remaining".
> > > > > > 
> > > > > > What's more is that we can see that CLOCK_MONOTONIC
> > > > > > reported about
> > > > > > 4573156.286486415s, and the jiffies value was 300s in the
> > > > > > past,
> > > > > > reporting 4572856.284s.
> > > > > > 
> > > > > > So, the calculations here are totally bogus.
> > > > > > 
> > > > > > I propose the following (build tested only) to fix this,
> > > > > > but testing
> > > > > > it is going to be very problematical - it's taken 52 days
> > > > > > to reproduce
> > > > > > the original problem.  So, the only sensible way to make
> > > > > > sure that
> > > > > > this is correct is to get many eyes on it and review it
> > > > > > thoroughly.
> > > > > > 
> > > > > > There's a multitude of different time functions, I just
> > > > > > picked here
> > > > > > what looked the most sensible, so this may not be the best
> > > > > > solution
> > > > > > here - maybe Thomas can advise.
> > > > > > 
> > > > > > The other thing to consider is what happens if the
> > > > > > userspace
> > > > > > CLOCK_MONOTONIC value wraps - what should the driver
> > > > > > behaviour be?
> > > > > > Converting the userspace absolute timeout figure to a
> > > > > > timespec64
> > > > > > means that it'll be seen to be in the past, and this
> > > > > > problem will
> > > > > > rear its head again.  Maybe we're better to sticking to the
> > > > > > normal timespec functions so that _our_ idea of monotonic
> > > > > > time
> > > > > > wraps in unison, so the requested timeout continues to work
> > > > > > -
> > > > > > although I'm not convinced timespec*() works across wraps.
> > > > > > 
> > > > > > Other things... it looks to me as if
> > > > > > timespec64_to_jiffies() isn't
> > > > > > doing the right thing - why, on a HZ=250 kernel, does it
> > > > > > limit
> > > > > > jiffies to (~0 / 1000)s and not (~0 / 250)s?  (I'm grateful
> > > > > > that
> > > > > > it did, because if it allowed even twice that figure, I'd
> > > > > > be
> > > > > > looking at a 104 day reproduction time for this issue.)
> > > > > > 
> > > > > > In any case, what this IMHO shows is that dealing with
> > > > > > timeouts in
> > > > > > terms of tv_sec/tv_nsec values is frought with problems -
> > > > > > and if
> > > > > > every driver is needing to open code some kind of absolute
> > > > > > timeout
> > > > > > to jiffies conversion to use in the kernel's standard wait
> > > > > > functions,
> > > > > > I can see many could fall into similar traps.  I think
> > > > > > either we
> > > > > > need to discourage the use of tv_sec/tv_nsec absolute
> > > > > > timeout APIs
> > > > > > or we need some way to help drivers get this calculation
> > > > > > correct.
> > 
> > Thanks for this detailed analysis! I am always quite impressed by
> > your technical
> > write-ups.
> > 
> > > > > Note: although drm_etnaviv_timespec supports 64-bit timespec,
> > > > > there's
> > > > > no way to set it to a valid 64-bit timespec, since
> > > > > clock_gettime() in
> > > > > userspace for a 32-bit arch won't return 64-bit time.  So,
> > > > > while it
> > > > > looks like it would be sane to convert etnaviv's timespec
> > > > > pointers to
> > > > > timespec64, that solves nothing (in fact, I believe it makes
> > > > > it worse.)
> > > > > 
> > > > > I do think the patch below should not be using the 64-bit
> > > > > timespec
> > > > > until the last conversion to jiffies.
> > > > 
> > > > Here's the "normal" timespec version, which I think is
> > > > preferable to
> > > > the timespec64 version:
> > > > 
> > > > diff --git a/drivers/gpu/drm/etnaviv/etnaviv_drv.h
> > > > b/drivers/gpu/drm/etnaviv/etnaviv_drv.h
> > > > index e41f38667c1c..a435f2c7309a 100644
> > > > --- a/drivers/gpu/drm/etnaviv/etnaviv_drv.h
> > > > +++ b/drivers/gpu/drm/etnaviv/etnaviv_drv.h
> > > > @@ -143,19 +143,29 @@ static inline bool fence_after_eq(u32 a,
> > > > u32 b)
> > > >       return (s32)(a - b) >= 0;
> > > >  }
> > > > 
> > > > +/*
> > > > + * Etnaviv timeouts are specified wrt CLOCK_MONOTONIC, not
> > > > jiffies.
> > > > + * We need to calculate the timeout in terms of number of
> > > > jiffies
> > > > + * between the specified timeout and the current
> > > > CLOCK_MONOTONIC time.
> > > > + * Note: clock_gettime() is 32-bit on 32-bit arch.  Using 64-
> > > > bit
> > > > + * timespec math here just means that when a wrap occurs, the
> > > > + * specified timeout goes into the past and we can't request a
> > > > + * timeout in the future: IOW, the code breaks.
> > > > + */
> > > >  static inline unsigned long etnaviv_timeout_to_jiffies(
> > > >       const struct timespec *timeout)
> > > >  {
> > > > -     unsigned long timeout_jiffies =
> > > > timespec_to_jiffies(timeout);
> > > > -     unsigned long start_jiffies = jiffies;
> > > > -     unsigned long remaining_jiffies;
> > > > +     struct timespec ts;
> > > > +
> > > > +     ktime_get_ts(&ts);
> > > > +
> > > > +     /* timeouts before "now" have already expired */
> > > > +     if (timespec_compare(timeout, &ts) <= 0)
> > > > +             return 0;
> > > > 
> > > > -     if (time_after(start_jiffies, timeout_jiffies))
> > > > -             remaining_jiffies = 0;
> > > > -     else
> > > > -             remaining_jiffies = timeout_jiffies -
> > > > start_jiffies;
> > > > +     ts = timespec_sub(*timeout, ts);
> > > > 
> > > > -     return remaining_jiffies;
> > > > +     return timespec_to_jiffies(&ts);
> > > >  }
> > > > 
> > > >  #endif /* __ETNAVIV_DRV_H__ */
> > > > 
> > > > 
> > 
> > If you send this change as patch you can add my
> > Reviewed-by: Christian Gmeiner <christian.gmeiner at gmail.com>
> > 
> > Hopefully Lucas will pick it up your patch soon.
> 
> I would really like to hear from Thomas on it due to the problem
> that remains - that is that timespec_compare() will return negative
> if timeout->tv_sec is numerically less than ts.tv_sec.
> 
> So, when we get close to 2037, and our timeout overflows the wrap,
> timeout->tv_sec becomes negative, and is immediately numerically
> less than ts.tv_sec, so this function starts failing.
> 
> It's not just "current time close to 2038" - if we're reading the
> current time and adding (eg) (u32)~0 milliseconds to it (as my
> DDX driver does for "infinite") we'll start seeing problems about
> 50 days prior to the roll over.
> 
> Using the 64-bit math version in the kernel may help push the date
> up to the roll-over date, but doesn't solve the problem.
> 
> I think we have a few years before we need to be concerned about it,
> but I'd like to have a solution.
> 
> Kind'a feels like we need some way to ask the kernel for an
> drm_etnaviv_timespec that is (u64) X nsec into the future - that'd
> also eliminate the "which clock_gettime() clockid should we be using"
> question.

There is a bunch of drivers using absolute timeouts and we probably
need to deal with 32bit systems for a while still. I would really like
to see a solution for this which isn't specific to etnaviv, but adds a
core helper for those situations which has the blessing of the clock
guys. Hopefully Arnd or Thomas can provide some advise.