[BUG] etnaviv: broken timeouts

Mon Oct 23 14:57:39 UTC 2017

On Thu, Aug 31, 2017 at 02:49:27PM +0200, Lucas Stach wrote:
> Hi Russell,
> 
> Am Donnerstag, den 31.08.2017, 12:18 +0100 schrieb Russell King - ARM
> Linux:
> > I've just stumbled on a bug related to the way we handle fence
> > timeouts.
> > 
> > For DRM_ETNAVIV_WAIT_FENCE, we have:
> > 
> > struct drm_etnaviv_wait_fence {
> >         __u32 pipe;           /* in */
> >         __u32 fence;          /* in */
> >         __u32 flags;          /* in, mask of ETNA_WAIT_x */
> >         __u32 pad;
> >         struct drm_etnaviv_timespec timeout;   /* in */
> > };
> > 
> > where timeout is:
> > 
> > struct drm_etnaviv_timespec {
> >         __s64 tv_sec;          /* seconds */
> >         __s64 tv_nsec;         /* nanoseconds */
> > };
> > 
> > The timeout is with respect to the monotonic clock.  If the timeout is
> > specified far enough in the future, eg:
> > 
> > 9088652.2192296410 now 4793684.242296410
> > 
> > then rather than waiting, the function returns almost immediately with
> > ETIMEDOUT.  The requested timeout is equivalent to (uint32_t)~0
> > milliseconds.
> > 
> > In the kernel, we take the drm_etnaviv_timespec, and stick it into a
> > struct timespec via the TS() macro.  This gets passed to
> > etnaviv_gpu_wait_fence_interruptible(), which uses
> > etnaviv_timeout_to_jiffies() to convert to jiffies.  I suspect that
> > the conversion to jiffies in timespec_to_jiffies() results in a
> > jiffy value that time_after() believes to be before the current time,
> > resulting in ultimately a zero jiffy timeout.
> > 
> > Merely stracing the X server, or adding a fprintf() is enough to
> > avoid the problem.
> > 
> > If you hit this problem, you'll see "fence finish failed" in the Xorg
> > log.
> > 
> > I think doing the time_after() dance after converting to jiffies is
> > wrong: if we're going to have an API that accepts absolute time, then
> > we should handle times that are beyond the ability for us to schedule
> > the wait correctly.
> > 
> > It looks like other APIs that take a timespec or timeval (eg, ppoll(),
> > select(), pselect()) convert the timespec to a ktime value, which
> > limits to KTIME_MAX (see time*_to_ktime() and ktime_set()), which is
> > a much nicer behaviour than that which the etnaviv DRM driver is
> > currently giving us.
> > 
> Are you going to provide a patch for this, or should I take a look at
> fixing this?

Okay, the good news is that it's finally triggered again, and I now have
debugging in both the kernel and X server to work out what's going on.

There's a whole host of things wrong here.

Here's the debug from one event:

X: timed out: to=8868123.581486415 start=4573156.286486415 end=4573156.286602102

What this is telling me is that CLOCK_MONOTONIC was reporting a time of
4573156.286486415s, which we used as the basis for calculating the
timeout.  The requested timeout was VIV_WAIT_INDEFINITE (32-bit ~0 ms).
Hence, the calculated timeout was 8868123.581486415s.  However, the
kernel reported timeout, and CLOCK_MONOTONIC reported 4573156.286602102s
immediately after - barely 200us later.

>From the kernel side, we have:

[4573105.970932] timed out: timeout=8868123.581486415 jiffies=1143214071
 remaining=0 fence=1056827139:1056827138

The timeout target value corresponds.  This is a 250Hz kernel, so the
jiffies value corresponds with 1143214071/250s, or 4572856.284s.

The way etnaviv_timeout_to_jiffies() is currently written, it first
tries to convert the timeout to a number of jiffies.  The first step
in that is to convert the timeout to a timespec64, and then to jiffies
using timespec64_to_jiffies().

The maximum value that timespec64_to_jiffies() will convert is 4294966s
on my kernel, gathered by reading the disassembly:

00000268 <timespec64_to_jiffies>:
...
     274:       e1c020d0        ldrd    r2, [r0]
     278:       e3a01000        mov     r1, #0
...
     280:       e3080936        movw    r0, #35126      ; 0x8936
     284:       e3400041        movt    r0, #65 ; 0x41
     288:       e1510003        cmp     r1, r3
     28c:       01500002        cmpeq   r0, r2
     290:       2a000002        bcs     2a0 <timespec64_to_jiffies+0x38>
     294:       e30f0fb6        movw    r0, #65462      ; 0xffb6
     298:       e3430fff        movt    r0, #16383      ; 0x3fff
     29c:       e89dadf0        ldm     sp, {r4, r5, r6, r7, r8, sl, fp, sp, pc}

and this returns a maximum of 0x3fffffb6 jiffies, or 1073741750.
Since the requested timeout is past the maximum number of seconds,
we return this as "timeout_jiffies".

start_jiffies is 1143214071, and we can see immediately that
time_after(1143214071, 1073741750) is going to be true.

This causes remaining_jiffies to be zero, as reported in the debug
printk as "remaining".

What's more is that we can see that CLOCK_MONOTONIC reported about
4573156.286486415s, and the jiffies value was 300s in the past,
reporting 4572856.284s.

So, the calculations here are totally bogus.

I propose the following (build tested only) to fix this, but testing
it is going to be very problematical - it's taken 52 days to reproduce
the original problem.  So, the only sensible way to make sure that
this is correct is to get many eyes on it and review it thoroughly.

There's a multitude of different time functions, I just picked here
what looked the most sensible, so this may not be the best solution
here - maybe Thomas can advise.

The other thing to consider is what happens if the userspace
CLOCK_MONOTONIC value wraps - what should the driver behaviour be?
Converting the userspace absolute timeout figure to a timespec64
means that it'll be seen to be in the past, and this problem will
rear its head again.  Maybe we're better to sticking to the
normal timespec functions so that _our_ idea of monotonic time
wraps in unison, so the requested timeout continues to work -
although I'm not convinced timespec*() works across wraps.

Other things... it looks to me as if timespec64_to_jiffies() isn't
doing the right thing - why, on a HZ=250 kernel, does it limit
jiffies to (~0 / 1000)s and not (~0 / 250)s?  (I'm grateful that
it did, because if it allowed even twice that figure, I'd be
looking at a 104 day reproduction time for this issue.)

In any case, what this IMHO shows is that dealing with timeouts in
terms of tv_sec/tv_nsec values is frought with problems - and if
every driver is needing to open code some kind of absolute timeout
to jiffies conversion to use in the kernel's standard wait functions,
I can see many could fall into similar traps.  I think either we
need to discourage the use of tv_sec/tv_nsec absolute timeout APIs
or we need some way to help drivers get this calculation correct.

diff --git a/drivers/gpu/drm/etnaviv/etnaviv_drv.h b/drivers/gpu/drm/etnaviv/etnaviv_drv.h
index e41f38667c1c..8e575a8e98f6 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_drv.h
+++ b/drivers/gpu/drm/etnaviv/etnaviv_drv.h
@@ -27,6 +27,7 @@
 #include <linux/slab.h>
 #include <linux/list.h>
 #include <linux/iommu.h>
+#include <linux/time64.h>
 #include <linux/types.h>
 #include <linux/sizes.h>
 
@@ -143,19 +144,27 @@ static inline bool fence_after_eq(u32 a, u32 b)
 	return (s32)(a - b) >= 0;
 }
 
+/*
+ * Etnaviv timeouts are specified wrt CLOCK_MONOTONIC, not jiffies.
+ * We need to calculate the timeout in terms of number of jiffies
+ * between the specified timeout and the current CLOCK_MONOTONIC time.
+ */
 static inline unsigned long etnaviv_timeout_to_jiffies(
 	const struct timespec *timeout)
 {
-	unsigned long timeout_jiffies = timespec_to_jiffies(timeout);
-	unsigned long start_jiffies = jiffies;
-	unsigned long remaining_jiffies;
+	struct timespec64 ts, to;
+
+	to = timespec_to_timespec64(*timeout);
+
+	ktime_get_ts64(&ts);
+
+	/* timeouts before "now" have already expired */
+	if (timespec64_compare(&to, &ts) <= 0)
+		return 0;
 
-	if (time_after(start_jiffies, timeout_jiffies))
-		remaining_jiffies = 0;
-	else
-		remaining_jiffies = timeout_jiffies - start_jiffies;
+	ts = timespec64_sub(to, ts);
 
-	return remaining_jiffies;
+	return timespec64_to_jiffies(&ts);
 }
 
 #endif /* __ETNAVIV_DRV_H__ */


-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 8.8Mbps down 630kbps up
According to speedtest.net: 8.21Mbps down 510kbps up