[PATCH wayland] test: Add test showing blocking problem when updating timers
Andrew Wedgbury
andrew.wedgbury at realvnc.com
Fri Apr 25 03:36:57 PDT 2014
I've noticed a blocking problem in Wayland's event-loop code when updating
timer event sources. The problem occurs if you update the timer at a point
after is has expired, but before it has been dispatched, i.e. from an event
callback that happens during the same epoll wakeup.
When the timer is subsequently dispatched, wl_event_source_timer_dispatch
blocks for the duration of the new timeout in its call to read() from the
timer fd (which is the expected behaviour according to the man page for
timerfd_settime).
This isn't too uncommon a scenario - for example, a socket with an associated
timeout timer. You'd typically want to update the timer when reading from the
socket. This is how I noticed the issue, since I was setting a timeout of
1 minute, and saw my server blocking for this duration!
The following patch adds a (currently failing) test case to Wayland's
event-loop-test.c. It demonstrates the problem using two timers, which are
set to expire at the same time. The first timer to receive its expiry
callback updates the other timer with a much larger timeout, which then
causes the test to block for this timeout before calling the second timer's
callback.
As for a fix, I'm not so sure (which is why I thought I'd post the failing
test case first to show what I mean). I notice that it doesn't actually do
anything with the value read from the timerfd socket, which gives the number
of times the timer expired since the last read, or when the timer was last
updated (which blocks if the timer hasn't yet expired). I believe this value
should always read as 1 anyway, since we don't use periodic timers.
A simple fix would be to use the TFD_NONBLOCK option when creating the
timerfd, ensuring that the read call won't block. We'd then have to ignore
the case when the read returns EAGAIN.
---
tests/event-loop-test.c | 74 +++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 74 insertions(+)
diff --git a/tests/event-loop-test.c b/tests/event-loop-test.c
index f250ed0..1218163 100644
--- a/tests/event-loop-test.c
+++ b/tests/event-loop-test.c
@@ -25,6 +25,7 @@
#include <assert.h>
#include <unistd.h>
#include <signal.h>
+#include <sys/time.h>
#include "wayland-private.h"
#include "wayland-server.h"
@@ -199,6 +200,79 @@ TEST(event_loop_timer)
wl_event_loop_destroy(loop);
}
+struct timer_update_context {
+ struct wl_event_source *source1, *source2;
+ int count;
+};
+
+static int
+timer_update_callback_1(void *data)
+{
+ struct timer_update_context *context = data;
+
+ context->count++;
+ wl_event_source_timer_update(context->source2, 1000);
+ return 1;
+}
+
+static int
+timer_update_callback_2(void *data)
+{
+ struct timer_update_context *context = data;
+
+ context->count++;
+ wl_event_source_timer_update(context->source1, 1000);
+ return 1;
+}
+
+TEST(event_loop_timer_updates)
+{
+ struct wl_event_loop *loop = wl_event_loop_create();
+ struct wl_event_source *source;
+ struct timer_update_context context;
+ struct timeval start_time, end_time, interval;
+
+ /* Create two timers that should expire at the same time (after 10ms).
+ * The first timer to receive its expiry callback updates the other timer
+ * with a much larger timeout (1s). This highlights a bug where
+ * wl_event_source_timer_dispatch would block for this larger timeout
+ * when reading from the timer fd, before calling the second timer's
+ * callback.
+ */
+
+ context.source1 = wl_event_loop_add_timer(loop, timer_update_callback_1,
+ &context);
+ assert(context.source1);
+ wl_event_source_timer_update(context.source1, 10);
+
+ context.source2 = wl_event_loop_add_timer(loop, timer_update_callback_2,
+ &context);
+ assert(context.source2);
+ wl_event_source_timer_update(context.source2, 10);
+
+ context.count = 0;
+
+ gettimeofday(&start_time, NULL);
+ wl_event_loop_dispatch(loop, 20);
+ gettimeofday(&end_time, NULL);
+
+ assert(context.count == 2);
+
+ /* Dispatching the events should not have taken much more than 20ms,
+ * since this is the timeout passed to wl_event_loop_dispatch. If it
+ * blocked, then it will have taken over 1s.
+ * Of course, it could take over 1s anyway on a very slow or heavily
+ * loaded system, so this test isn't 100% perfect.
+ */
+
+ timersub(&end_time, &start_time, &interval);
+ assert(interval.tv_sec < 1);
+
+ wl_event_source_remove(context.source1);
+ wl_event_source_remove(context.source2);
+ wl_event_loop_destroy(loop);
+}
+
struct event_loop_destroy_listener {
struct wl_listener listener;
int done;
--
1.9.2
More information about the wayland-devel
mailing list