[systemd-devel] [PATCH 1/3] shared: add generic IPC barrier
Tom Gundersen
teg at jklm.no
Sun Jul 13 04:30:17 PDT 2014
Couple of random nitpicks below.
On Sun, Jul 13, 2014 at 12:37 PM, David Herrmann <dh.herrmann at gmail.com> wrote:
> The "Barrier" object is a simple inter-process barrier implementation. It
> allows placing synchronization points and waiting for the other side to
> reach it. Additionally, it has an abortion-mechanism as second-layer
> synchronization to send abortion-events asynchronously to the other side.
>
> The API is usually used to synchronize processes during fork(). However,
> it can be extended to pass state through execve() so you could synchronize
> beyond execve().
>
> Usually, it's used like this (error-handling replaced by assert() for
> simplicity):
>
> Barrier b;
>
> r = barrier_init(&b);
> assert_se(r >= 0);
>
> pid = fork();
> assert_se(pid >= 0);
> if (pid == 0) {
> barrier_set_role(&b, BARRIER_CHILD);
>
> ...do child post-setup...
> if (CHILD_SETUP_FAILED)
> exit(1);
> ...child setup done...
>
> barrier_place(&b);
> if (!barrier_sync(&b)) {
> /* parent setup failed */
> exit(1);
> }
>
> barrier_destroy(&b); /* redundant as execve() and exit() imply this */
>
> /* parent & child setup successful */
> execve(...);
> }
>
> barrier_set_role(&b, BARRIER_PARENT);
>
> ...do parent post-setup...
> if (PARENT_SETUP_FAILED) {
> barrier_abort(&b); /* send abortion event */
> barrier_wait_abortion(&b); /* wait for child to abort (exit() implies abortion) */
> barrier_destroy(&b);
> ...bail out...
> }
> ...parent setup done...
>
> barrier_place(&b);
> if (!barrier_sync(&b)) {
> ...child setup failed... ;
> barrier_destroy(&b);
> ...bail out...
> }
>
> barrier_destroy(&b);
>
> ...child setup successfull...
>
> This is the most basic API. Using barrier_place() to place barriers and
> barrier_sync() to perform a full synchronization between both processes.
> barrier_abort() places an abortion barrier which superceeds any other
> barriers, exit() (or barrier_destroy()) places an abortion-barrier that
> queues behind existing barriers (thus *not* replacing existing barriers
> unlike barrier_abort()).
>
> This example uses hard-synchronization with wait_abortion(), sync() and
> friends. These are all optional. Barriers are highly dynamic and can be
> used for one-way synchronization or even no synchronization at all
> (postponing it for later). The sync() call performs a full two-way
> synchronization.
>
> The API is documented and should be fairly self-explanatory. A test-suite
> shows some special semantics regarding abortion, wait_next() and exit().
>
> Internally, barriers use two eventfds and a pipe. The pipe is used to
> detect exit()s of the remote side as eventfds do not allow that. The
> eventfds are used to place barriers, one for each side. Barriers itself
> are numbered, but the numbers are reused once both sides reached the same
> barrier, thus you cannot address barriers by the index. Moreover, the
> numbering is implicit and we only store a counter. This makes the
> implementation itself very lightweight, which is probably negligible
> considering that we need 3 FDs for a barrier..
>
> Last but not least: This barrier implementation is quite heavy. It's
> definitely not meant for fast IPC synchronization. However, it's very easy
> to use. And given the *HUGE* overhead of fork(), the barrier-overhead
> should be negligible.
> ---
> .gitignore | 1 +
> Makefile.am | 9 +
> src/shared/barrier.c | 442 ++++++++++++++++++++++++++++++++++++++++++++++
> src/shared/barrier.h | 97 ++++++++++
> src/test/test-barrier.c | 460 ++++++++++++++++++++++++++++++++++++++++++++++++
> 5 files changed, 1009 insertions(+)
> create mode 100644 src/shared/barrier.c
> create mode 100644 src/shared/barrier.h
> create mode 100644 src/test/test-barrier.c
>
> diff --git a/.gitignore b/.gitignore
> index 31cd8f8..5289f0e 100644
> --- a/.gitignore
> +++ b/.gitignore
> @@ -123,6 +123,7 @@
> /tags
> /test-architecture
> /test-async
> +/test-barrier
> /test-boot-timestamp
> /test-bus-chat
> /test-bus-cleanup
> diff --git a/Makefile.am b/Makefile.am
> index 2b1484f..039a83e 100644
> --- a/Makefile.am
> +++ b/Makefile.am
> @@ -828,6 +828,8 @@ libsystemd_shared_la_SOURCES = \
> src/shared/login-shared.h \
> src/shared/ring.c \
> src/shared/ring.h \
> + src/shared/barrier.c \
> + src/shared/barrier.h \
> src/shared/async.c \
> src/shared/async.h \
> src/shared/eventfd-util.c \
> @@ -1238,6 +1240,7 @@ tests += \
> test-ellipsize \
> test-util \
> test-ring \
> + test-barrier \
> test-tmpfiles \
> test-namespace \
> test-date \
> @@ -1408,6 +1411,12 @@ test_ring_SOURCES = \
> test_ring_LDADD = \
> libsystemd-core.la
>
> +test_barrier_SOURCES = \
> + src/test/test-barrier.c
> +
> +test_barrier_LDADD = \
> + libsystemd-core.la
> +
> test_tmpfiles_SOURCES = \
> src/test/test-tmpfiles.c
>
> diff --git a/src/shared/barrier.c b/src/shared/barrier.c
> new file mode 100644
> index 0000000..e7b4ead
> --- /dev/null
> +++ b/src/shared/barrier.c
> @@ -0,0 +1,442 @@
> +/*-*- Mode: C; c-basic-offset: 8; indent-tabs-mode: nil -*-*/
> +
> +/***
> + This file is part of systemd.
> +
> + Copyright 2014 David Herrmann <dh.herrmann at gmail.com>
> +
> + systemd is free software; you can redistribute it and/or modify it
> + under the terms of the GNU Lesser General Public License as published by
> + the Free Software Foundation; either version 2.1 of the License, or
> + (at your option) any later version.
> +
> + systemd is distributed in the hope that it will be useful, but
> + WITHOUT ANY WARRANTY; without even the implied warranty of
> + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> + Lesser General Public License for more details.
> +
> + You should have received a copy of the GNU Lesser General Public License
> + along with systemd; If not, see <http://www.gnu.org/licenses/>.
> +***/
> +
> +#include <errno.h>
> +#include <fcntl.h>
> +#include <limits.h>
> +#include <poll.h>
> +#include <stdbool.h>
> +#include <stdint.h>
> +#include <stdio.h>
> +#include <stdlib.h>
> +#include <string.h>
> +#include <sys/eventfd.h>
> +#include <sys/types.h>
> +#include <unistd.h>
> +
> +#include "barrier.h"
> +#include "macro.h"
> +#include "util.h"
> +
> +/**
> + * Barriers
> + * This barrier implementation provides a simple synchronization method based
> + * on file-descriptors that can safely be used between threads and processes. A
> + * barrier object contains 2 shared counters based on eventfd. Both processes
> + * can now place barriers and wait for the other end to reach a random or
> + * specific barrier.
> + * Barriers are numbered, so you can either wait for the other end to reach any
> + * barrier or the last barrier that you placed. This way, you can use barriers
> + * for one-way *and* full synchronization. Note that even-though barriers are
> + * numbered, these numbers are internal and recycled once both sides reached the
> + * same barrier (implemented as a simple signed counter). It is thus not
> + * possible to address barriers by their ID.
> + *
> + * Barrier-API: Both ends can place as many barriers via barrier_place() as
> + * they want and each pair of barriers on both sides will be implicitly linked.
> + * Each side can use the barrier_wait/sync_*() family of calls to wait for the
> + * other side to place a specific barrier. barrier_wait_next() waits until the
> + * other side calls barrier_place(). No links between the barriers are
> + * considered and this simply serves as most basic asynchronous barrier.
> + * barrier_sync_next() is like barrier_wait_next() and waits for the other side
> + * to place their next barrier via barrier_place(). However, it only waits for
> + * barriers that are linked to a barrier we already placed. If the other side
> + * already placed more barriers than we did, barrier_sync_next() returns
> + * immediately.
> + * barrier_sync() extends barrier_sync_next() and waits until the other end
> + * placed as many barriers via barrier_place() as we did. If they already placed
> + * as many as we did (or more), it returns immediately.
> + *
> + * Additionally to basic barriers, an abortion event is available.
> + * barrier_abort() places an abortion event that cannot be undone. An abortion
> + * immediately cancels all placed barriers and replaces them. Any running and
> + * following wait/sync call besides barrier_wait_abortion() will immediately
> + * return false on both sides (otherwise, they always return true).
> + * barrier_abort() can be called multiple times on both ends and will be a
> + * no-op if already called on this side.
> + * barrier_wait_abortion() can be used to wait for the other side to call
> + * barrier_abort() and is the only wait/sync call that does not return
> + * immediately if we aborted outself. It only returns once the other side
> + * called barrier_abort().
> + *
> + * Barriers can be used for in-process and inter-process synchronization.
> + * However, for in-process synchronization you could just use mutexes.
> + * Therefore, main target is IPC and we require both sides to *not* share the FD
> + * table. If that's given, barriers provide target tracking: If the remote side
> + * exit()s, an abortion event is implicitly queued on the other side. This way,
> + * a sync/wait call will be woken up if the remote side crashed or exited
> + * unexpectedly. However, note that these abortion events are only queued if the
> + * barrier-queue has been drained. Therefore, it is safe to place a barrier and
> + * exit. The other side can safely wait on the barrier even though the exit
> + * queued an abortion event. Usually, the abortion event would overwrite the
> + * barrier, however, that's not true for exit-abortion events. Those are only
> + * queued if the barrier-queue is drained (thus, the receiving side has placed
> + * more barriers than the remote side).
> + */
> +
> +/**
> + * barrier_init() - Initialize a barrier object
> + * @obj: barrier to initialize
> + *
> + * This initializes a barrier object. The caller is responsible of allocating
> + * the memory and keeping it valid. The memory does not have to be zeroed
> + * beforehand.
> + * Two eventfd objects are allocated for each barrier. If allocation fails, an
> + * error is returned.
> + *
> + * If this function fails, the barrier is reset to an invalid state so it is
> + * safe to call barrier_destroy() on the object regardless whether the
> + * initialization succeeded or not.
> + *
> + * The caller is responsible to destroy the object via barrier_destroy() before
> + * releasing the underlying memory.
> + *
> + * Returns: 0 on success, negative error code on failure.
> + */
> +int barrier_init(Barrier *obj) {
> + _barrier_destroy_ptr_ Barrier *b = NULL;
> + int r;
> +
> + assert_return(obj, -EINVAL);
> +
> + b = obj;
> + zero(*b);
> +
> + b->me = eventfd(0, EFD_CLOEXEC | EFD_NONBLOCK);
> + if (b->me < 0)
> + return -errno;
> +
> + b->them = eventfd(0, EFD_CLOEXEC | EFD_NONBLOCK);
> + if (b->them < 0)
> + return -errno;
> +
> + r = pipe2(b->pipe, O_CLOEXEC | O_NONBLOCK);
> + if (r < 0)
> + return -errno;
> +
> + b = NULL;
> + return 0;
Usually we don't mangle obj unless the call succeeds. You could just
assign the eventfds and pipe to local variables and then
zero+initialize obj once you know it all worked.
> +}
> +
> +/**
> + * barrier_destroy() - Destroy a barrier object
> + * @b: barrier to destroy or NULL
> + *
> + * This destroys a barrier object that has previously been initialized via
> + * barrier_destroy(). The object is released and reset to invalid state.
barrier_init()?
> + * Therefore, it is safe to call barrier_destroy() multiple times or even if
> + * barrier_init() failed. However, you must not call barrier_destroy() if you
> + * never called barrier_init() on the object before.
> + *
> + * It is safe to initialize a barrier via zero() / memset(.., 0, ...). Even
> + * though it has embedded FDs, barrier_destroy() can deal with zeroed objects
> + * just fine.
> + *
> + * If @b is NULL, this is a no-op.
> + */
> +void barrier_destroy(Barrier *b) {
> + if (!b)
> + return;
> +
> + /* @me and @them cannot be both FD 0. Lets be pedantic and check the
> + * pipes and barriers, too. If all are 0, the object was zero()ed and
> + * is invalid. This allows users to use zero(barrier) to reset the
> + * backing memory. */
> + if (b->me == 0 &&
> + b->them == 0 &&
> + b->pipe[0] == 0 &&
> + b->pipe[1] == 0 &&
> + b->barriers == 0)
> + return;
> +
> + b->me = safe_close(b->me);
> + b->them = safe_close(b->them);
> + b->pipe[0] = safe_close(b->pipe[0]);
> + b->pipe[1] = safe_close(b->pipe[1]);
> + b->barriers = 0;
> +}
> +
> +/**
> + * barrier_set_role() - Set the local role of the barrier
> + * @b: barrier to operate on
> + * @role: role to set on the barrier
> + *
> + * This sets the roles on a barrier object. This is needed to know which
> + * side of the barrier you're on. Usually, the parent creates the barrier via
> + * barrier_init() and then calls fork() or clone(). Therefore, the FDs are
> + * duplicated and the child retains the same barrier object.
> + *
> + * Both sides need to call barrier_set_role() after fork() or clone() are done.
> + * If this is not done, barriers will not work correctly.
> + *
> + * Note that barriers could be supported without fork() or clone(). However,
> + * this is currently not needed so it hasn't been implemented.
> + */
> +void barrier_set_role(Barrier *b, unsigned int role) {
> + int fd;
> +
> + assert(b);
> + assert(role == BARRIER_PARENT || role == BARRIER_CHILD);
> + /* make sure this is only called once */
> + assert(b->pipe[1] >= 0 && b->pipe[1] >= 0);
> +
> + if (role == BARRIER_PARENT) {
> + b->pipe[1] = safe_close(b->pipe[1]);
> + } else {
> + b->pipe[0] = safe_close(b->pipe[0]);
> +
> + /* swap me/them for childs */
children
> + fd = b->me;
> + b->me = b->them;
> + b->them = fd;
> + }
> +}
> +
> +/* places barrier; returns false if we aborted, otherwise true */
> +static bool barrier_write(Barrier *b, uint64_t buf) {
> + ssize_t len;
> +
> + /* prevent new sync-points if we already aborted */
> + if (barrier_i_aborted(b))
> + return false;
> +
> + do {
> + len = write(b->me, &buf, sizeof(buf));
> + } while (len < 0 && (errno == EAGAIN || errno == EINTR));
> +
> + if (len != sizeof(buf))
> + goto error;
> +
> + /* lock if we aborted */
> + if (buf >= (uint64_t)BARRIER_ABORTION) {
> + if (barrier_they_aborted(b))
> + b->barriers = BARRIER_WE_ABORTED;
> + else
> + b->barriers = BARRIER_I_ABORTED;
> + } else if (!barrier_is_aborted(b)) {
> + b->barriers += buf;
> + }
> +
> + return !barrier_i_aborted(b);
> +
> +error:
> + /* If there is an unexpected error, we have to make this fatal. There
> + * is no way we can recover from sync-errors. Therefore, we close the
> + * pipe-ends and treat this as abortion. The other end will notice the
> + * pipe-close and treat it as abortion, too. */
> +
> + b->pipe[0] = safe_close(b->pipe[0]);
> + b->pipe[1] = safe_close(b->pipe[1]);
> + b->barriers = BARRIER_WE_ABORTED;
> + return false;
> +}
> +
> +/* waits for barriers; returns false if they aborted, otherwise true */
> +static bool barrier_read(Barrier *b, int64_t comp) {
> + uint64_t buf;
> + ssize_t len;
> + struct pollfd pfd[2] = { };
> + int r;
> +
> + if (barrier_they_aborted(b))
> + return false;
> +
> + while (b->barriers > comp) {
> + pfd[0].fd = (b->pipe[0] >= 0) ? b->pipe[0] : b->pipe[1];
> + pfd[0].events = POLLHUP;
> + pfd[0].revents = 0;
> + pfd[1].fd = b->them;
> + pfd[1].events = POLLIN;
> + pfd[1].revents = 0;
> +
> + r = poll(pfd, 2, -1);
> + if (r < 0 && (errno == EAGAIN || errno == EINTR))
> + continue;
> + else if (r < 0)
> + goto error;
> +
> + if (pfd[1].revents) {
> + /* events on @them signal us new data */
> + len = read(b->them, &buf, sizeof(buf));
> + if (len < 0 && (errno == EAGAIN || errno == EINTR))
> + continue;
> +
> + if (len != sizeof(buf))
> + goto error;
> + } else if (pfd[0].revents & (POLLHUP | POLLERR | POLLNVAL)) {
> + /* POLLHUP on the pipe tells us the other side exited.
> + * We treat this as implicit abortion. But we only
> + * handle it if there's no event on the eventfd. This
> + * guarantees that exit-abortions do not overwrite real
> + * barriers. */
> + buf = BARRIER_ABORTION;
> + }
> +
> + /* lock if they aborted */
> + if (buf >= (uint64_t)BARRIER_ABORTION) {
> + if (barrier_i_aborted(b))
> + b->barriers = BARRIER_WE_ABORTED;
> + else
> + b->barriers = BARRIER_THEY_ABORTED;
> + } else if (!barrier_is_aborted(b)) {
> + b->barriers -= buf;
> + }
> + }
> +
> + return !barrier_they_aborted(b);
> +
> +error:
> + /* If there is an unexpected error, we have to make this fatal. There
> + * is no way we can recover from sync-errors. Therefore, we close the
> + * pipe-ends and treat this as abortion. The other end will notice the
> + * pipe-close and treat it as abortion, too. */
> +
> + b->pipe[0] = safe_close(b->pipe[0]);
> + b->pipe[1] = safe_close(b->pipe[1]);
> + b->barriers = BARRIER_WE_ABORTED;
> + return false;
> +}
> +
> +/**
> + * barrier_place() - Place a new barrier
> + * @b: barrier object
> + *
> + * This places a new barrier on the barrier object. If either side already
> + * aborted, this is a no-op and returns "false". Otherwise, the barrier is
> + * placed and this returns "true".
> + *
> + * Returns: true if barrier was placed, false if either side aborted.
> + */
> +bool barrier_place(Barrier *b) {
> + assert(b);
> +
> + if (barrier_is_aborted(b))
> + return false;
> +
> + barrier_write(b, BARRIER_SINGLE);
> + return true;
> +}
> +
> +/**
> + * barrier_abort() - Abort the synchronization
> + * @b: barrier object to abort
> + *
> + * This aborts the barrier-synchronization. If barrier_abort() was already
> + * called on this side, this is a no-op. Otherwise, the barrier is put into the
> + * ABORT-state and will stay there. The other side is notified about the
> + * abortion. Any following attempt to place normal barriers or to wait on normal
> + * barriers will return immediately as "false".
> + *
> + * You can wait for the other side to call barrier_abort(), too. Use
> + * barrier_wait_abortion() for that.
> + *
> + * Returns: false if the other side already aborted, true otherwise.
> + */
> +bool barrier_abort(Barrier *b) {
> + assert(b);
> +
> + barrier_write(b, BARRIER_ABORTION);
> + return !barrier_they_aborted(b);
> +}
> +
> +/**
> + * barrier_wait_next() - Wait for the next barrier of the other side
> + * @b: barrier to operate on
> + *
> + * This waits until the other side places its next barrier. This is independent
> + * of any barrier-links and just waits for any next barrier of the other side.
> + *
> + * If either side aborted, this returns false.
> + *
> + * Returns: false if either side aborted, true otherwise.
> + */
> +bool barrier_wait_next(Barrier *b) {
> + assert(b);
> +
> + if (barrier_is_aborted(b))
> + return false;
> +
> + barrier_read(b, b->barriers - 1);
> + return !barrier_is_aborted(b);
> +}
> +
> +/**
> + * barrier_wait_abortion() - Wait for the other side to abort
> + * @b: barrier to operate on
> + *
> + * This waits until the other side called barrier_abort(). This can be called
> + * regardless whether the local side already called barrier_abort() or not.
> + *
> + * If the other side has already aborted, this returns immediately.
> + *
> + * Returns: false if the local side aborted, true otherwise.
> + */
> +bool barrier_wait_abortion(Barrier *b) {
> + assert(b);
> +
> + barrier_read(b, BARRIER_THEY_ABORTED);
> + return !barrier_i_aborted(b);
> +}
> +
> +/**
> + * barrier_sync_next() - Wait for the other side to place a next linked barrier
> + * @b: barrier to operate on
> + *
> + * This is like barrier_wait_next() and waits for the other side to call
> + * barrier_place(). However, this only waits for linked barriers. That means, if
> + * the other side already placed more barriers than (or as much as) we did, this
> + * returns immediately instead of waiting.
> + *
> + * If either side aborted, this returns false.
> + *
> + * Returns: false if either side aborted, true otherwise.
> + */
> +bool barrier_sync_next(Barrier *b) {
> + assert(b);
> +
> + if (barrier_is_aborted(b))
> + return false;
> +
> + barrier_read(b, MAX((int64_t)0, b->barriers - 1));
> + return !barrier_is_aborted(b);
> +}
> +
> +/**
> + * barrier_sync() - Wait for the other side to place as many barriers as we did
> + * @b: barrier to operate on
> + *
> + * This is like barrier_sync_next() but waits for the other side to call
> + * barrier_place() as often as we did (in total). If they already placed as much
> + * as we did (or more), this returns immediately instead of waiting.
> + *
> + * If either side aborted, this returns false.
> + *
> + * Returns: false if either side aborted, true otherwise.
> + */
> +bool barrier_sync(Barrier *b) {
> + assert(b);
> +
> + if (barrier_is_aborted(b))
> + return false;
> +
> + barrier_read(b, 0);
> + return !barrier_is_aborted(b);
> +}
> diff --git a/src/shared/barrier.h b/src/shared/barrier.h
> new file mode 100644
> index 0000000..2d98e0d
> --- /dev/null
> +++ b/src/shared/barrier.h
> @@ -0,0 +1,97 @@
> +/*-*- Mode: C; c-basic-offset: 8; indent-tabs-mode: nil -*-*/
> +
> +#pragma once
> +
> +/***
> + This file is part of systemd.
> +
> + Copyright 2014 David Herrmann <dh.herrmann at gmail.com>
> +
> + systemd is free software; you can redistribute it and/or modify it
> + under the terms of the GNU Lesser General Public License as published by
> + the Free Software Foundation; either version 2.1 of the License, or
> + (at your option) any later version.
> +
> + systemd is distributed in the hope that it will be useful, but
> + WITHOUT ANY WARRANTY; without even the implied warranty of
> + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> + Lesser General Public License for more details.
> +
> + You should have received a copy of the GNU Lesser General Public License
> + along with systemd; If not, see <http://www.gnu.org/licenses/>.
> +***/
> +
> +#include <errno.h>
> +#include <inttypes.h>
> +#include <stdlib.h>
> +#include <string.h>
> +#include <sys/types.h>
> +
> +#include "macro.h"
> +#include "util.h"
> +
> +/* See source file for an API description. */
> +
> +typedef struct Barrier Barrier;
> +
> +enum {
> + BARRIER_SINGLE = 1LL,
> + BARRIER_ABORTION = INT64_MAX,
> +
> + /* bias values to store state; keep @WE < @THEY < @I */
> + BARRIER_BIAS = INT64_MIN,
> + BARRIER_WE_ABORTED = BARRIER_BIAS + 1LL,
> + BARRIER_THEY_ABORTED = BARRIER_BIAS + 2LL,
> + BARRIER_I_ABORTED = BARRIER_BIAS + 3LL,
> +};
> +
> +enum {
> + BARRIER_PARENT,
> + BARRIER_CHILD,
> +};
> +
> +struct Barrier {
> + int me;
> + int them;
Maybe use us/them (and not we/they) consistently everywhere?
> + int pipe[2];
> + int64_t barriers;
> +};
> +
> +int barrier_init(Barrier *obj);
> +void barrier_destroy(Barrier *b);
> +
> +#define _barrier_destroy_ _cleanup_(barrier_destroy_on_stack)
> +static inline void barrier_destroy_on_stack(Barrier *b) { if (b) barrier_destroy(b); }
> +#define _barrier_destroy_ptr_ _cleanup_(barrier_destroyp)
> +DEFINE_TRIVIAL_CLEANUP_FUNC(Barrier*, barrier_destroy);
> +
> +void barrier_set_role(Barrier *b, unsigned int role);
> +
> +bool barrier_place(Barrier *b);
> +bool barrier_abort(Barrier *b);
> +
> +bool barrier_wait_next(Barrier *b);
> +bool barrier_wait_abortion(Barrier *b);
> +bool barrier_sync_next(Barrier *b);
> +bool barrier_sync(Barrier *b);
> +
> +static inline bool barrier_i_aborted(Barrier *b) {
> + return b->barriers == BARRIER_I_ABORTED || b->barriers == BARRIER_WE_ABORTED;
> +}
> +
> +static inline bool barrier_they_aborted(Barrier *b) {
> + return b->barriers == BARRIER_THEY_ABORTED || b->barriers == BARRIER_WE_ABORTED;
> +}
> +
> +static inline bool barrier_we_aborted(Barrier *b) {
> + return b->barriers == BARRIER_WE_ABORTED;
> +}
> +
> +static inline bool barrier_is_aborted(Barrier *b) {
> + return b->barriers == BARRIER_I_ABORTED || b->barriers == BARRIER_THEY_ABORTED || b->barriers == BARRIER_WE_ABORTED;
> +}
> +
> +static inline bool barrier_place_and_sync(Barrier *b) {
> + barrier_place(b);
> + return barrier_sync(b);
> +}
> diff --git a/src/test/test-barrier.c b/src/test/test-barrier.c
> new file mode 100644
> index 0000000..640e508
> --- /dev/null
> +++ b/src/test/test-barrier.c
> @@ -0,0 +1,460 @@
> +/*-*- Mode: C; c-basic-offset: 8; indent-tabs-mode: nil -*-*/
> +
> +/***
> + This file is part of systemd.
> +
> + Copyright 2014 David Herrmann <dh.herrmann at gmail.com>
> +
> + systemd is free software; you can redistribute it and/or modify it
> + under the terms of the GNU Lesser General Public License as published by
> + the Free Software Foundation; either version 2.1 of the License, or
> + (at your option) any later version.
> +
> + systemd is distributed in the hope that it will be useful, but
> + WITHOUT ANY WARRANTY; without even the implied warranty of
> + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> + Lesser General Public License for more details.
> +
> + You should have received a copy of the GNU Lesser General Public License
> + along with systemd; If not, see <http://www.gnu.org/licenses/>.
> +***/
> +
> +/*
> + * IPC barrier tests
> + * These tests verify the correct behavior of the IPC Barrier implementation.
> + * Note that the tests use alarm-timers to verify dead-locks and timeouts. These
> + * might not work on slow machines where 20ms are too short to perform specific
> + * operations (though, very unlikely). In case that turns out true, we have to
> + * increase it at the slightly cost of lengthen test-duration on other machines.
> + */
> +
> +#include <errno.h>
> +#include <stdio.h>
> +#include <string.h>
> +#include <sys/time.h>
> +#include <sys/wait.h>
> +#include <unistd.h>
> +
> +#include "barrier.h"
> +#include "def.h"
> +#include "util.h"
> +
> +/* 20ms to test deadlocks; All timings use multiples of this constant as
> + * alarm/sleep timers. If this timeout is too small for slow machines to perform
> + * the requested operations, we have to increase it. On an i7 this works fine
> + * with 1ms base-time, so 20ms should be just fine for everyone. */
> +#define BASE_TIME 20
> +
> +static void malarm(unsigned long msecs) {
> + struct itimerval v = { };
> +
> + timeval_store(&v.it_value, msecs * USEC_PER_MSEC);
> + assert_se(setitimer(ITIMER_REAL, &v, NULL) >= 0);
> +}
> +
> +static void msleep(unsigned long msecs) {
> + assert_se(msecs < MSEC_PER_SEC);
> + usleep(msecs * USEC_PER_MSEC);
> +}
> +
> +#define TEST_BARRIER(_FUNCTION, _CHILD_CODE, _WAIT_CHILD, _PARENT_CODE, _WAIT_PARENT) \
> + static void _FUNCTION(void) { \
> + Barrier b; \
> + pid_t pid1, pid2; \
> + \
> + assert_se(barrier_init(&b) >= 0); \
> + \
> + pid1 = fork(); \
> + assert_se(pid1 >= 0); \
> + if (pid1 == 0) { \
> + barrier_set_role(&b, BARRIER_CHILD); \
> + { _CHILD_CODE; } \
> + exit(42); \
> + } \
> + \
> + pid2 = fork(); \
> + assert_se(pid2 >= 0); \
> + if (pid2 == 0) { \
> + barrier_set_role(&b, BARRIER_PARENT); \
> + { _PARENT_CODE; } \
> + exit(42); \
> + } \
> + \
> + barrier_destroy(&b); \
> + malarm(999); \
> + { _WAIT_CHILD; } \
> + { _WAIT_PARENT; } \
> + malarm(0); \
> + }
> +
> +#define TEST_BARRIER_WAIT_SUCCESS(_pid) \
> + ({ \
> + int pidr, status; \
> + pidr = waitpid(_pid, &status, 0); \
> + assert_se(pidr == _pid); \
> + assert_se(WIFEXITED(status)); \
> + assert_se(WEXITSTATUS(status) == 42); \
> + })
> +
> +#define TEST_BARRIER_WAIT_ALARM(_pid) \
> + ({ \
> + int pidr, status; \
> + pidr = waitpid(_pid, &status, 0); \
> + assert_se(pidr == _pid); \
> + assert_se(WIFSIGNALED(status)); \
> + assert_se(WTERMSIG(status) == SIGALRM); \
> + })
> +
> +/*
> + * Test basic sync points
> + * This places a barrier in both processes and waits synchronously for them.
> + * The timeout makes sure the sync works as expected. The msleep() on one side
> + * makes sure the exit of the parent does not overwrite previous barriers. Due
> + * to the msleep(), we know that the parent already exited, thus there's a
> + * pending HUP on the pipe. However, the barrier_sync() prefers reads on the
> + * eventfd, thus we can safely wait on the barrier.
> + */
> +TEST_BARRIER(test_barrier_sync,
> + ({
> + malarm(BASE_TIME * 10);
> + assert_se(barrier_place(&b));
> + msleep(BASE_TIME * 2);
> + assert_se(barrier_sync(&b));
> + }),
> + TEST_BARRIER_WAIT_SUCCESS(pid1),
> + ({
> + malarm(BASE_TIME * 10);
> + assert_se(barrier_place(&b));
> + assert_se(barrier_sync(&b));
> + }),
> + TEST_BARRIER_WAIT_SUCCESS(pid2));
> +
> +/*
> + * Test wait_next()
> + * This places a barrier in the parent and syncs on it. The child sleeps while
> + * the parent places the barrier and then waits for a barrier. The wait will
> + * succeed as the child hasn't read the parent's barrier, yet. The following
> + * barrier and sync synchronize the exit.
> + */
> +TEST_BARRIER(test_barrier_wait_next,
> + ({
> + msleep(100);
> + malarm(BASE_TIME * 10);
> + assert_se(barrier_wait_next(&b));
> + assert_se(barrier_place(&b));
> + assert_se(barrier_sync(&b));
> + }),
> + TEST_BARRIER_WAIT_SUCCESS(pid1),
> + ({
> + malarm(400);
> + assert_se(barrier_place(&b));
> + assert_se(barrier_sync(&b));
> + }),
> + TEST_BARRIER_WAIT_SUCCESS(pid2));
> +
> +/*
> + * Test wait_next() multiple times
> + * This places two barriers in the parent and waits for the child to exit. The
> + * child sleeps 20ms so both barriers _should_ be in place. It then waits for
> + * the parent to place the next barrier twice. The first call will fetch both
> + * barriers and return. However, the second call will stall as the parent does
> + * not place a 3rd barrier (the sleep caught two barriers). wait_next() is does
> + * not look at barrier-links so this stall is expected. Thus this test times
> + * out.
> + */
> +TEST_BARRIER(test_barrier_wait_next_twice,
> + ({
> + msleep(BASE_TIME);
> + malarm(BASE_TIME);
> + assert_se(barrier_wait_next(&b));
> + assert_se(barrier_wait_next(&b));
> + assert_se(0);
> + }),
> + TEST_BARRIER_WAIT_ALARM(pid1),
> + ({
> + malarm(BASE_TIME * 10);
> + assert_se(barrier_place(&b));
> + assert_se(barrier_place(&b));
> + msleep(BASE_TIME * 2);
> + }),
> + TEST_BARRIER_WAIT_SUCCESS(pid2));
> +
> +/*
> + * Test wait_next() with local barriers
> + * This is the same as test_barrier_wait_next_twice, but places local barriers
> + * between both waits. This does not have any effect on the wait so it times out
> + * like the other test.
> + */
> +TEST_BARRIER(test_barrier_wait_next_twice_local,
> + ({
> + msleep(BASE_TIME);
> + malarm(BASE_TIME);
> + assert_se(barrier_wait_next(&b));
> + assert_se(barrier_place(&b));
> + assert_se(barrier_place(&b));
> + assert_se(barrier_wait_next(&b));
> + assert_se(0);
> + }),
> + TEST_BARRIER_WAIT_ALARM(pid1),
> + ({
> + malarm(BASE_TIME * 10);
> + assert_se(barrier_place(&b));
> + assert_se(barrier_place(&b));
> + msleep(BASE_TIME * 2);
> + }),
> + TEST_BARRIER_WAIT_SUCCESS(pid2));
> +
> +/*
> + * Test wait_next() with sync_next()
> + * This is again the same as test_barrier_wait_next_twice but uses a
> + * synced wait as the second wait. This works just fine because the local state
> + * has no barriers placed, therefore, the remote is always in sync.
> + */
> +TEST_BARRIER(test_barrier_wait_next_twice_sync,
> + ({
> + msleep(BASE_TIME);
> + malarm(BASE_TIME);
> + assert_se(barrier_wait_next(&b));
> + assert_se(barrier_sync_next(&b));
> + }),
> + TEST_BARRIER_WAIT_SUCCESS(pid1),
> + ({
> + malarm(BASE_TIME * 10);
> + assert_se(barrier_place(&b));
> + assert_se(barrier_place(&b));
> + }),
> + TEST_BARRIER_WAIT_SUCCESS(pid2));
> +
> +/*
> + * Test wait_next() with sync_next() and local barriers
> + * This is again the same as test_barrier_wait_next_twice_local but uses a
> + * synced wait as the second wait. This works just fine because the local state
> + * is in sync with the remote.
> + */
> +TEST_BARRIER(test_barrier_wait_next_twice_local_sync,
> + ({
> + msleep(BASE_TIME);
> + malarm(BASE_TIME);
> + assert_se(barrier_wait_next(&b));
> + assert_se(barrier_place(&b));
> + assert_se(barrier_place(&b));
> + assert_se(barrier_sync_next(&b));
> + }),
> + TEST_BARRIER_WAIT_SUCCESS(pid1),
> + ({
> + malarm(BASE_TIME * 10);
> + assert_se(barrier_place(&b));
> + assert_se(barrier_place(&b));
> + }),
> + TEST_BARRIER_WAIT_SUCCESS(pid2));
> +
> +/*
> + * Test sync_next() and sync()
> + * This tests sync_*() synchronizations and makes sure they work fine if the
> + * local state is behind the remote state.
> + */
> +TEST_BARRIER(test_barrier_sync_next,
> + ({
> + malarm(BASE_TIME * 10);
> + assert_se(barrier_sync_next(&b));
> + assert_se(barrier_sync(&b));
> + assert_se(barrier_place(&b));
> + assert_se(barrier_place(&b));
> + assert_se(barrier_sync_next(&b));
> + assert_se(barrier_sync_next(&b));
> + assert_se(barrier_sync(&b));
> + }),
> + TEST_BARRIER_WAIT_SUCCESS(pid1),
> + ({
> + malarm(BASE_TIME * 10);
> + msleep(BASE_TIME);
> + assert_se(barrier_place(&b));
> + assert_se(barrier_place(&b));
> + assert_se(barrier_sync(&b));
> + }),
> + TEST_BARRIER_WAIT_SUCCESS(pid2));
> +
> +/*
> + * Test sync_next() and sync() with local barriers
> + * This tests timeouts if sync_*() is used if local barriers are placed but the
> + * remote didn't place any.
> + */
> +TEST_BARRIER(test_barrier_sync_next_local,
> + ({
> + malarm(BASE_TIME);
> + assert_se(barrier_place(&b));
> + assert_se(barrier_sync_next(&b));
> + assert_se(0);
> + }),
> + TEST_BARRIER_WAIT_ALARM(pid1),
> + ({
> + msleep(BASE_TIME * 2);
> + }),
> + TEST_BARRIER_WAIT_SUCCESS(pid2));
> +
> +/*
> + * Test sync_next() and sync() with local barriers and abortion
> + * This is the same as test_barrier_sync_next_local but aborts the sync in the
> + * parent. Therefore, the sync_next() succeeds just fine due to the abortion.
> + */
> +TEST_BARRIER(test_barrier_sync_next_local_abort,
> + ({
> + malarm(BASE_TIME * 10);
> + assert_se(barrier_place(&b));
> + assert_se(!barrier_sync_next(&b));
> + }),
> + TEST_BARRIER_WAIT_SUCCESS(pid1),
> + ({
> + assert_se(barrier_abort(&b));
> + }),
> + TEST_BARRIER_WAIT_SUCCESS(pid2));
> +
> +/*
> + * Test matched wait_abortion()
> + * This runs wait_abortion() with remote abortion.
> + */
> +TEST_BARRIER(test_barrier_wait_abortion,
> + ({
> + malarm(BASE_TIME * 10);
> + assert_se(barrier_wait_abortion(&b));
> + }),
> + TEST_BARRIER_WAIT_SUCCESS(pid1),
> + ({
> + assert_se(barrier_abort(&b));
> + }),
> + TEST_BARRIER_WAIT_SUCCESS(pid2));
> +
> +/*
> + * Test unmatched wait_abortion()
> + * This runs wait_abortion() without any remote abortion going on. It thus must
> + * timeout.
> + */
> +TEST_BARRIER(test_barrier_wait_abortion_unmatched,
> + ({
> + malarm(BASE_TIME);
> + assert_se(barrier_wait_abortion(&b));
> + assert_se(0);
> + }),
> + TEST_BARRIER_WAIT_ALARM(pid1),
> + ({
> + msleep(BASE_TIME * 2);
> + }),
> + TEST_BARRIER_WAIT_SUCCESS(pid2));
> +
> +/*
> + * Test matched wait_abortion() with local abortion
> + * This runs wait_abortion() with local and remote abortion.
> + */
> +TEST_BARRIER(test_barrier_wait_abortion_local,
> + ({
> + malarm(BASE_TIME * 10);
> + assert_se(barrier_abort(&b));
> + assert_se(!barrier_wait_abortion(&b));
> + }),
> + TEST_BARRIER_WAIT_SUCCESS(pid1),
> + ({
> + assert_se(barrier_abort(&b));
> + }),
> + TEST_BARRIER_WAIT_SUCCESS(pid2));
> +
> +/*
> + * Test unmatched wait_abortion() with local abortion
> + * This runs wait_abortion() with only local abortion. This must time out.
> + */
> +TEST_BARRIER(test_barrier_wait_abortion_local_unmatched,
> + ({
> + malarm(BASE_TIME);
> + assert_se(barrier_abort(&b));
> + assert_se(!barrier_wait_abortion(&b));
> + assert_se(0);
> + }),
> + TEST_BARRIER_WAIT_ALARM(pid1),
> + ({
> + msleep(BASE_TIME * 2);
> + }),
> + TEST_BARRIER_WAIT_SUCCESS(pid2));
> +
> +/*
> + * Test child exit
> + * Place barrier and sync with the child. The child only exits()s, which should
> + * cause an implicit abortion and wake the parent.
> + */
> +TEST_BARRIER(test_barrier_exit,
> + ({
> + }),
> + TEST_BARRIER_WAIT_SUCCESS(pid1),
> + ({
> + malarm(BASE_TIME * 10);
> + assert_se(barrier_place(&b));
> + assert_se(!barrier_sync(&b));
> + }),
> + TEST_BARRIER_WAIT_SUCCESS(pid2));
> +
> +/*
> + * Test child exit with sleep
> + * Same as test_barrier_exit but verifies the test really works due to the
> + * child-exit. We add a usleep() which triggers the alarm in the parent and
> + * causes the test to time out.
> + */
> +TEST_BARRIER(test_barrier_no_exit,
> + ({
> + msleep(BASE_TIME * 2);
> + }),
> + TEST_BARRIER_WAIT_SUCCESS(pid1),
> + ({
> + malarm(BASE_TIME);
> + assert_se(barrier_place(&b));
> + assert_se(!barrier_sync(&b));
> + }),
> + TEST_BARRIER_WAIT_ALARM(pid2));
> +
> +/*
> + * Test pending exit against sync
> + * The parent places a barrier *and* exits. The 20ms wait in the child
> + * guarantees both are pending. However, our logic prefers pending barriers over
> + * pending exit-abortions (unlike normal abortions), thus the wait_next() must
> + * succeed, same for the sync_next() as our local barrier-count is smaller than
> + * the remote. Once we place a barrier our count is equal, so the sync still
> + * succeeds. Only if we place one more barrier, we're ahead of the remote, thus
> + * we will fail due to HUP on the pipe.
> + */
> +TEST_BARRIER(test_barrier_pending_exit,
> + ({
> + malarm(BASE_TIME * 4);
> + msleep(BASE_TIME * 2);
> + assert_se(barrier_wait_next(&b));
> + assert_se(barrier_sync_next(&b));
> + assert_se(barrier_place(&b));
> + assert_se(barrier_sync_next(&b));
> + assert_se(barrier_place(&b));
> + assert_se(!barrier_sync_next(&b));
> + }),
> + TEST_BARRIER_WAIT_SUCCESS(pid1),
> + ({
> + assert_se(barrier_place(&b));
> + }),
> + TEST_BARRIER_WAIT_SUCCESS(pid2));
> +
> +int main(int argc, char *argv[]) {
> + log_parse_environment();
> + log_open();
> +
> + test_barrier_sync();
> + test_barrier_wait_next();
> + test_barrier_wait_next_twice();
> + test_barrier_wait_next_twice_sync();
> + test_barrier_wait_next_twice_local();
> + test_barrier_wait_next_twice_local_sync();
> + test_barrier_sync_next();
> + test_barrier_sync_next_local();
> + test_barrier_sync_next_local_abort();
> + test_barrier_wait_abortion();
> + test_barrier_wait_abortion_unmatched();
> + test_barrier_wait_abortion_local();
> + test_barrier_wait_abortion_local_unmatched();
> + test_barrier_exit();
> + test_barrier_no_exit();
> + test_barrier_pending_exit();
> +
> + return 0;
> +}
> --
> 2.0.1
>
> _______________________________________________
> systemd-devel mailing list
> systemd-devel at lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/systemd-devel
More information about the systemd-devel
mailing list