[waffle] [PATCH v3] wcore_error_unittest: Fix race condition leading to forever loop
Chad Versace
chad.versace at linux.intel.com
Thu May 31 16:14:04 PDT 2012
On 05/31/2012 09:49 AM, Pauli Nieminen wrote:
> Reading a non-volatile value in a loop that makes no function calls allows
> the C compiler to generate an infinite loop that never re-reads the value
> from memory. To avoid this compiler optimization causing trouble, take the
> lock when reading the number of waiting threads.
The commit's subject claims that the cause of the non-terminating loop is
a race condition, but the commit message claims that the cause is an
aggressive optimization on a variable that is volatile in practice but
not declared as volatile. These two causes are orthogonal.
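To make the second cause concrete, here's a minimal reduction of the
optimization hazard (the names are mine, not from the test):

    /* Nothing in the loop body can modify 'ready' as far as the C
     * abstract machine is concerned, so without 'volatile' the compiler
     * may load 'ready' once and compile the loop as
     * 'if (!ready) for (;;);'.
     */
    static int ready;   /* set to 1 by another thread */

    static void
    spin_until_ready(void)
    {
        while (!ready)
            ;
    }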
> Signed-off-by: Pauli Nieminen <pauli.nieminen at linux.intel.com>
> ---
>
> If Mac doesn't have barriers then we have to stick to simulating barriers,
> even though a barrier in Linux can be implemented with a single atomic int
> and a futex. We could probably do a pseudo barrier on the same principle
> using a mutex and an atomic int. But who cares, as long as the test is
> guaranteed to work correctly.
>
> tests/unittests/waffle/core/wcore_error_unittest.c | 10 ++++++++--
> 1 file changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/tests/unittests/waffle/core/wcore_error_unittest.c b/tests/unittests/waffle/core/wcore_error_unittest.c
> index 06ebb78..e9fdb94 100644
> --- a/tests/unittests/waffle/core/wcore_error_unittest.c
> +++ b/tests/unittests/waffle/core/wcore_error_unittest.c
> @@ -191,6 +191,7 @@ TEST(wcore_error, thread_local)
> pthread_mutex_t mutex;
> pthread_cond_t cond;
> int num_threads_waiting = 0;
> + int thread_count;
>
> pthread_t threads[NUM_THREADS];
> struct thread_arg thread_args[NUM_THREADS];
> @@ -213,8 +214,13 @@ TEST(wcore_error, thread_local)
>
> // Wait for all threads to set their error codes, thus giving
> // the threads opportunity to clobber each other's codes.
> - while (num_threads_waiting < NUM_THREADS)
> - ;;
> + do {
> + pthread_mutex_lock(&mutex); {
> + thread_count = num_threads_waiting;
> + pthread_mutex_unlock(&mutex);
> + }
> + } while (thread_count < NUM_THREADS);
> +
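As an aside, the pseudo barrier Pauli mentions could be sketched roughly
like this, using a mutex and a condition variable rather than the atomic
int he suggests. This is my own sketch, not code from the patch:

    #include <pthread.h>

    struct pseudo_barrier {
        pthread_mutex_t mutex;
        pthread_cond_t cond;
        int count;      /* threads that have arrived so far */
        int threshold;  /* threads required before anyone proceeds */
    };

    static void
    pseudo_barrier_init(struct pseudo_barrier *b, int threshold)
    {
        pthread_mutex_init(&b->mutex, NULL);
        pthread_cond_init(&b->cond, NULL);
        b->count = 0;
        b->threshold = threshold;
    }

    static void
    pseudo_barrier_wait(struct pseudo_barrier *b)
    {
        pthread_mutex_lock(&b->mutex);
        if (++b->count >= b->threshold) {
            /* Last thread to arrive releases everyone else. */
            pthread_cond_broadcast(&b->cond);
        } else {
            /* The while loop guards against spurious wakeups. */
            while (b->count < b->threshold)
                pthread_cond_wait(&b->cond, &b->mutex);
        }
        pthread_mutex_unlock(&b->mutex);
    }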
If the problem were a race condition, then the hunk above would fix it.
But the real problem is aggressive constant-propagation on a variable that
should be, but is not, declared volatile. I don't believe the hunk fixes
the constant-propagation problem. Like num_threads_waiting, thread_count
is also accessed only within the local scope, and therefore a clever
optimizer can infer that thread_count is also always 0.
If we're lucky, though, the additional activity in the neighborhood of
thread_count may sufficiently confuse the compiler and prevent
constant-propagation. But that would just be a lucky accident.
I discussed this with Kenneth Graunke and Jordan Justen, and they
came to the same conclusion.
I believe the solution is as simple as declaring num_threads_waiting as
volatile. I'll submit a patch to do that.
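Roughly, what I have in mind is something like this sketch (a reduced
standalone example, not the actual patch I'll send):

    #include <pthread.h>
    #include <stddef.h>

    enum { NUM_THREADS = 3 };

    static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;

    /* 'volatile' is the proposed fix: it forbids the compiler from
     * caching the value across iterations of the wait loop below.
     */
    static volatile int num_threads_waiting = 0;

    static void *
    thread_start(void *arg)
    {
        (void) arg;
        pthread_mutex_lock(&mutex);
        ++num_threads_waiting;
        pthread_mutex_unlock(&mutex);
        return NULL;
    }

    int
    main(void)
    {
        pthread_t threads[NUM_THREADS];
        int i;

        for (i = 0; i < NUM_THREADS; ++i)
            pthread_create(&threads[i], NULL, thread_start, NULL);

        /* Because num_threads_waiting is volatile, every iteration
         * re-reads it from memory instead of being folded into an
         * unconditional infinite loop.
         */
        while (num_threads_waiting < NUM_THREADS)
            ;

        for (i = 0; i < NUM_THREADS; ++i)
            pthread_join(threads[i], NULL);

        return 0;
    }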