[Mesa-dev] [PATCH 5/5] clover: Add a mutex to guard event::chain and event::wait_count

Sat May 9 04:29:11 PDT 2015

Tom Stellard <thomas.stellard at amd.com> writes:

> This mutex effectively prevents an event's chain or wait_count from
> being updated while it is in the process of triggering.  Otherwise it
> may be possible to add to an event's chain after it has been triggered,
> which causes the chained event to never be triggered.
> ---
>  src/gallium/state_trackers/clover/core/event.cpp | 3 +++
>  src/gallium/state_trackers/clover/core/event.hpp | 1 +
>  2 files changed, 4 insertions(+)
>
> diff --git a/src/gallium/state_trackers/clover/core/event.cpp b/src/gallium/state_trackers/clover/core/event.cpp
> index da227bb..646fd38 100644
> --- a/src/gallium/state_trackers/clover/core/event.cpp
> +++ b/src/gallium/state_trackers/clover/core/event.cpp
> @@ -38,6 +38,7 @@ event::~event() {
>  
>  void
>  event::trigger() {
> +   std::lock_guard<std::mutex> lock(trigger_mutex);

The mutex you added in PATCH 3 already protects wait_count, and it would
make sense for it to protect the chain too.  And the "deps" array.  I
don't think there's any reason to add another mutex to protect roughly
the same data set.

There are two more problems with this: It prevents reentrancy (because
you call the user-specified action_ok with the mutex locked -- worse,
the actions of all recursively chained events will run with the mutex
locked) and OTOH can lead to a deadlock because the trigger() method of
each chained event will in turn try to lock its own mutex while the
parent's mutex is still locked.  It could be argued that, because the
dependency graph of events is acyclic, there's no possibility for
deadlock if we agree on, say, always locking the mutex of a dependency
before the mutex of a dependent event.  But that seems rather difficult
to enforce because this isn't the only place where we would end up
locking several events simultaneously -- There's also ::abort() and
::chain().

Instead we could avoid these problems altogether by refactoring
::trigger() and ::abort() slightly, I'll reply with a patch that does
just that.  It should apply on top of your PATCH 3 if you take into
account the comments from my reply to that patch.

>     if (!--wait_count) {
>        signalled_cv.notify_all();
>        action_ok(*this);
> @@ -54,6 +55,7 @@ event::abort(cl_int status) {
>     _status = status;
>     action_fail(*this);
>  
> +   std::lock_guard<std::mutex> lock(trigger_mutex);

This has the same reentrancy and deadlock problems as ::trigger().

>     while (!_chain.empty()) {
>        _chain.back()().abort(status);
>        _chain.pop_back();
> @@ -67,6 +69,7 @@ event::signalled() const {
>  
>  void
>  event::chain(event &ev) {
> +   std::lock_guard<std::mutex> lock(trigger_mutex);

::chain() modifies both "this" and "ev", we should take both mutexes
here.  We can use std::lock() to lock them at the same time safely using
a deadlock avoidance algorithm.

>     if (!signalled()) {
>        ev.wait_count++;
>        _chain.push_back(ev);
> diff --git a/src/gallium/state_trackers/clover/core/event.hpp b/src/gallium/state_trackers/clover/core/event.hpp
> index dffafb9..a64fbba 100644
> --- a/src/gallium/state_trackers/clover/core/event.hpp
> +++ b/src/gallium/state_trackers/clover/core/event.hpp
> @@ -90,6 +90,7 @@ namespace clover {
>        std::vector<intrusive_ref<event>> _chain;
>        std::condition_variable signalled_cv;
>        std::mutex signalled_mutex;
> +      std::mutex trigger_mutex;
>     };
>  
>     ///
> -- 
> 2.0.4
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 212 bytes
Desc: not available
URL: <http://lists.freedesktop.org/archives/mesa-dev/attachments/20150509/b33a1282/attachment.sig>