[Mesa-stable] [Mesa-dev] [PATCH 1/2] Fix races during _mesa_HashWalk().

Timothy Arceri timothy.arceri at collabora.com
Sat Nov 5 22:35:37 UTC 2016


On Sat, 2016-11-05 at 15:28 +0100, Steinar H. Gunderson wrote:
> There is currently no protection against walking a hash (using
> _mesa_HashWalk()) and modifying it at the same time, for instance by
> inserting
> or deleting elements. This leads to segfaults in multithreaded code
> if e.g.
> someone calls glTexImage2D (which may have to walk the list of FBOs)
> while
> another thread is calling glDeleteFramebuffers on another thread with
> the two
> contexts sharing lists.
> 
> The reason for this is that _mesa_HashWalk() doesn't actually take
> the mutex
> that normally protects the hash; it takes an entirely different
> mutex.
> Thus, walks are only protected against other walks, and there is also
> no
> outer lock taking this. There is an old comment saying that this is
> to fix
> problems with deadlock if the callback needs to take a mutex, but now
> that we
> have recursive mutexes, this is a non-issue.

Can you please change this line to say something like. With this change
we switch to using a recursive mutex, which avoids the deadlock.

I was a little confused. At first I thought it was already using a
recursive mutex.

Otherwise the change looks correct to me.

Reviewed-by: Timothy Arceri <timothy.arceri at collabora.com>


One thing that concerns me is that we can now call _mesa_HashRemove
from the callback in _mesa_HashDeleteAll.

I think we may need a patch before this one that changes
_mesa_HashDeleteAll to wrap _mesa_hash_table_remove(table->ht, entry);
with a deleted entry check otherwise the entry counts will get all
messed up.

   /* Check that the cb didn't already delete the entry. If not
    * remove it.
    */
   if (entry->key != table->ht->deleted_key) {
      _mesa_hash_table_remove(table->ht, entry);
   }
 
> 
> A demonstration Helgrind hit from a real application:
> 
> ==13412== Possible data race during write of size 8 at 0x3498C6A8 by
> thread #1
> ==13412== Locks held: 2, at addresses 0x1AF09530 0x2B3DF400
> ==13412==    at 0x1F040C99: _mesa_hash_table_remove
> (hash_table.c:395)
> ==13412==    by 0x1EE98174: _mesa_HashRemove_unlocked (hash.c:350)
> ==13412==    by 0x1EE98174: _mesa_HashRemove (hash.c:365)
> ==13412==    by 0x1EE2372D: _mesa_DeleteFramebuffers
> (fbobject.c:2669)
> ==13412==    by 0x6105AA4:
> movit::ResourcePool::cleanup_unlinked_fbos(void*)
> (resource_pool.cpp:473)
> ==13412==    by 0x610615B: movit::ResourcePool::release_fbo(unsigned
> int) (resource_pool.cpp:442)
> [...]
> ==13412== This conflicts with a previous read of size 8 by thread #20
> ==13412== Locks held: 2, at addresses 0x1AF09558 0x1AF73318
> ==13412==    at 0x1F040CD9: _mesa_hash_table_next_entry
> (hash_table.c:415)
> ==13412==    by 0x1EE982A8: _mesa_HashWalk (hash.c:426)
> ==13412==    by 0x1EED6DFD: _mesa_update_fbo_texture.part.33
> (teximage.c:2683)
> ==13412==    by 0x1EED9410: _mesa_update_fbo_texture
> (teximage.c:3043)
> ==13412==    by 0x1EED9410: teximage (teximage.c:3073)
> ==13412==    by 0x1EEDA28F: _mesa_TexImage2D (teximage.c:3105)
> ==13412==    by 0x166A68: operator() (mixer.cpp:454)
> 
> There are many more interactions than just these two possible.
> 
> Cc: 11.2 12.0 13.0 <mesa-stable at lists.freedesktop.org>
> Signed-off-by: Steinar H. Gunderson <steinar+mesa at gunderson.no>
> ---
>  src/mesa/main/hash.c | 18 +++++++-----------
>  1 file changed, 7 insertions(+), 11 deletions(-)
> 
> diff --git a/src/mesa/main/hash.c b/src/mesa/main/hash.c
> index 7d8a5fd..670438a 100644
> --- a/src/mesa/main/hash.c
> +++ b/src/mesa/main/hash.c
> @@ -59,7 +59,6 @@ struct _mesa_HashTable {
>     struct hash_table *ht;
>     GLuint MaxKey;                        /**< highest key inserted
> so far */
>     mtx_t Mutex;                /**< mutual exclusion lock */
> -   mtx_t WalkMutex;            /**< for _mesa_HashWalk() */
>     GLboolean InDeleteAll;                /**< Debug check */
>     /** Value that would be in the table for DELETED_KEY_VALUE. */
>     void *deleted_key_data;
> @@ -129,8 +128,11 @@ _mesa_NewHashTable(void)
>        }
>  
>        _mesa_hash_table_set_deleted_key(table->ht,
> uint_key(DELETED_KEY_VALUE));
> -      mtx_init(&table->Mutex, mtx_plain);
> -      mtx_init(&table->WalkMutex, mtx_plain);
> +      /*
> +       * Needs to be recursive, since the callback in
> _mesa_HashWalk()
> +       * is allowed to call _mesa_HashRemove().
> +       */
> +      mtx_init(&table->Mutex, mtx_recursive);
>     }
>     else {
>        _mesa_error_no_memory(__func__);
> @@ -161,7 +163,6 @@ _mesa_DeleteHashTable(struct _mesa_HashTable
> *table)
>     _mesa_hash_table_destroy(table->ht, NULL);
>  
>     mtx_destroy(&table->Mutex);
> -   mtx_destroy(&table->WalkMutex);
>     free(table);
>  }
>  
> @@ -401,11 +402,6 @@ _mesa_HashDeleteAll(struct _mesa_HashTable
> *table,
>  
>  /**
>   * Walk over all entries in a hash table, calling callback function
> for each.
> - * Note: we use a separate mutex in this function to avoid a
> recursive
> - * locking deadlock (in case the callback calls _mesa_HashRemove())
> and to
> - * prevent multiple threads/contexts from getting tangled up.
> - * A lock-less version of this function could be used when the table
> will
> - * not be modified.
>   * \param table  the hash table to walk
>   * \param callback  the callback function
>   * \param userData  arbitrary pointer to pass along to the callback
> @@ -422,13 +418,13 @@ _mesa_HashWalk(const struct _mesa_HashTable
> *table,
>  
>     assert(table);
>     assert(callback);
> -   mtx_lock(&table2->WalkMutex);
> +   mtx_lock(&table2->Mutex);
>     hash_table_foreach(table->ht, entry) {
>        callback((uintptr_t)entry->key, entry->data, userData);
>     }
>     if (table->deleted_key_data)
>        callback(DELETED_KEY_VALUE, table->deleted_key_data,
> userData);
> -   mtx_unlock(&table2->WalkMutex);
> +   mtx_unlock(&table2->Mutex);
>  }
>  
>  static void


More information about the mesa-stable mailing list