[Mesa-dev] [PATCH 1/2] Fix races during _mesa_HashWalk().
Timothy Arceri
timothy.arceri at collabora.com
Sat Nov 5 22:35:37 UTC 2016
On Sat, 2016-11-05 at 15:28 +0100, Steinar H. Gunderson wrote:
> There is currently no protection against walking a hash (using
> _mesa_HashWalk()) and modifying it at the same time, for instance by
> inserting
> or deleting elements. This leads to segfaults in multithreaded code
> if e.g.
> someone calls glTexImage2D (which may have to walk the list of FBOs)
> while
> another thread is calling glDeleteFramebuffers on another thread with
> the two
> contexts sharing lists.
>
> The reason for this is that _mesa_HashWalk() doesn't actually take
> the mutex
> that normally protects the hash; it takes an entirely different
> mutex.
> Thus, walks are only protected against other walks, and there is also
> no
> outer lock taking this. There is an old comment saying that this is
> to fix
> problems with deadlock if the callback needs to take a mutex, but now
> that we
> have recursive mutexes, this is a non-issue.
Can you please change this line to say something like. With this change
we switch to using a recursive mutex, which avoids the deadlock.
I was a little confused. At first I thought it was already using a
recursive mutex.
Otherwise the change looks correct to me.
Reviewed-by: Timothy Arceri <timothy.arceri at collabora.com>
One thing that concerns me is that we can now call _mesa_HashRemove
from the callback in _mesa_HashDeleteAll.
I think we may need a patch before this one that changes
_mesa_HashDeleteAll to wrap _mesa_hash_table_remove(table->ht, entry);
with a deleted entry check otherwise the entry counts will get all
messed up.
/* Check that the cb didn't already delete the entry. If not
* remove it.
*/
if (entry->key != table->ht->deleted_key) {
_mesa_hash_table_remove(table->ht, entry);
}
>
> A demonstration Helgrind hit from a real application:
>
> ==13412== Possible data race during write of size 8 at 0x3498C6A8 by
> thread #1
> ==13412== Locks held: 2, at addresses 0x1AF09530 0x2B3DF400
> ==13412== at 0x1F040C99: _mesa_hash_table_remove
> (hash_table.c:395)
> ==13412== by 0x1EE98174: _mesa_HashRemove_unlocked (hash.c:350)
> ==13412== by 0x1EE98174: _mesa_HashRemove (hash.c:365)
> ==13412== by 0x1EE2372D: _mesa_DeleteFramebuffers
> (fbobject.c:2669)
> ==13412== by 0x6105AA4:
> movit::ResourcePool::cleanup_unlinked_fbos(void*)
> (resource_pool.cpp:473)
> ==13412== by 0x610615B: movit::ResourcePool::release_fbo(unsigned
> int) (resource_pool.cpp:442)
> [...]
> ==13412== This conflicts with a previous read of size 8 by thread #20
> ==13412== Locks held: 2, at addresses 0x1AF09558 0x1AF73318
> ==13412== at 0x1F040CD9: _mesa_hash_table_next_entry
> (hash_table.c:415)
> ==13412== by 0x1EE982A8: _mesa_HashWalk (hash.c:426)
> ==13412== by 0x1EED6DFD: _mesa_update_fbo_texture.part.33
> (teximage.c:2683)
> ==13412== by 0x1EED9410: _mesa_update_fbo_texture
> (teximage.c:3043)
> ==13412== by 0x1EED9410: teximage (teximage.c:3073)
> ==13412== by 0x1EEDA28F: _mesa_TexImage2D (teximage.c:3105)
> ==13412== by 0x166A68: operator() (mixer.cpp:454)
>
> There are many more interactions than just these two possible.
>
> Cc: 11.2 12.0 13.0 <mesa-stable at lists.freedesktop.org>
> Signed-off-by: Steinar H. Gunderson <steinar+mesa at gunderson.no>
> ---
> src/mesa/main/hash.c | 18 +++++++-----------
> 1 file changed, 7 insertions(+), 11 deletions(-)
>
> diff --git a/src/mesa/main/hash.c b/src/mesa/main/hash.c
> index 7d8a5fd..670438a 100644
> --- a/src/mesa/main/hash.c
> +++ b/src/mesa/main/hash.c
> @@ -59,7 +59,6 @@ struct _mesa_HashTable {
> struct hash_table *ht;
> GLuint MaxKey; /**< highest key inserted
> so far */
> mtx_t Mutex; /**< mutual exclusion lock */
> - mtx_t WalkMutex; /**< for _mesa_HashWalk() */
> GLboolean InDeleteAll; /**< Debug check */
> /** Value that would be in the table for DELETED_KEY_VALUE. */
> void *deleted_key_data;
> @@ -129,8 +128,11 @@ _mesa_NewHashTable(void)
> }
>
> _mesa_hash_table_set_deleted_key(table->ht,
> uint_key(DELETED_KEY_VALUE));
> - mtx_init(&table->Mutex, mtx_plain);
> - mtx_init(&table->WalkMutex, mtx_plain);
> + /*
> + * Needs to be recursive, since the callback in
> _mesa_HashWalk()
> + * is allowed to call _mesa_HashRemove().
> + */
> + mtx_init(&table->Mutex, mtx_recursive);
> }
> else {
> _mesa_error_no_memory(__func__);
> @@ -161,7 +163,6 @@ _mesa_DeleteHashTable(struct _mesa_HashTable
> *table)
> _mesa_hash_table_destroy(table->ht, NULL);
>
> mtx_destroy(&table->Mutex);
> - mtx_destroy(&table->WalkMutex);
> free(table);
> }
>
> @@ -401,11 +402,6 @@ _mesa_HashDeleteAll(struct _mesa_HashTable
> *table,
>
> /**
> * Walk over all entries in a hash table, calling callback function
> for each.
> - * Note: we use a separate mutex in this function to avoid a
> recursive
> - * locking deadlock (in case the callback calls _mesa_HashRemove())
> and to
> - * prevent multiple threads/contexts from getting tangled up.
> - * A lock-less version of this function could be used when the table
> will
> - * not be modified.
> * \param table the hash table to walk
> * \param callback the callback function
> * \param userData arbitrary pointer to pass along to the callback
> @@ -422,13 +418,13 @@ _mesa_HashWalk(const struct _mesa_HashTable
> *table,
>
> assert(table);
> assert(callback);
> - mtx_lock(&table2->WalkMutex);
> + mtx_lock(&table2->Mutex);
> hash_table_foreach(table->ht, entry) {
> callback((uintptr_t)entry->key, entry->data, userData);
> }
> if (table->deleted_key_data)
> callback(DELETED_KEY_VALUE, table->deleted_key_data,
> userData);
> - mtx_unlock(&table2->WalkMutex);
> + mtx_unlock(&table2->Mutex);
> }
>
> static void
More information about the mesa-dev
mailing list