[PATCH libdrm 2/2] drm:fix race issue between two bo functions(v2)

Liu, Monk Monk.Liu at amd.com
Tue Aug 8 03:21:30 UTC 2017


One mistake from me:
> BTW: You can completely merge amdgpu_bo_reference() and
>amdgpu_bo_free_internal() into amdgpu_bo_free() if you like.

Yes, they can be merged, will do...

BR Monk

-----Original Message-----
From: Liu, Monk 
Sent: Tuesday, August 8, 2017 10:57 AM
To: 'Christian König' <deathsimple at vodafone.de>; amd-gfx at lists.freedesktop.org
Subject: RE: [PATCH libdrm 2/2] drm:fix race issue between two bo functions(v2)

Christian

See below code (if it is what you mean) and it shows why I don't understand your point:

(1) I assume you tend to change to below logic:

	/* The buffer already exists, just bump the refcount. */
	if (bo && atomic_inc_return(&dev->bo_table_mutex) > 1) {0
		pthread_mutex_unlock(&dev->bo_table_mutex);

		output->buf_handle = bo;
		output->alloc_size = bo->alloc_size;
		return 0;
	} else {
		atomic_dec(&bo->refcount); //potential segment fault 
	}

If bo is NULL, above function will get you segment fault when CPU go to line "atomic_dec ..."

Besides, I think my original logic works fine, no need to change it at all,

(2)
>And then change amdgpu_bo_free_internal() to remove the key only when the BO is still the correct one.

I don't understand above move, can you give me the details ?


(3)
> BTW: You can completely merge amdgpu_bo_reference() and
>amdgpu_bo_free_internal() into amdgpu_bo_free() if you like.


NO, really don't want, because amdgpu_Bo_reference is used widely in UMD and I don't want to change UMD at all Bo_reference have no way to merge with bo_free_internal, while bo_free_internal() can be merged with bo_free()


BR Monk



-----Original Message-----
From: Christian König [mailto:deathsimple at vodafone.de]
Sent: Monday, August 7, 2017 11:03 PM
To: Liu, Monk <Monk.Liu at amd.com>; amd-gfx at lists.freedesktop.org
Subject: Re: [PATCH libdrm 2/2] drm:fix race issue between two bo functions(v2)

Am 07.08.2017 um 16:35 schrieb Monk Liu:
> From: Monk Liu <monk.liu at amd.com>
>
> there is race issue between two threads on amdgpu_bo_reference and 
> amdgpu_bo_import, this patch tends to fix it by moving the 
> pthread_mutex_lock out of bo_free_internal and move to bo_reference to 
> cover the update_reference part.
>
> The mutex_unlock in bo_import should also cover bo refcount 
> increasement.
>
> Change-Id: I1f65eacf74cd28cc0d3a71ef2f7a19b890d63c29
> Signed-off-by: Monk Liu <monk.liu at amd.com>
> ---
>   amdgpu/amdgpu_bo.c       |  5 +----
>   amdgpu/amdgpu_internal.h | 15 ++++++++++++---
>   2 files changed, 13 insertions(+), 7 deletions(-)
>
> diff --git a/amdgpu/amdgpu_bo.c b/amdgpu/amdgpu_bo.c index
> 07eb743..82f38c0 100644
> --- a/amdgpu/amdgpu_bo.c
> +++ b/amdgpu/amdgpu_bo.c
> @@ -55,14 +55,12 @@ static void amdgpu_close_kms_handle(amdgpu_device_handle dev,
>   void amdgpu_bo_free_internal(amdgpu_bo_handle bo)
>   {
>   	/* Remove the buffer from the hash tables. */
> -	pthread_mutex_lock(&bo->dev->bo_table_mutex);
>   	util_hash_table_remove(bo->dev->bo_handles,
>   			       (void*)(uintptr_t)bo->handle);
>   	if (bo->flink_name) {
>   		util_hash_table_remove(bo->dev->bo_flink_names,
>   				       (void*)(uintptr_t)bo->flink_name);
>   	}
> -	pthread_mutex_unlock(&bo->dev->bo_table_mutex);
>   
>   	/* Release CPU access. */
>   	if (bo->cpu_map_count > 0) {
> @@ -340,10 +338,9 @@ int amdgpu_bo_import(amdgpu_device_handle dev,
>   	}
>   
>   	if (bo) {
> -		pthread_mutex_unlock(&dev->bo_table_mutex);
> -
>   		/* The buffer already exists, just bump the refcount. */
>   		atomic_inc(&bo->refcount);
> +		pthread_mutex_unlock(&dev->bo_table_mutex);


This could be improved a bit. Do something like the following here:

if (bo && atomic_inc_return(&bo->refcount) > 1) {
     pthread_mutex_unlock(&dev->bo_table_mutex);
...
} else {
     /* The BO is about to be destroyed, just create a new one */
     atomic_dec(&bo->refcount)
}

And then change amdgpu_bo_free_internal() to remove the key only when the BO is still the correct one.

BTW: You can completely merge amdgpu_bo_reference() and
amdgpu_bo_free_internal() into amdgpu_bo_free() if you like.

Regards,
Christian.

>   
>   		output->buf_handle = bo;
>   		output->alloc_size = bo->alloc_size; diff --git 
> a/amdgpu/amdgpu_internal.h b/amdgpu/amdgpu_internal.h index
> 0508131..79da0e7 100644
> --- a/amdgpu/amdgpu_internal.h
> +++ b/amdgpu/amdgpu_internal.h
> @@ -143,7 +143,7 @@ void amdgpu_vamgr_deinit(struct amdgpu_bo_va_mgr *mgr);
>   uint64_t amdgpu_vamgr_find_va(struct amdgpu_bo_va_mgr *mgr, uint64_t size,
>   				uint64_t alignment, uint64_t base_required);
>   
> -void amdgpu_vamgr_free_va(struct amdgpu_bo_va_mgr *mgr, uint64_t va,
> +void amdgpu_vamgr_free_va(struct amdgpu_bo_va_mgr *mgr, uint64_t va,
>   				uint64_t size);
>   
>   int amdgpu_query_gpu_info_init(amdgpu_device_handle dev); @@ -193,8
> +193,17 @@ static inline bool update_references(atomic_t *dst, 
> +atomic_t *src)
>   static inline void amdgpu_bo_reference(struct amdgpu_bo **dst,
>   					struct amdgpu_bo *src)
>   {
> -	if (update_references(&(*dst)->refcount, &src->refcount))
> -		amdgpu_bo_free_internal(*dst);
> +	pthread_mutex_t *mlock;
> +	struct amdgpu_bo* bo = *dst;
> +
> +	assert(bo != NULL);
> +	mlock = &bo->dev->bo_table_mutex;
> +	pthread_mutex_lock(mlock);
> +
> +	if (update_references(&bo->refcount, src?&src->refcount:NULL))
> +		amdgpu_bo_free_internal(bo);
> +
> +	pthread_mutex_unlock(mlock);
>   	*dst = src;
>   }
>   




More information about the amd-gfx mailing list