[PATCH 1/4] drm/amdgpu: Optimization of critical section
Christian König
deathsimple at vodafone.de
Tue Jun 13 10:23:00 UTC 2017
On 13.06.2017 at 08:29, axie wrote:
> On 2017-06-12 07:00 PM, Christian König wrote:
>> On 12.06.2017 at 22:31, Alex Xie wrote:
>>> This patch is to move a loop of unref BOs and
>>> several memory free function calls out of
>>> critical sections.
>>>
>>> Signed-off-by: Alex Xie <AlexBin.Xie at amd.com>
>>> ---
>>> drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c | 5 ++++-
>>> 1 file changed, 4 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c
>>> index a664987..02c138f 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c
>>> @@ -75,9 +75,12 @@ static void amdgpu_bo_list_destroy(struct
>>> amdgpu_fpriv *fpriv, int id)
>>> /* Another user may have a reference to this list still */
>>> mutex_lock(&list->lock);
>>> mutex_unlock(&list->lock);
>>> + mutex_unlock(&fpriv->bo_list_lock);
>>> amdgpu_bo_list_free(list);
>>> }
>>> - mutex_unlock(&fpriv->bo_list_lock);
>>> + else {
>>> + mutex_unlock(&fpriv->bo_list_lock);
>>> + }
>>
>> You could move the unlock of bo_list_lock even before the if.
>>
>> But since you pointed it out there is quite a bug in this function:
>>> /* Another user may have a reference to this list
>>> still */
>>> mutex_lock(&list->lock);
>>> mutex_unlock(&list->lock);
>> Not sure if that is still up to date, but that use case used to be
>> illegal with mutexes.
> As I understand this piece of code, this mutex_lock/mutex_unlock
> pair is used to make sure all other tasks have finished
> accessing the bo list. The other side of this story is in function
> amdgpu_bo_list_get. These two pieces of code together make sure
> we can safely destroy the bo list.
Yeah, the idea behind the code is correct, but using mutexes in that way
is illegal; see https://lwn.net/Articles/575460/.
I'm not sure if that is still up to date, but in ancient times you
needed to avoid patterns like this:
mutex_unlock(&obj->lock);
kfree(obj);
Anyway, I suggest we just replace the whole bo_list handling with an RCU-
and refcount-based implementation. That should avoid the locking entirely
for the read-only code path.
Regards,
Christian.
>
> Otherwise we could easily simplify this locking.
> Let me give an example here.
>
> In function amdgpu_bo_list_get, assuming we change the code like this:
> ...
> down_read(&fpriv->bo_list_lock);
> result = idr_find(&fpriv->bo_list_handles, id);
> up_read(&fpriv->bo_list_lock);
> /**Line 1. Task A was scheduled away from CPU**/
> if (result)
> mutex_lock(&result->lock);
> ...
>
> In function amdgpu_bo_list_destroy, assuming we change the code like
> this:
> ...
> down_write(&fpriv->bo_list_lock);
> list = idr_remove(&fpriv->bo_list_handles, id);
> up_write(&fpriv->bo_list_lock);
> if (list) {
> /* Another user may have a reference to this list still */
> mutex_lock(&list->lock);
> mutex_unlock(&list->lock);
> amdgpu_bo_list_free(list);
> }
> }
>
> When task A is running in function amdgpu_bo_list_get at line 1, the CPU
> scheduler takes the CPU away from task A.
> Then task B runs function amdgpu_bo_list_destroy. Task B can run all
> the way through to destroying and freeing the mutex.
> Later, task A resumes. The mutex result->lock has already been destroyed
> by task B, so task A now tries to lock a mutex
> which has been destroyed and freed.
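The interleaving described above can be written out as a timeline (a simplified side-by-side view of the two hypothetical functions, not executable code):

```c
/* Task A (amdgpu_bo_list_get)       Task B (amdgpu_bo_list_destroy)
 *
 * down_read(&fpriv->bo_list_lock);
 * result = idr_find(..., id);
 * up_read(&fpriv->bo_list_lock);
 * <scheduled away at line 1>
 *                                   down_write(&fpriv->bo_list_lock);
 *                                   list = idr_remove(..., id);  // same list
 *                                   up_write(&fpriv->bo_list_lock);
 *                                   mutex_lock(&list->lock);     // uncontended
 *                                   mutex_unlock(&list->lock);
 *                                   amdgpu_bo_list_free(list);   // frees the mutex
 * <resumes>
 * mutex_lock(&result->lock);        // use-after-free
 */
```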
>
>
>>
>> Christian.
>>
>>> }
>>> static int amdgpu_bo_list_set(struct amdgpu_device *adev,
>>
>>
>
> _______________________________________________
> amd-gfx mailing list
> amd-gfx at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx