lock/unlock mismatch in ttm_bo.c
Christian König
christian.koenig at amd.com
Wed Jan 24 09:03:22 UTC 2018
That patch won't work correctly like this.
When the lock is dropped it is possible that the BO is removed from the
delayed-destroy (ddestroy) list and ttm_bo_cleanup_refs() starts to wait for the wrong
reservation object.
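
To make the hazard concrete, here is a minimal user-space sketch, not the
actual TTM code. The names fake_bo, fake_resv, pick_resv(),
concurrent_remove(), racy_choice() and rechecked_choice() are all
hypothetical, and the rule "on the list means wait on the BO's own
reservation object" is an assumption made for the illustration. It shows
that a choice made under the lock goes stale once the lock is dropped:

```c
/* User-space illustration only: every type and helper here is a
 * hypothetical stand-in, not a TTM structure or function. */
#include <pthread.h>
#include <stdbool.h>

struct fake_resv { const char *name; };

struct fake_bo {
    struct fake_resv shared_resv; /* stands in for what bo->resv points to */
    struct fake_resv own_resv;    /* stands in for bo->ttm_resv */
    bool on_ddestroy;             /* still on the delayed-destroy list? */
};

static pthread_mutex_t lru_lock = PTHREAD_MUTEX_INITIALIZER;

/* Assumed rule for this sketch: the object to wait on depends on
 * whether the BO is still on the delayed-destroy list. */
static struct fake_resv *pick_resv(struct fake_bo *bo)
{
    return bo->on_ddestroy ? &bo->own_resv : &bo->shared_resv;
}

/* Simulates another thread that runs while the lock is dropped and
 * takes the BO off the list. */
static void concurrent_remove(struct fake_bo *bo)
{
    pthread_mutex_lock(&lru_lock);
    bo->on_ddestroy = false;
    pthread_mutex_unlock(&lru_lock);
}

/* Racy pattern: decide under the lock, drop it, keep using the decision. */
static struct fake_resv *racy_choice(struct fake_bo *bo)
{
    pthread_mutex_lock(&lru_lock);
    struct fake_resv *chosen = pick_resv(bo);
    pthread_mutex_unlock(&lru_lock);

    concurrent_remove(bo); /* the window opened by dropping the lock */

    return chosen;         /* stale: list membership has changed */
}

/* Safer pattern: after taking the lock again, re-evaluate the state
 * instead of trusting the earlier decision. */
static struct fake_resv *rechecked_choice(struct fake_bo *bo)
{
    pthread_mutex_lock(&lru_lock);
    struct fake_resv *chosen = pick_resv(bo);
    pthread_mutex_unlock(&lru_lock);

    concurrent_remove(bo);

    pthread_mutex_lock(&lru_lock);
    chosen = pick_resv(bo); /* decision redone under the lock */
    pthread_mutex_unlock(&lru_lock);
    return chosen;
}
```

In this sketch racy_choice() still returns the BO's own object even though
the BO has already left the list, while rechecked_choice() returns the
shared one. Always waiting on a single object would remove the decision,
and with it the window, entirely.
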
I think we can remove the wait for bo->resv now and always wait for
bo->ttm_resv, but I'm not 100% sure.
Need to double check the code as well,
Christian.
On 23.01.2018 at 20:25, Tom St Denis wrote:
> On 22/01/18 01:42 AM, Chunming Zhou wrote:
>>
>>
>> On 2018-01-20 02:23, Tom St Denis wrote:
>>> On 19/01/18 01:14 PM, Tom St Denis wrote:
>>>> Hi all,
>>>>
>>>> In the function ttm_bo_cleanup_refs() it seems possible to get to
>>>> line 551 without entering the block on 516 which means you'll be
>>>> unlocking a mutex that wasn't locked.
>>>>
>>>> Now it might be that in the course of the API this pattern cannot
>>>> be expressed but it's not clear from the function alone that that
>>>> is the case.
>>>
>>>
>>> Looking further it seems the behaviour depends on locking in parent
>>> callers. That's kinda a no-no right? Shouldn't the lock be
>>> taken/released in the same function ideally?
>> Same feelings
>>
>> Regards,
>> David Zhou
>
> Attached is a patch that addresses this.
>
> I can't see any obvious race in functions that call
> ttm_bo_cleanup_refs() between the time they let go of the lock and the
> time it's taken again in the call.
>
> Running it on my system doesn't produce anything notable though the
> KASAN with DRI_PRIME=1 issue is still there (this patch neither causes
> that nor fixes it).
>
> Tom