[PATCH] drm/ttm: make ttm_bo_unpin more defensive
Christian König
ckoenig.leichtzumerken at gmail.com
Mon Mar 15 18:47:00 UTC 2021
Am 15.03.21 um 18:08 schrieb Thomas Hellström (Intel):
>
> On 3/15/21 11:26 AM, Christian König wrote:
>>
>>
>> Am 13.03.21 um 19:29 schrieb Thomas Hellström (Intel):
>>> Hi, Christian
>>>
>>> On 3/12/21 10:38 AM, Christian König wrote:
>>>> We seem to have some more driver bugs than thought.
>>>>
>>>> Signed-off-by: Christian König <christian.koenig at amd.com>
>>>> ---
>>>> include/drm/ttm/ttm_bo_api.h | 6 ++++--
>>>> 1 file changed, 4 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/include/drm/ttm/ttm_bo_api.h
>>>> b/include/drm/ttm/ttm_bo_api.h
>>>> index 4fb523dfab32..df9fe596e7c5 100644
>>>> --- a/include/drm/ttm/ttm_bo_api.h
>>>> +++ b/include/drm/ttm/ttm_bo_api.h
>>>> @@ -603,9 +603,11 @@ static inline void ttm_bo_pin(struct
>>>> ttm_buffer_object *bo)
>>>> static inline void ttm_bo_unpin(struct ttm_buffer_object *bo)
>>>> {
>>>> dma_resv_assert_held(bo->base.resv);
>>>> - WARN_ON_ONCE(!bo->pin_count);
>>>> WARN_ON_ONCE(!kref_read(&bo->kref));
>>>> - --bo->pin_count;
>>>> + if (bo->pin_count)
>>>> + --bo->pin_count;
>>>> + else
>>>> + WARN_ON_ONCE(true);
>>>> }
>>>> int ttm_mem_evict_first(struct ttm_device *bdev,
>>>
>>> Since I now have been staring for half a year at the code of the
>>> driver that made pinning an art, I have a couple of suggestions
>>> here, Could we use an atomic for pin_count, allowing unlocked
>>> unpinning but require the lock only for pin_count transition 0->1,
>>> (but unlocked for pin_if_already_pinned). In particular I think
>>> vmwgfx would benefit from unlocked unpins. Also if the atomic were a
>>> refcount_t, that would probably give you the above behaviour?
>>
>> Nope, I've considered this as well while moving the pin count into TTM.
>>
>> The problem is that you not only need the BO reserved for 0->1
>> transitions, but also for 1->0 transitions to move the BO on the LRU
>> correctly.
>
> Ah, and there is no way to have us know the correct LRU list without
> reservation?
Not really, there is always the chance that CPU A is reducing the count
from 1->0 while CPU B is doing 0->1 and you end up with a LRU status
which doesn't match the pin count.
We could try to do something like a loop updating the LRU status until
it matches the pin count, but the implications of that are usually
really nasty.
Christian.
>
> /Thomas
>
>
>>
>> Christian.
>>
>>>
>>> /Thomas
>>>
>>>
More information about the dri-devel
mailing list