[PATCH] drm/ttm: make ttm_bo_unpin more defensive

Thomas Hellström (Intel) thomas_os at shipmail.org
Tue Mar 16 10:38:53 UTC 2021


Hi,

On 3/16/21 10:27 AM, Daniel Vetter wrote:
> On Mon, Mar 15, 2021 at 08:00:30PM +0100, Thomas Hellström (Intel) wrote:
>> On 3/15/21 7:47 PM, Christian König wrote:
>>>
>>> Am 15.03.21 um 18:08 schrieb Thomas Hellström (Intel):
>>>> On 3/15/21 11:26 AM, Christian König wrote:
>>>>>
>>>>> Am 13.03.21 um 19:29 schrieb Thomas Hellström (Intel):
>>>>>> Hi, Christian
>>>>>>
>>>>>> On 3/12/21 10:38 AM, Christian König wrote:
>>>>>>> We seem to have some more driver bugs than thought.
>>>>>>>
>>>>>>> Signed-off-by: Christian König <christian.koenig at amd.com>
>>>>>>> ---
>>>>>>>    include/drm/ttm/ttm_bo_api.h | 6 ++++--
>>>>>>>    1 file changed, 4 insertions(+), 2 deletions(-)
>>>>>>>
>>>>>>> diff --git a/include/drm/ttm/ttm_bo_api.h
>>>>>>> b/include/drm/ttm/ttm_bo_api.h
>>>>>>> index 4fb523dfab32..df9fe596e7c5 100644
>>>>>>> --- a/include/drm/ttm/ttm_bo_api.h
>>>>>>> +++ b/include/drm/ttm/ttm_bo_api.h
>>>>>>> @@ -603,9 +603,11 @@ static inline void
>>>>>>> ttm_bo_pin(struct ttm_buffer_object *bo)
>>>>>>>    static inline void ttm_bo_unpin(struct ttm_buffer_object *bo)
>>>>>>>    {
>>>>>>>        dma_resv_assert_held(bo->base.resv);
>>>>>>> -    WARN_ON_ONCE(!bo->pin_count);
>>>>>>>        WARN_ON_ONCE(!kref_read(&bo->kref));
>>>>>>> -    --bo->pin_count;
>>>>>>> +    if (bo->pin_count)
>>>>>>> +        --bo->pin_count;
>>>>>>> +    else
>>>>>>> +        WARN_ON_ONCE(true);
>>>>>>>    }
>>>>>>>      int ttm_mem_evict_first(struct ttm_device *bdev,
>>>>>> Since I now have been staring for half a year at the code of
>>>>>> the driver that made pinning an art, I have a couple of
>>>>>> suggestions here, Could we use an atomic for pin_count,
>>>>>> allowing unlocked unpinning but require the lock only for
>>>>>> pin_count transition 0->1, (but unlocked for
>>>>>> pin_if_already_pinned). In particular I think vmwgfx would
>>>>>> benefit from unlocked unpins. Also if the atomic were a
>>>>>> refcount_t, that would probably give you the above
>>>>>> behaviour?
>>>>> Nope, I've considered this as well while moving the pin count into TTM.
>>>>>
>>>>> The problem is that you not only need the BO reserved for 0->1
>>>>> transitions, but also for 1->0 transitions to move the BO on the
>>>>> LRU correctly.
>>>> Ah, and there is no way to have us know the correct LRU list without
>>>> reservation?
>>> Not really, there is always the chance that CPU A is reducing the count
>>> from 1->0 while CPU B is doing 0->1 and you end up with a LRU status
>>> which doesn't match the pin count.
>>>
>>> We could try to do something like a loop updating the LRU status until
>>> it matches the pin count, but the implications of that are usually
>>> really nasty.
>>>
>> OK, yeah I was more thinking along the lines of protecting the LRU status
>> with the global lru lock and unpin would then be
>>
>> if (refcount_dec_and_lock(&bo->pin_count, &ttm_glob.lru_lock)) {
>>      add_to_relevant_lrus(bo, bo->lru_status);
>>      spin_unlock(&ttm_glob.lru_lock);
>> }
>>
>> But looking at ttm_bo_move_to_lru_tail() I realize that's not really trivial
>> anymore. I hope it's doable at some point though.
>>
>> But meanwhile, perhaps TTM needs to accept and pave over that drivers are in
>> fact destroying pinned buffers?
> Do we have more trouble than the very fancy tricks vmwgfx does? If so I
> think we could do a small helper that like ttm_dont_check_unpin() to shut
> it up. Since vmwgfx drivers tend to not be loaded with any other drivers
> that shouldn't reduce any test coverage. And allows vmwgfx folks to figure
> out what to do - maybe you do need your own in-house pin/unpin for these
> special bo?
>
> I did try to parse your reply in the other thread, and tbh I didn't grock
> it.

Not sure if you mean the description was unclear or if you thought it 
was a bad idea, but in case the former, what I mean is

static void ttm_bo_pin(struct ttm_buffer_object *bo)
{

dma_resv_assert_held()                        // No surprises if an 
evictor determined that this object is not pinned.

     if (!refcount_inc_not_zero(&bo->pin_count)) { // Could be made 
ttm_bo_pin_if_pinned() and exported if there are users
         spin_lock(&ttm_glob.lru_lock);            // Don't race with 
unpin 1->0
         if (refcount_read(&bo->pin_count) == 0 && bo->lru)
             ttm_bo_del_from_lru(bo);
         refcount_inc(&bo->pin_count);
         spin_unlock(&ttm_glob.lru_lock);
     }
}

static void ttm_bo_unpin(struct ttm_buffer_object *bo)
{
     if (refcount_dec_and_lock(&bo->pin_count, &ttm_glob.lru_lock)) {
         ttm_bo_move_to_lru_tail(bo, bo->lru_mem_type, bo->lru_prio,
                     NULL);
         spin_unlock(&ttm_glob.lru_lock);
     }
}

In addition, bo->lru_mem_type and bo->lru_prio would need to be 
protected by the lru lock, and updated together with the LRU list 
position, which would be the extra complexity in fastpaths. Wouldn't 
that resolve the pin - lru inconsistency?

But yeah if vmwgfx is the only driver hitting trouble because of this, 
then I agree let's leave it for when / if it becomes needed. Having had 
that requirement in the Intel driver would have complicated the dma_resv 
work quite a lot.

/Thomas


>
> Also I think a comment why we need dma_resv (maybe in the kerneldoc for
> pin count), i.e. that it's needed to keep lru state in sync with pin state
> would be really good?
> -Daniel
>
>> /Thomas
>>
>>
>>
>>
>>
>>> Christian.
>>>
>>>> /Thomas
>>>>
>>>>
>>>>> Christian.
>>>>>
>>>>>> /Thomas
>>>>>>
>>>>>>
>> _______________________________________________
>> dri-devel mailing list
>> dri-devel at lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/dri-devel


More information about the dri-devel mailing list