[PATCH 2/2] drm/ttm: completely rework ttm_bo_delayed_delete
Thomas Hellstrom
thomas at shipmail.org
Thu Dec 14 14:02:09 UTC 2017
On 12/14/2017 02:17 PM, Christian König wrote:
> Am 14.12.2017 um 08:24 schrieb Thomas Hellstrom:
>> On 12/13/2017 09:55 PM, Thomas Hellstrom wrote:
>>> Hi, Christian,
>>>
>>> While this has probably already been committed, and looks like a
>>> nice cleanup, there are two things below that I think need fixing.
>>>
>>> On 11/15/2017 01:31 PM, Christian König wrote:
>>>> There is no guarantee that the next entry on the ddestroy list stays on
>>>> the list when we drop the locks.
>>>>
>>>> Completely rework this mess by moving processed entries onto a
>>>> temporary list.
>>>>
>>>> Signed-off-by: Christian König <christian.koenig at amd.com>
>>>> ---
>>>>   drivers/gpu/drm/ttm/ttm_bo.c | 77 ++++++++++++++------------------------------
>>>>   1 file changed, 25 insertions(+), 52 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
>>>> index 7c1eac4f4b4b..ad0afdd71f21 100644
>>>> --- a/drivers/gpu/drm/ttm/ttm_bo.c
>>>> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
>>>> @@ -572,71 +572,47 @@ static int ttm_bo_cleanup_refs(struct ttm_buffer_object *bo,
>>>>   * Traverse the delayed list, and call ttm_bo_cleanup_refs on all
>>>>   * encountered buffers.
>>>>   */
>>>> -
>>>> -static int ttm_bo_delayed_delete(struct ttm_bo_device *bdev, bool remove_all)
>>>> +static bool ttm_bo_delayed_delete(struct ttm_bo_device *bdev, bool remove_all)
>>>>  {
>>>>          struct ttm_bo_global *glob = bdev->glob;
>>>> -        struct ttm_buffer_object *entry = NULL;
>>>> -        int ret = 0;
>>>> -
>>>> -        spin_lock(&glob->lru_lock);
>>>> -        if (list_empty(&bdev->ddestroy))
>>>> -                goto out_unlock;
>>>> +        struct list_head removed;
>>>> +        bool empty;
>>>>
>>>> -        entry = list_first_entry(&bdev->ddestroy,
>>>> -                struct ttm_buffer_object, ddestroy);
>>>> -        kref_get(&entry->list_kref);
>>>> +        INIT_LIST_HEAD(&removed);
>>>>
>>>> -        for (;;) {
>>>> -                struct ttm_buffer_object *nentry = NULL;
>>>> -
>>>> -                if (entry->ddestroy.next != &bdev->ddestroy) {
>>>> -                        nentry = list_first_entry(&entry->ddestroy,
>>>> -                                struct ttm_buffer_object, ddestroy);
>>>> -                        kref_get(&nentry->list_kref);
>>>> -                }
>>>> -
>>>> -                ret = reservation_object_trylock(entry->resv) ? 0 : -EBUSY;
>>>> -                if (remove_all && ret) {
>>>> -                        spin_unlock(&glob->lru_lock);
>>>> -                        ret = reservation_object_lock(entry->resv, NULL);
>>>> -                        spin_lock(&glob->lru_lock);
>>>> -                }
>>>> +        spin_lock(&glob->lru_lock);
>>>> +        while (!list_empty(&bdev->ddestroy)) {
>>>> +                struct ttm_buffer_object *bo;
>>>>
>>>> -                if (!ret)
>>>> -                        ret = ttm_bo_cleanup_refs(entry, false, !remove_all,
>>>> -                                                  true);
>>>> -                else
>>>> -                        spin_unlock(&glob->lru_lock);
>>>> +                bo = list_first_entry(&bdev->ddestroy, struct ttm_buffer_object,
>>>> +                                      ddestroy);
>>>> +                kref_get(&bo->list_kref);
>>>> +                list_move_tail(&bo->ddestroy, &removed);
>>>> +                spin_unlock(&glob->lru_lock);
>>>>
>>>> -                kref_put(&entry->list_kref, ttm_bo_release_list);
>>>> -                entry = nentry;
>>>> +                reservation_object_lock(bo->resv, NULL);
>>>
>>> Reservation may be a long-lived lock, and typically if the object is
>>> reserved here, it's being evicted somewhere and there might be a
>>> substantial stall, which isn't really acceptable in the global
>>> workqueue. Better to move on to the next bo.
>>> This function was really intended to be non-blocking, unless
>>> remove_all == true. I even think it's safe to keep the spinlock held
>>> on tryreserve?
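I.e., something along these lines on top of the new loop (rough,
untested sketch; here the trylock happens after the LRU lock is
dropped, as in the new loop, though it could equally be attempted
under the spinlock). A busy BO just stays on the temporary "removed"
list and, assuming that list is spliced back onto ddestroy at the end
of the function, is simply retried on a later pass:

        /* Called with glob->lru_lock already dropped, as in the new
         * loop above. Only block when remove_all forces us to reap
         * everything, e.g. at device destruction. */
        if (!reservation_object_trylock(bo->resv)) {
                if (remove_all) {
                        reservation_object_lock(bo->resv, NULL);
                } else {
                        /* Busy: skip it rather than stalling the
                         * global workqueue on a long-lived
                         * reservation. */
                        kref_put(&bo->list_kref, ttm_bo_release_list);
                        spin_lock(&glob->lru_lock);
                        continue;
                }
        }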
>
> This approach doesn't really work with shared reservation objects and
> was actually the original reason why I stumbled over the function
> being a bit buggy.
>
Actually I think it was correct, but very non-deterministic and
difficult to follow. Both Maarten and I took passes at trying to make
it look better, but failed :(. It relied on the ddestroy list node
either being on the ddestroy list or not on a list at all. So if the
node's ->next pointer still pointed into the ddestroy list, we'd
continue the iteration from there, even though the object might have
moved on the list. But it looks much better now.
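For reference, the look-ahead that implemented this in the old code
(quoted in the removed lines above) was:

        /* Under glob->lru_lock: if entry's ->next doesn't point back
         * at the list head, there is a successor to take a reference
         * on. This only works because the node is either on the
         * ddestroy list or on no list at all. */
        if (entry->ddestroy.next != &bdev->ddestroy) {
                nentry = list_first_entry(&entry->ddestroy,
                                          struct ttm_buffer_object,
                                          ddestroy);
                kref_get(&nentry->list_kref);
        }

The reference on nentry kept the look-ahead node alive, but nothing
guaranteed it was still entry's successor once the locks had been
dropped and re-taken.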
> The reservation object is a really busy lock because of all the
> command submission going on. So when you only have a single trylock
> every few ms, it is rather unlikely that you can actually grab it. We
> ended up with tons of objects on the ddestroy list which couldn't be
> reaped because of this.
OK, I see.
>
>
> At least amdgpu tries to avoid waiting for any GPU operation while
> holding the reservation locks, so for us that shouldn't be an issue
> any more. And I'm going to pipeline delayed deletes as well rather
> soon.
>
> What we could do here is to use a test like "bo->resv ==
> &bo->ttm_resv" and only trylock if it isn't a shared reservation
> object. How about that?
Either that or a private workqueue separate from the global one is fine
with me.
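For reference, that test could look something like this in the new
loop, in place of the unconditional blocking lock (untested sketch;
the skip path relies on the busy BO being spliced back from the
temporary list, as in the sketch above):

        /* bo->resv == &bo->ttm_resv means the BO uses its own,
         * non-shared reservation object. */
        if (bo->resv != &bo->ttm_resv) {
                /* Shared reservation object: a trylock almost never
                 * succeeds under command submission, so block. */
                reservation_object_lock(bo->resv, NULL);
        } else if (!reservation_object_trylock(bo->resv)) {
                /* Private and busy: likely a long-lived eviction,
                 * skip the BO and let a later pass reap it. */
                kref_put(&bo->list_kref, ttm_bo_release_list);
                spin_lock(&glob->lru_lock);
                continue;
        }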
Thanks,
Thomas
>
> Regards,
> Christian.
>
>>>
>>>> -                if (ret || !entry)
>>>> -                        goto out;
>>>> +                spin_lock(&glob->lru_lock);
>>>> +                ttm_bo_cleanup_refs(bo, false, !remove_all, true);
>>>> +                kref_put(&bo->list_kref, ttm_bo_release_list);
>>>
>>> Calling a release function in atomic context is a bad thing. Nobody
>>> knows what locks need to be taken in the release function, and such
>>> code is prone to lock inversion and sleep-while-atomic bugs. Not
>>> long ago vfree() was even forbidden from atomic context. But here
>>> it's easily avoidable.
>>
>> Hmm. It actually looks like ttm_bo_cleanup_refs unlocks the
>> glob->lru_lock just like ttm_bo_cleanup_refs_and_unlock did, so my
>> latter comment actually isn't correct. Intuitively, removing the
>> "unlock" suffix from the function name would suggest that the
>> unlocking functionality went away, but that doesn't seem to be the
>> case. Also, the commit message "needed for the next patch" isn't
>> very helpful when the next patch is actually committed much later...
>>
>> The first comment about trylocking still holds, though.
>>
>> /Thomas
>>
>>
>>
>>>
>>> /Thomas
>>>
>>>
>>> _______________________________________________
>>> dri-devel mailing list
>>> dri-devel at lists.freedesktop.org
>>> https://lists.freedesktop.org/mailman/listinfo/dri-devel
>>
>>