[PATCH] drm/ttm: Fix the memory delay free issue
Deng, Emily
Emily.Deng at amd.com
Mon Jul 15 10:47:52 UTC 2019
Hi Christian,
Do you think we could free all of the BOs that are on the current destroy list and share the same resv, once that resv signals in ttm_bo_cleanup_refs?
Best wishes
Emily Deng
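For illustration, here is a minimal standalone sketch of that idea: once a shared reservation object has signaled, walk the delayed-destroy list and release every BO that points at the same resv, instead of re-checking one BO per pass. All types and names below are invented for the sketch; they are not the real TTM structures, and the real code also has to handle LRU removal and locking.

```c
#include <assert.h>
#include <stdlib.h>

/* Invented stand-ins for the real TTM types, for illustration only. */
struct resv { int signaled; };

struct bo {
	struct resv *resv;
	struct bo *next;	/* singly linked delayed-destroy list */
};

static struct bo *push_bo(struct bo *head, struct resv *r)
{
	struct bo *b = malloc(sizeof(*b));

	b->resv = r;
	b->next = head;
	return b;
}

/*
 * Once @resv has signaled, unlink and free every BO on the destroy
 * list that shares it.  Returns the number of BOs released.
 */
static int release_bos_sharing_resv(struct bo **head, struct resv *resv)
{
	struct bo **pp = head;
	int freed = 0;

	if (!resv->signaled)
		return 0;

	while (*pp) {
		struct bo *cur = *pp;

		if (cur->resv == resv) {
			*pp = cur->next;	/* like list_del_init() */
			free(cur);
			freed++;
		} else {
			pp = &cur->next;
		}
	}
	return freed;
}
```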
>-----Original Message-----
>From: Koenig, Christian <Christian.Koenig at amd.com>
>Sent: Monday, July 15, 2019 5:41 PM
>To: Deng, Emily <Emily.Deng at amd.com>; Zhou, David(ChunMing)
><David1.Zhou at amd.com>
>Cc: amd-gfx at lists.freedesktop.org
>Subject: Re: [PATCH] drm/ttm: Fix the memory delay free issue
>
>> Do you think this doesn't need to be fixed?
>No, when the application is exhausting memory then we can't expect anything
>else here.
>
>See, memory freeing is always delayed until the memory isn't in use any
>more, or until the process is killed after access is prevented (by clearing
>the page tables, for example).
>
>What we could do is maybe look into why we don't block until the memory is
>freed during command submission, but apart from that this sounds like
>perfectly expected behavior.
>
>Regards,
>Christian.
>
>Am 15.07.19 um 11:34 schrieb Deng, Emily:
>> Hi Christian,
>> Because of this behavior, running the Vulkan CTS allocation tests
>exhausts the memory and leads to out-of-memory. Do you think this doesn't
>need to be fixed?
>>
>> Best wishes
>> Emily Deng
>>> -----Original Message-----
>>> From: Koenig, Christian <Christian.Koenig at amd.com>
>>> Sent: Monday, July 15, 2019 5:31 PM
>>> To: Deng, Emily <Emily.Deng at amd.com>; Zhou, David(ChunMing)
>>> <David1.Zhou at amd.com>
>>> Cc: amd-gfx at lists.freedesktop.org
>>> Subject: Re: [PATCH] drm/ttm: Fix the memory delay free issue
>>>
>>> Hi guys,
>>>
>>>> Do you have any suggestions about this? For per-VM BOs, freeing the
>>>> TTM BO always seems to be delayed.
>>> Yeah, and that is correct behavior.
>>>
>>> Since we don't know who is using a per-vm BO we need to wait for all
>>> command submissions in flight when it is freed.
>>>
>>> For this we copy the current state of the shared reservation object
>>> into the private one in ttm_bo_individualize_resv.
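As a rough standalone model of that copy step (toy types invented for illustration, not the real ttm_bo_individualize_resv() or dma_fence code): the dying BO gets a private snapshot of the fences currently on the shared resv, so fences added to the shared resv afterwards no longer delay it.

```c
#include <assert.h>
#include <string.h>

#define MAX_FENCES 16

/* Toy model: ints stand in for struct dma_fence pointers. */
struct toy_resv {
	int fences[MAX_FENCES];
	int count;
};

/*
 * "Individualize": copy the shared resv's current fences into the
 * BO's private resv.  From now on the BO only waits on this snapshot.
 */
static void individualize(struct toy_resv *priv, const struct toy_resv *shared)
{
	memcpy(priv->fences, shared->fences,
	       sizeof(shared->fences[0]) * shared->count);
	priv->count = shared->count;
}
```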
>>>
>>> Regards,
>>> Christian.
>>>
>>> Am 15.07.19 um 08:49 schrieb Deng, Emily:
>>>> Hi David,
>>>> You are right, it will copy per-vm resv.
>>>> But currently it still has the delayed-free issue, which non per-VM
>>>> BOs don't have. Maybe new fences were already appended to this resv
>>> object before the copy.
>>>> Hi Christian,
>>>> Do you have any suggestions about this? For per-VM BOs, freeing the
>>> TTM BO always seems to be delayed.
>>>> Best wishes
>>>> Emily Deng
>>>>> -----Original Message-----
>>>>> From: Zhou, David(ChunMing) <David1.Zhou at amd.com>
>>>>> Sent: Wednesday, July 10, 2019 9:28 PM
>>>>> To: Deng, Emily <Emily.Deng at amd.com>; amd-gfx at lists.freedesktop.org
>>>>> Subject: Re: [PATCH] drm/ttm: Fix the memory delay free issue
>>>>>
>>>>> It doesn't make sense that freeing a BO still uses the per-VM resv.
>>>>>
>>>>> I remember that when a BO is on the release list, its resv is a copy of
>>>>> the per-VM resv. Could you check that?
>>>>>
>>>>> -David
>>>>>
On 2019/7/10 17:29, Emily Deng wrote:
>>>>>> The Vulkan CTS allocation test cases create a series of BOs and
>>>>>> then free them. Because many allocation test cases run in the same
>>>>>> VM, and the per-VM BO feature is enabled, all of those BOs share
>>>>>> the same resv object. Freeing them is slow: each time a BO is
>>>>>> freed, the shared resv is checked for signaling, and the BO is only
>>>>>> released once the resv has signaled. Since the test cases keep
>>>>>> creating BOs, new fences keep being added to the resv, so freeing
>>>>> falls behind allocation and memory is eventually exhausted.
>>>>>> Method:
>>>>>> When the resv signals, release all of the BOs that share the same
>>>>>> resv object.
>>>>>>
>>>>>> Signed-off-by: Emily Deng <Emily.Deng at amd.com>
>>>>>> ---
>>>>>> drivers/gpu/drm/ttm/ttm_bo.c | 29 ++++++++++++++++++++++++-----
>>>>>> 1 file changed, 24 insertions(+), 5 deletions(-)
>>>>>>
>>>>>> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
>>>>>> index f9a3d4c..57ec59b 100644
>>>>>> --- a/drivers/gpu/drm/ttm/ttm_bo.c
>>>>>> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
>>>>>> @@ -543,6 +543,7 @@ static int ttm_bo_cleanup_refs(struct ttm_buffer_object *bo,
>>>>>> {
>>>>>> struct ttm_bo_global *glob = bo->bdev->glob;
>>>>>> struct reservation_object *resv;
>>>>>> + struct ttm_buffer_object *resv_bo, *resv_bo_next;
>>>>>> int ret;
>>>>>>
>>>>>> if (unlikely(list_empty(&bo->ddestroy)))
>>>>>> @@ -566,10 +567,14 @@ static int ttm_bo_cleanup_refs(struct ttm_buffer_object *bo,
>>>>>> interruptible,
>>>>>> 30 * HZ);
>>>>>>
>>>>>> - if (lret < 0)
>>>>>> + if (lret < 0) {
>>>>>> + kref_put(&bo->list_kref, ttm_bo_release_list);
>>>>>> return lret;
>>>>>> - else if (lret == 0)
>>>>>> + }
>>>>>> + else if (lret == 0) {
>>>>>> + kref_put(&bo->list_kref, ttm_bo_release_list);
>>>>>> return -EBUSY;
>>>>>> + }
>>>>>>
>>>>>> spin_lock(&glob->lru_lock);
>>>>>> if (unlock_resv && !kcl_reservation_object_trylock(bo->resv)) {
>>>>>> @@ -582,6 +587,7 @@ static int ttm_bo_cleanup_refs(struct ttm_buffer_object *bo,
>>>>>> * here.
>>>>>> */
>>>>>> spin_unlock(&glob->lru_lock);
>>>>>> + kref_put(&bo->list_kref, ttm_bo_release_list);
>>>>>> return 0;
>>>>>> }
>>>>>> ret = 0;
>>>>>> @@ -591,15 +597,29 @@ static int ttm_bo_cleanup_refs(struct ttm_buffer_object *bo,
>>>>>> if (unlock_resv)
>>>>>> kcl_reservation_object_unlock(bo->resv);
>>>>>> spin_unlock(&glob->lru_lock);
>>>>>> + kref_put(&bo->list_kref, ttm_bo_release_list);
>>>>>> return ret;
>>>>>> }
>>>>>>
>>>>>> ttm_bo_del_from_lru(bo);
>>>>>> list_del_init(&bo->ddestroy);
>>>>>> kref_put(&bo->list_kref, ttm_bo_ref_bug);
>>>>>> -
>>>>>> spin_unlock(&glob->lru_lock);
>>>>>> ttm_bo_cleanup_memtype_use(bo);
>>>>>> + kref_put(&bo->list_kref, ttm_bo_release_list);
>>>>>> +
>>>>>> + spin_lock(&glob->lru_lock);
>>>>>> + list_for_each_entry_safe(resv_bo, resv_bo_next, &bo->bdev->ddestroy, ddestroy) {
>>>>>> + if (resv_bo->resv == bo->resv) {
>>>>>> + ttm_bo_del_from_lru(resv_bo);
>>>>>> + list_del_init(&resv_bo->ddestroy);
>>>>>> + spin_unlock(&glob->lru_lock);
>>>>>> + ttm_bo_cleanup_memtype_use(resv_bo);
>>>>>> + kref_put(&resv_bo->list_kref, ttm_bo_release_list);
>>>>>> + spin_lock(&glob->lru_lock);
>>>>>> + }
>>>>>> + }
>>>>>> + spin_unlock(&glob->lru_lock);
>>>>>>
>>>>>> if (unlock_resv)
>>>>>> kcl_reservation_object_unlock(bo->resv);
>>>>>> @@ -639,9 +659,8 @@ static bool ttm_bo_delayed_delete(struct ttm_bo_device *bdev, bool remove_all)
>>>>>> ttm_bo_cleanup_refs(bo, false, !remove_all, true);
>>>>>> } else {
>>>>>> spin_unlock(&glob->lru_lock);
>>>>>> + kref_put(&bo->list_kref, ttm_bo_release_list);
>>>>>> }
>>>>>> -
>>>>>> - kref_put(&bo->list_kref, ttm_bo_release_list);
>>>>>> spin_lock(&glob->lru_lock);
>>>>>> }
>>>>>> list_splice_tail(&removed, &bdev->ddestroy);