[PATCH] drm/amdgpu: remove PT BOs when unmapping
Kuehling, Felix
Felix.Kuehling at amd.com
Wed Oct 30 15:47:53 UTC 2019
On 2019-10-30 9:52 a.m., Christian König wrote:
> Am 29.10.19 um 21:06 schrieb Huang, JinHuiEric:
>> The issue is PT BOs are not freed when unmapping VA,
>> which causes vram usage accumulated is huge in some
>> memory stress test, such as kfd big buffer stress test.
>> Function amdgpu_vm_bo_update_mapping() is called by both
>> amdgpu_vm_bo_update() and amdgpu_vm_clear_freed(). The
>> solution is replacing amdgpu_vm_bo_update_mapping() in
>> amdgpu_vm_clear_freed() with removing PT BOs function
>> to save vram usage.
>
> NAK, that is intentional behavior.
>
> Otherwise we can run into out of memory situations when page tables
> need to be allocated again under stress.
That's a bit arbitrary and inconsistent. We are freeing page tables in
other situations, when a mapping uses huge pages in
amdgpu_vm_update_ptes. Why not when a mapping is destroyed completely?
I'm actually a bit surprised that the huge-page handling in
amdgpu_vm_update_ptes isn't kicking in to free up lower-level page
tables when a BO is unmapped.
Regards,
Felix
>
> Regards,
> Christian.
>
>>
>> Change-Id: Ic24e35bff8ca85265b418a642373f189d972a924
>> Signed-off-by: Eric Huang <JinhuiEric.Huang at amd.com>
>> ---
>> drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 56
>> +++++++++++++++++++++++++++++-----
>> 1 file changed, 48 insertions(+), 8 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> index 0f4c3b2..8a480c7 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> @@ -1930,6 +1930,51 @@ static void amdgpu_vm_prt_fini(struct
>> amdgpu_device *adev, struct amdgpu_vm *vm)
>> }
>> /**
>> + * amdgpu_vm_remove_ptes - free PT BOs
>> + *
>> + * @adev: amdgpu device structure
>> + * @vm: amdgpu vm structure
>> + * @start: start of mapped range
>> + * @end: end of mapped entry
>> + *
>> + * Free the page table level.
>> + */
>> +static int amdgpu_vm_remove_ptes(struct amdgpu_device *adev,
>> + struct amdgpu_vm *vm, uint64_t start, uint64_t end)
>> +{
>> + struct amdgpu_vm_pt_cursor cursor;
>> + unsigned shift, num_entries;
>> +
>> + amdgpu_vm_pt_start(adev, vm, start, &cursor);
>> + while (cursor.level < AMDGPU_VM_PTB) {
>> + if (!amdgpu_vm_pt_descendant(adev, &cursor))
>> + return -ENOENT;
>> + }
>> +
>> + while (cursor.pfn < end) {
>> + amdgpu_vm_free_table(cursor.entry);
>> + num_entries = amdgpu_vm_num_entries(adev, cursor.level - 1);
>> +
>> + if (cursor.entry != &cursor.parent->entries[num_entries - 1]) {
>> + /* Next ptb entry */
>> + shift = amdgpu_vm_level_shift(adev, cursor.level - 1);
>> + cursor.pfn += 1ULL << shift;
>> + cursor.pfn &= ~((1ULL << shift) - 1);
>> + cursor.entry++;
>> + } else {
>> + /* Next ptb entry in next pd0 entry */
>> + amdgpu_vm_pt_ancestor(&cursor);
>> + shift = amdgpu_vm_level_shift(adev, cursor.level - 1);
>> + cursor.pfn += 1ULL << shift;
>> + cursor.pfn &= ~((1ULL << shift) - 1);
>> + amdgpu_vm_pt_descendant(adev, &cursor);
>> + }
>> + }
>> +
>> + return 0;
>> +}
>> +
>> +/**
>> * amdgpu_vm_clear_freed - clear freed BOs in the PT
>> *
>> * @adev: amdgpu_device pointer
>> @@ -1949,7 +1994,6 @@ int amdgpu_vm_clear_freed(struct amdgpu_device
>> *adev,
>> struct dma_fence **fence)
>> {
>> struct amdgpu_bo_va_mapping *mapping;
>> - uint64_t init_pte_value = 0;
>> struct dma_fence *f = NULL;
>> int r;
>> @@ -1958,13 +2002,10 @@ int amdgpu_vm_clear_freed(struct
>> amdgpu_device *adev,
>> struct amdgpu_bo_va_mapping, list);
>> list_del(&mapping->list);
>> - if (vm->pte_support_ats &&
>> - mapping->start < AMDGPU_GMC_HOLE_START)
>> - init_pte_value = AMDGPU_PTE_DEFAULT_ATC;
>> + r = amdgpu_vm_remove_ptes(adev, vm,
>> + (mapping->start + 0x1ff) & (~0x1ffll),
>> + (mapping->last + 1) & (~0x1ffll));
>> - r = amdgpu_vm_bo_update_mapping(adev, vm, false, NULL,
>> - mapping->start, mapping->last,
>> - init_pte_value, 0, NULL, &f);
>> amdgpu_vm_free_mapping(adev, vm, mapping, f);
>> if (r) {
>> dma_fence_put(f);
>> @@ -1980,7 +2021,6 @@ int amdgpu_vm_clear_freed(struct amdgpu_device
>> *adev,
>> }
>> return 0;
>> -
>> }
>> /**
>
> _______________________________________________
> amd-gfx mailing list
> amd-gfx at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
More information about the amd-gfx
mailing list