<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<p>Actually I do prevent to remove in-use pts by this:</p>
<p>+ r = amdgpu_vm_remove_ptes(adev, vm,<br>
+ (mapping->start + 0x1ff) & (~0x1ffll),<br>
+ (mapping->last + 1) & (~0x1ffll));</p>
<p>Which is only removing aligned page table for 2M. And I have tested it at least on KFD tests without anything broken.</p>
<p>By the way, I am not familiar with memory staff. This patch is the best I can do for now. Could you take a look at the Jira ticket SWDEV-201443 ? and find the better solution. Thanks!</p>
<p>Regards,</p>
<p>Eric</p>
<div class="moz-cite-prefix">On 2019-10-30 1:57 p.m., Christian König wrote:<br>
</div>
<blockquote type="cite" cite="mid:461cc802-e7c5-f968-1cb4-5e55a306e780@gmail.com">
<div class="moz-cite-prefix">One thing I've forgotten:<br>
<br>
What you could maybe do to improve the situation is to join adjacent ranges in amdgpu_vm_clear_freed(), but I'm not sure how the chances are that the ranges are freed all together.<br>
<br>
The only other alternative I can see would be to check the mappings of a range in amdgpu_update_ptes() and see if you could walk the tree up if the valid flag is not set and there are no mappings left for a page table.
<br>
<br>
Regards,<br>
Christian.<br>
<br>
Am 30.10.19 um 18:42 schrieb Koenig, Christian:<br>
</div>
<blockquote type="cite" cite="mid:b5d9309e-a32b-8243-8c4d-cfd4e77e09e1@amd.com">
<div class="moz-cite-prefix">
<blockquote type="cite">The vaild flag doesn't take effect in this function.</blockquote>
That's irrelevant.<br>
<br>
See what amdgpu_vm_update_ptes() does is to first determine the fragment size:<br>
<blockquote type="cite">amdgpu_vm_fragment(params, frag_start, end, flags, &frag, &frag_end);<br>
</blockquote>
<br>
Then we walk down the tree:<br>
<blockquote type="cite"> amdgpu_vm_pt_start(adev, params->vm, start, &cursor);<br>
while (cursor.pfn < end) {<br>
</blockquote>
<br>
And make sure that the page tables covering the address range are actually allocated:<br>
<blockquote type="cite"> r = amdgpu_vm_alloc_pts(params->adev, params->vm, &cursor);</blockquote>
<br>
Then we update the tables with the flags and addresses and free up subsequent tables in the case of huge pages or freed up areas:<br>
<blockquote type="cite"> /* Free all child entries */<br>
while (cursor.pfn < frag_start) {<br>
amdgpu_vm_free_pts(adev, params->vm, &cursor);<br>
amdgpu_vm_pt_next(adev, &cursor);<br>
}<br>
</blockquote>
<br>
This is the maximum you can free, cause all other page tables are not completely covered by the range and so potentially still in use.<br>
<br>
And I have the strong suspicion that this is what your patch is actually doing wrong. In other words you are also freeing page tables which are only partially covered by the range and so potentially still in use.<br>
<br>
Since we don't have any tracking how many entries in a page table are currently valid and how many are invalid we actually can't implement what you are trying to do here. So the patch is definitely somehow broken.<br>
<br>
Regards,<br>
Christian.<br>
<br>
Am 30.10.19 um 17:55 schrieb Huang, JinHuiEric:<br>
</div>
<blockquote type="cite" cite="mid:b8ad3c90-42d0-512d-5ba0-af330eab30a1@amd.com">
<p>The vaild flag doesn't take effect in this function. amdgpu_vm_alloc_pts() is always executed that only depended on "cursor.pfn < end". The valid flag has only been checked on here for asic below GMC v9:</p>
<p>if (adev->asic_type < CHIP_VEGA10 &&<br>
(flags & AMDGPU_PTE_VALID))...</p>
<p>Regards,</p>
<p>Eric<br>
</p>
<div class="moz-cite-prefix">On 2019-10-30 12:30 p.m., Koenig, Christian wrote:<br>
</div>
<blockquote type="cite" cite="mid:3f4b6803-ec66-44ca-b55a-8bccf4236632@email.android.com">
<div dir="auto">
<div><br>
<div class="gmail_extra"><br>
<div class="gmail_quote">Am 30.10.2019 17:19 schrieb "Huang, JinHuiEric" <a class="moz-txt-link-rfc2396E" href="mailto:JinHuiEric.Huang@amd.com" moz-do-not-send="true">
<JinHuiEric.Huang@amd.com></a>:<br type="attribution">
<blockquote class="quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">
<div>
<p>I tested it that it saves a lot of vram on KFD big buffer stress test. I think there are two reasons:</p>
<p>1. Calling <font size="2"><span style="font-size:11pt">amdgpu_vm_update_ptes() during unmapping will allocate unnecessary pts, because there is no flag to determine if the VA is mapping or unmapping in function
</span></font><br>
<font size="2"><span style="font-size:11pt"><font size="2"><span style="font-size:11pt">amdgpu_vm_update_ptes(). It saves the most of memory.</span></font></span></font></p>
</div>
</blockquote>
</div>
</div>
</div>
<div dir="auto">That's not correct. The valid flag is used for this.</div>
<div dir="auto"><br>
</div>
<div dir="auto">
<div class="gmail_extra">
<div class="gmail_quote">
<blockquote class="quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">
<div>
<p><font size="2"><span style="font-size:11pt"><font size="2"><span style="font-size:11pt">2. Intentionally removing those unmapping pts is logical expectation, although it is not removing so much pts.<br>
</span></font></span></font></p>
</div>
</blockquote>
</div>
</div>
</div>
<div dir="auto">Well I actually don't see a change to what update_ptes is doing and have the strong suspicion that the patch is simply broken.</div>
<div dir="auto"><br>
</div>
<div dir="auto">You either free page tables which are potentially still in use or update_pte doesn't free page tables when the valid but is not set.</div>
<div dir="auto"><br>
</div>
<div dir="auto">Regards,</div>
<div dir="auto">Christian.</div>
<div dir="auto"><br>
</div>
<div dir="auto"><br>
</div>
<div dir="auto"><br>
</div>
<div dir="auto">
<div class="gmail_extra">
<div class="gmail_quote">
<blockquote class="quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">
<div>
<p><font size="2"><span style="font-size:11pt"><font size="2"><span style="font-size:11pt"></span></font></span></font></p>
<p><font size="2"><span style="font-size:11pt"><font size="2"><span style="font-size:11pt">Regards,</span></font></span></font></p>
<p><font size="2"><span style="font-size:11pt"><font size="2"><span style="font-size:11pt">Eric<br>
</span></font></span></font></p>
<div>On 2019-10-30 11:57 a.m., Koenig, Christian wrote:<br>
</div>
<blockquote>
<div dir="auto">
<div><br>
<div><br>
<div class="elided-text">Am 30.10.2019 16:47 schrieb "Kuehling, Felix" <a href="mailto:Felix.Kuehling@amd.com" moz-do-not-send="true">
<Felix.Kuehling@amd.com></a>:<br type="attribution">
<blockquote style="margin:0 0 0
0.8ex;border-left:1px #ccc
solid;padding-left:1ex">
<div><font size="2"><span style="font-size:11pt">
<div>On 2019-10-30 9:52 a.m., Christian König wrote:<br>
> Am 29.10.19 um 21:06 schrieb Huang, JinHuiEric:<br>
>> The issue is PT BOs are not freed when unmapping VA,<br>
>> which causes vram usage accumulated is huge in some<br>
>> memory stress test, such as kfd big buffer stress test.<br>
>> Function amdgpu_vm_bo_update_mapping() is called by both<br>
>> amdgpu_vm_bo_update() and amdgpu_vm_clear_freed(). The<br>
>> solution is replacing amdgpu_vm_bo_update_mapping() in<br>
>> amdgpu_vm_clear_freed() with removing PT BOs function<br>
>> to save vram usage.<br>
><br>
> NAK, that is intentional behavior.<br>
><br>
> Otherwise we can run into out of memory situations when page tables <br>
> need to be allocated again under stress.<br>
<br>
That's a bit arbitrary and inconsistent. We are freeing page tables in <br>
other situations, when a mapping uses huge pages in <br>
amdgpu_vm_update_ptes. Why not when a mapping is destroyed completely?<br>
<br>
I'm actually a bit surprised that the huge-page handling in <br>
amdgpu_vm_update_ptes isn't kicking in to free up lower-level page <br>
tables when a BO is unmapped.<br>
</div>
</span></font></div>
</blockquote>
</div>
</div>
</div>
<div dir="auto"><br>
</div>
<div dir="auto">Well it does free the lower level, and that is already causing problems (that's why I added the reserved space).</div>
<div dir="auto"><br>
</div>
<div dir="auto">What we don't do is freeing the higher levels.</div>
<div dir="auto"><br>
</div>
<div dir="auto">E.g. when you free a 2MB BO we free the lowest level, if we free a 1GB BO we free the two lowest levels etc...</div>
<div dir="auto"><br>
</div>
<div dir="auto">The problem with freeing the higher levels is that you don't know who is also using this. E.g. we would need to check all entries when we unmap one.</div>
<div dir="auto"><br>
</div>
<div dir="auto">It's simply not worth it for a maximum saving of 2MB per VM.</div>
<div dir="auto"><br>
</div>
<div dir="auto">Writing this I'm actually wondering how you ended up in this issue? There shouldn't be much savings from this.</div>
<div dir="auto"><br>
</div>
<div dir="auto">Regards,</div>
<div dir="auto">Christian.</div>
<div dir="auto"><br>
</div>
<div dir="auto">
<div>
<div class="elided-text">
<blockquote style="margin:0 0 0
0.8ex;border-left:1px #ccc
solid;padding-left:1ex">
<div><font size="2"><span style="font-size:11pt">
<div><br>
Regards,<br>
Felix<br>
<br>
<br>
><br>
> Regards,<br>
> Christian.<br>
><br>
>><br>
>> Change-Id: Ic24e35bff8ca85265b418a642373f189d972a924<br>
>> Signed-off-by: Eric Huang <a href="mailto:JinhuiEric.Huang@amd.com" moz-do-not-send="true">
<JinhuiEric.Huang@amd.com></a><br>
>> ---<br>
>> drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 56 <br>
>> +++++++++++++++++++++++++++++-----<br>
>> 1 file changed, 48 insertions(+), 8 deletions(-)<br>
>><br>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c <br>
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c<br>
>> index 0f4c3b2..8a480c7 100644<br>
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c<br>
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c<br>
>> @@ -1930,6 +1930,51 @@ static void amdgpu_vm_prt_fini(struct <br>
>> amdgpu_device *adev, struct amdgpu_vm *vm)<br>
>> }<br>
>> /**<br>
>> + * amdgpu_vm_remove_ptes - free PT BOs<br>
>> + *<br>
>> + * @adev: amdgpu device structure<br>
>> + * @vm: amdgpu vm structure<br>
>> + * @start: start of mapped range<br>
>> + * @end: end of mapped entry<br>
>> + *<br>
>> + * Free the page table level.<br>
>> + */<br>
>> +static int amdgpu_vm_remove_ptes(struct amdgpu_device *adev,<br>
>> + struct amdgpu_vm *vm, uint64_t start, uint64_t end)<br>
>> +{<!-- --><br>
>> + struct amdgpu_vm_pt_cursor cursor;<br>
>> + unsigned shift, num_entries;<br>
>> +<br>
>> + amdgpu_vm_pt_start(adev, vm, start, &cursor);<br>
>> + while (cursor.level < AMDGPU_VM_PTB) {<!-- --><br>
>> + if (!amdgpu_vm_pt_descendant(adev, &cursor))<br>
>> + return -ENOENT;<br>
>> + }<br>
>> +<br>
>> + while (cursor.pfn < end) {<!-- --><br>
>> + amdgpu_vm_free_table(cursor.entry);<br>
>> + num_entries = amdgpu_vm_num_entries(adev, cursor.level - 1);<br>
>> +<br>
>> + if (cursor.entry != &cursor.parent->entries[num_entries - 1]) {<!-- --><br>
>> + /* Next ptb entry */<br>
>> + shift = amdgpu_vm_level_shift(adev, cursor.level - 1);<br>
>> + cursor.pfn += 1ULL << shift;<br>
>> + cursor.pfn &= ~((1ULL << shift) - 1);<br>
>> + cursor.entry++;<br>
>> + } else {<!-- --><br>
>> + /* Next ptb entry in next pd0 entry */<br>
>> + amdgpu_vm_pt_ancestor(&cursor);<br>
>> + shift = amdgpu_vm_level_shift(adev, cursor.level - 1);<br>
>> + cursor.pfn += 1ULL << shift;<br>
>> + cursor.pfn &= ~((1ULL << shift) - 1);<br>
>> + amdgpu_vm_pt_descendant(adev, &cursor);<br>
>> + }<br>
>> + }<br>
>> +<br>
>> + return 0;<br>
>> +}<br>
>> +<br>
>> +/**<br>
>> * amdgpu_vm_clear_freed - clear freed BOs in the PT<br>
>> *<br>
>> * @adev: amdgpu_device pointer<br>
>> @@ -1949,7 +1994,6 @@ int amdgpu_vm_clear_freed(struct amdgpu_device <br>
>> *adev,<br>
>> struct dma_fence **fence)<br>
>> {<!-- --><br>
>> struct amdgpu_bo_va_mapping *mapping;<br>
>> - uint64_t init_pte_value = 0;<br>
>> struct dma_fence *f = NULL;<br>
>> int r;<br>
>> @@ -1958,13 +2002,10 @@ int amdgpu_vm_clear_freed(struct <br>
>> amdgpu_device *adev,<br>
>> struct amdgpu_bo_va_mapping, list);<br>
>> list_del(&mapping->list);<br>
>> - if (vm->pte_support_ats &&<br>
>> - mapping->start < AMDGPU_GMC_HOLE_START)<br>
>> - init_pte_value = AMDGPU_PTE_DEFAULT_ATC;<br>
>> + r = amdgpu_vm_remove_ptes(adev, vm,<br>
>> + (mapping->start + 0x1ff) & (~0x1ffll),<br>
>> + (mapping->last + 1) & (~0x1ffll));<br>
>> - r = amdgpu_vm_bo_update_mapping(adev, vm, false, NULL,<br>
>> - mapping->start, mapping->last,<br>
>> - init_pte_value, 0, NULL, &f);<br>
>> amdgpu_vm_free_mapping(adev, vm, mapping, f);<br>
>> if (r) {<!-- --><br>
>> dma_fence_put(f);<br>
>> @@ -1980,7 +2021,6 @@ int amdgpu_vm_clear_freed(struct amdgpu_device <br>
>> *adev,<br>
>> }<br>
>> return 0;<br>
>> -<br>
>> }<br>
>> /**<br>
><br>
> _______________________________________________<br>
> amd-gfx mailing list<br>
> <a href="mailto:amd-gfx@lists.freedesktop.org" moz-do-not-send="true">amd-gfx@lists.freedesktop.org</a><br>
> <a href="https://lists.freedesktop.org/mailman/listinfo/amd-gfx" moz-do-not-send="true">
https://lists.freedesktop.org/mailman/listinfo/amd-gfx</a><br>
</div>
</span></font></div>
</blockquote>
</div>
<br>
</div>
</div>
</div>
</blockquote>
</div>
</blockquote>
</div>
<br>
</div>
</div>
</div>
</blockquote>
</blockquote>
<br>
<br>
<fieldset class="mimeAttachmentHeader"></fieldset>
<pre class="moz-quote-pre" wrap="">_______________________________________________
amd-gfx mailing list
<a class="moz-txt-link-abbreviated" href="mailto:amd-gfx@lists.freedesktop.org" moz-do-not-send="true">amd-gfx@lists.freedesktop.org</a>
<a class="moz-txt-link-freetext" href="https://lists.freedesktop.org/mailman/listinfo/amd-gfx" moz-do-not-send="true">https://lists.freedesktop.org/mailman/listinfo/amd-gfx</a></pre>
</blockquote>
<br>
</blockquote>
</body>
</html>