<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<div class="moz-cite-prefix">One thing I've forgotten:<br>
<br>
What you could maybe do to improve the situation is to join
adjacent ranges in amdgpu_vm_clear_freed(), but I'm not sure how
the chances are that the ranges are freed all together.<br>
<br>
The only other alternative I can see would be to check the
mappings of a range in amdgpu_update_ptes() and see if you could
walk the tree up if the valid flag is not set and there are no
mappings left for a page table. <br>
<br>
Regards,<br>
Christian.<br>
<br>
Am 30.10.19 um 18:42 schrieb Koenig, Christian:<br>
</div>
<blockquote type="cite"
cite="mid:b5d9309e-a32b-8243-8c4d-cfd4e77e09e1@amd.com">
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<div class="moz-cite-prefix">
<blockquote type="cite">The vaild flag doesn't take effect in
this function.</blockquote>
That's irrelevant.<br>
<br>
See what amdgpu_vm_update_ptes() does is to first determine the
fragment size:<br>
<blockquote type="cite">amdgpu_vm_fragment(params, frag_start,
end, flags, &frag, &frag_end);<br>
</blockquote>
<br>
Then we walk down the tree:<br>
<blockquote type="cite"> amdgpu_vm_pt_start(adev,
params->vm, start, &cursor);<br>
while (cursor.pfn < end) {<br>
</blockquote>
<br>
And make sure that the page tables covering the address range
are actually allocated:<br>
<blockquote type="cite"> r =
amdgpu_vm_alloc_pts(params->adev, params->vm,
&cursor);</blockquote>
<br>
Then we update the tables with the flags and addresses and free
up subsequent tables in the case of huge pages or freed up
areas:<br>
<blockquote type="cite"> /* Free all
child entries */<br>
while (cursor.pfn < frag_start) {<br>
amdgpu_vm_free_pts(adev,
params->vm, &cursor);<br>
amdgpu_vm_pt_next(adev,
&cursor);<br>
}<br>
</blockquote>
<br>
This is the maximum you can free, cause all other page tables
are not completely covered by the range and so potentially still
in use.<br>
<br>
And I have the strong suspicion that this is what your patch is
actually doing wrong. In other words you are also freeing page
tables which are only partially covered by the range and so
potentially still in use.<br>
<br>
Since we don't have any tracking how many entries in a page
table are currently valid and how many are invalid we actually
can't implement what you are trying to do here. So the patch is
definitely somehow broken.<br>
<br>
Regards,<br>
Christian.<br>
<br>
Am 30.10.19 um 17:55 schrieb Huang, JinHuiEric:<br>
</div>
<blockquote type="cite"
cite="mid:b8ad3c90-42d0-512d-5ba0-af330eab30a1@amd.com">
<p>The vaild flag doesn't take effect in this function.
amdgpu_vm_alloc_pts() is always executed that only depended on
"cursor.pfn < end". The valid flag has only been checked on
here for asic below GMC v9:</p>
<p>if (adev->asic_type < CHIP_VEGA10 &&<br>
(flags & AMDGPU_PTE_VALID))...</p>
<p>Regards,</p>
<p>Eric<br>
</p>
<div class="moz-cite-prefix">On 2019-10-30 12:30 p.m., Koenig,
Christian wrote:<br>
</div>
<blockquote type="cite"
cite="mid:3f4b6803-ec66-44ca-b55a-8bccf4236632@email.android.com">
<div dir="auto">
<div><br>
<div class="gmail_extra"><br>
<div class="gmail_quote">Am 30.10.2019 17:19 schrieb
"Huang, JinHuiEric" <a class="moz-txt-link-rfc2396E"
href="mailto:JinHuiEric.Huang@amd.com"
moz-do-not-send="true">
<JinHuiEric.Huang@amd.com></a>:<br
type="attribution">
<blockquote class="quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">
<div>
<p>I tested it that it saves a lot of vram on KFD
big buffer stress test. I think there are two
reasons:</p>
<p>1. Calling <font size="2"><span
style="font-size:11pt">amdgpu_vm_update_ptes()
during unmapping will allocate unnecessary
pts, because there is no flag to determine
if the VA is mapping or unmapping in
function
</span></font><br>
<font size="2"><span style="font-size:11pt"><font
size="2"><span style="font-size:11pt">amdgpu_vm_update_ptes().
It saves the most of memory.</span></font></span></font></p>
</div>
</blockquote>
</div>
</div>
</div>
<div dir="auto">That's not correct. The valid flag is used
for this.</div>
<div dir="auto"><br>
</div>
<div dir="auto">
<div class="gmail_extra">
<div class="gmail_quote">
<blockquote class="quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">
<div>
<p><font size="2"><span style="font-size:11pt"><font
size="2"><span style="font-size:11pt">2.
Intentionally removing those unmapping
pts is logical expectation, although it
is not removing so much pts.<br>
</span></font></span></font></p>
</div>
</blockquote>
</div>
</div>
</div>
<div dir="auto">Well I actually don't see a change to what
update_ptes is doing and have the strong suspicion that
the patch is simply broken.</div>
<div dir="auto"><br>
</div>
<div dir="auto">You either free page tables which are
potentially still in use or update_pte doesn't free page
tables when the valid but is not set.</div>
<div dir="auto"><br>
</div>
<div dir="auto">Regards,</div>
<div dir="auto">Christian.</div>
<div dir="auto"><br>
</div>
<div dir="auto"><br>
</div>
<div dir="auto"><br>
</div>
<div dir="auto">
<div class="gmail_extra">
<div class="gmail_quote">
<blockquote class="quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">
<div>
<p><font size="2"><span style="font-size:11pt"><font
size="2"><span style="font-size:11pt"></span></font></span></font></p>
<p><font size="2"><span style="font-size:11pt"><font
size="2"><span style="font-size:11pt">Regards,</span></font></span></font></p>
<p><font size="2"><span style="font-size:11pt"><font
size="2"><span style="font-size:11pt">Eric<br>
</span></font></span></font></p>
<div>On 2019-10-30 11:57 a.m., Koenig, Christian
wrote:<br>
</div>
<blockquote>
<div dir="auto">
<div><br>
<div><br>
<div class="elided-text">Am 30.10.2019
16:47 schrieb "Kuehling, Felix" <a
href="mailto:Felix.Kuehling@amd.com"
moz-do-not-send="true">
<Felix.Kuehling@amd.com></a>:<br
type="attribution">
<blockquote style="margin:0 0 0
0.8ex;border-left:1px #ccc
solid;padding-left:1ex">
<div><font size="2"><span
style="font-size:11pt">
<div>On 2019-10-30 9:52 a.m.,
Christian König wrote:<br>
> Am 29.10.19 um 21:06
schrieb Huang, JinHuiEric:<br>
>> The issue is PT BOs
are not freed when unmapping
VA,<br>
>> which causes vram
usage accumulated is huge in
some<br>
>> memory stress test,
such as kfd big buffer stress
test.<br>
>> Function
amdgpu_vm_bo_update_mapping()
is called by both<br>
>> amdgpu_vm_bo_update()
and amdgpu_vm_clear_freed().
The<br>
>> solution is replacing
amdgpu_vm_bo_update_mapping()
in<br>
>>
amdgpu_vm_clear_freed() with
removing PT BOs function<br>
>> to save vram usage.<br>
><br>
> NAK, that is intentional
behavior.<br>
><br>
> Otherwise we can run into
out of memory situations when
page tables <br>
> need to be allocated
again under stress.<br>
<br>
That's a bit arbitrary and
inconsistent. We are freeing
page tables in <br>
other situations, when a
mapping uses huge pages in <br>
amdgpu_vm_update_ptes. Why not
when a mapping is destroyed
completely?<br>
<br>
I'm actually a bit surprised
that the huge-page handling in
<br>
amdgpu_vm_update_ptes isn't
kicking in to free up
lower-level page <br>
tables when a BO is unmapped.<br>
</div>
</span></font></div>
</blockquote>
</div>
</div>
</div>
<div dir="auto"><br>
</div>
<div dir="auto">Well it does free the lower
level, and that is already causing problems
(that's why I added the reserved space).</div>
<div dir="auto"><br>
</div>
<div dir="auto">What we don't do is freeing
the higher levels.</div>
<div dir="auto"><br>
</div>
<div dir="auto">E.g. when you free a 2MB BO we
free the lowest level, if we free a 1GB BO
we free the two lowest levels etc...</div>
<div dir="auto"><br>
</div>
<div dir="auto">The problem with freeing the
higher levels is that you don't know who is
also using this. E.g. we would need to check
all entries when we unmap one.</div>
<div dir="auto"><br>
</div>
<div dir="auto">It's simply not worth it for a
maximum saving of 2MB per VM.</div>
<div dir="auto"><br>
</div>
<div dir="auto">Writing this I'm actually
wondering how you ended up in this issue?
There shouldn't be much savings from this.</div>
<div dir="auto"><br>
</div>
<div dir="auto">Regards,</div>
<div dir="auto">Christian.</div>
<div dir="auto"><br>
</div>
<div dir="auto">
<div>
<div class="elided-text">
<blockquote style="margin:0 0 0
0.8ex;border-left:1px #ccc
solid;padding-left:1ex">
<div><font size="2"><span
style="font-size:11pt">
<div><br>
Regards,<br>
Felix<br>
<br>
<br>
><br>
> Regards,<br>
> Christian.<br>
><br>
>><br>
>> Change-Id:
Ic24e35bff8ca85265b418a642373f189d972a924<br>
>> Signed-off-by: Eric
Huang <a
href="mailto:JinhuiEric.Huang@amd.com"
moz-do-not-send="true">
<JinhuiEric.Huang@amd.com></a><br>
>> ---<br>
>>
drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
| 56 <br>
>>
+++++++++++++++++++++++++++++-----<br>
>> 1 file changed, 48
insertions(+), 8 deletions(-)<br>
>><br>
>> diff --git
a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
<br>
>>
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c<br>
>> index
0f4c3b2..8a480c7 100644<br>
>> ---
a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c<br>
>> +++
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c<br>
>> @@ -1930,6 +1930,51
@@ static void
amdgpu_vm_prt_fini(struct <br>
>> amdgpu_device *adev,
struct amdgpu_vm *vm)<br>
>> }<br>
>> /**<br>
>> + *
amdgpu_vm_remove_ptes - free
PT BOs<br>
>> + *<br>
>> + * @adev: amdgpu
device structure<br>
>> + * @vm: amdgpu vm
structure<br>
>> + * @start: start of
mapped range<br>
>> + * @end: end of
mapped entry<br>
>> + *<br>
>> + * Free the page
table level.<br>
>> + */<br>
>> +static int
amdgpu_vm_remove_ptes(struct
amdgpu_device *adev,<br>
>> + struct
amdgpu_vm *vm, uint64_t start,
uint64_t end)<br>
>> +{<!-- --><br>
>> + struct
amdgpu_vm_pt_cursor cursor;<br>
>> + unsigned shift,
num_entries;<br>
>> +<br>
>> +
amdgpu_vm_pt_start(adev, vm,
start, &cursor);<br>
>> + while
(cursor.level <
AMDGPU_VM_PTB) {<!-- --><br>
>> + if
(!amdgpu_vm_pt_descendant(adev,
&cursor))<br>
>> + return
-ENOENT;<br>
>> + }<br>
>> +<br>
>> + while
(cursor.pfn < end) {<!-- --><br>
>> +
amdgpu_vm_free_table(cursor.entry);<br>
>> + num_entries
= amdgpu_vm_num_entries(adev,
cursor.level - 1);<br>
>> +<br>
>> + if
(cursor.entry !=
&cursor.parent->entries[num_entries
- 1]) {<!-- --><br>
>> + /* Next
ptb entry */<br>
>> + shift =
amdgpu_vm_level_shift(adev,
cursor.level - 1);<br>
>> +
cursor.pfn += 1ULL <<
shift;<br>
>> +
cursor.pfn &= ~((1ULL
<< shift) - 1);<br>
>> +
cursor.entry++;<br>
>> + } else {<!-- --><br>
>> + /* Next
ptb entry in next pd0 entry */<br>
>> +
amdgpu_vm_pt_ancestor(&cursor);<br>
>> + shift =
amdgpu_vm_level_shift(adev,
cursor.level - 1);<br>
>> +
cursor.pfn += 1ULL <<
shift;<br>
>> +
cursor.pfn &= ~((1ULL
<< shift) - 1);<br>
>> +
amdgpu_vm_pt_descendant(adev,
&cursor);<br>
>> + }<br>
>> + }<br>
>> +<br>
>> + return 0;<br>
>> +}<br>
>> +<br>
>> +/**<br>
>> *
amdgpu_vm_clear_freed - clear
freed BOs in the PT<br>
>> *<br>
>> * @adev:
amdgpu_device pointer<br>
>> @@ -1949,7 +1994,6 @@
int
amdgpu_vm_clear_freed(struct
amdgpu_device <br>
>> *adev,<br>
>>
struct dma_fence **fence)<br>
>> {<!-- --><br>
>> struct
amdgpu_bo_va_mapping *mapping;<br>
>> - uint64_t
init_pte_value = 0;<br>
>> struct
dma_fence *f = NULL;<br>
>> int r;<br>
>> @@ -1958,13
+2002,10 @@ int
amdgpu_vm_clear_freed(struct <br>
>> amdgpu_device *adev,<br>
>> struct
amdgpu_bo_va_mapping, list);<br>
>>
list_del(&mapping->list);<br>
>> - if
(vm->pte_support_ats
&&<br>
>> -
mapping->start <
AMDGPU_GMC_HOLE_START)<br>
>> -
init_pte_value =
AMDGPU_PTE_DEFAULT_ATC;<br>
>> + r =
amdgpu_vm_remove_ptes(adev,
vm,<br>
>> +
(mapping->start + 0x1ff)
& (~0x1ffll),<br>
>> +
(mapping->last + 1) &
(~0x1ffll));<br>
>> - r =
amdgpu_vm_bo_update_mapping(adev,
vm, false, NULL,<br>
>>
-
mapping->start,
mapping->last,<br>
>>
-
init_pte_value, 0, NULL,
&f);<br>
>>
amdgpu_vm_free_mapping(adev,
vm, mapping, f);<br>
>> if (r) {<!-- --><br>
>>
dma_fence_put(f);<br>
>> @@ -1980,7 +2021,6 @@
int
amdgpu_vm_clear_freed(struct
amdgpu_device <br>
>> *adev,<br>
>> }<br>
>> return 0;<br>
>> -<br>
>> }<br>
>> /**<br>
><br>
>
_______________________________________________<br>
> amd-gfx mailing list<br>
> <a
href="mailto:amd-gfx@lists.freedesktop.org"
moz-do-not-send="true">amd-gfx@lists.freedesktop.org</a><br>
> <a
href="https://lists.freedesktop.org/mailman/listinfo/amd-gfx"
moz-do-not-send="true">
https://lists.freedesktop.org/mailman/listinfo/amd-gfx</a><br>
</div>
</span></font></div>
</blockquote>
</div>
<br>
</div>
</div>
</div>
</blockquote>
</div>
</blockquote>
</div>
<br>
</div>
</div>
</div>
</blockquote>
</blockquote>
<br>
<br>
<fieldset class="mimeAttachmentHeader"></fieldset>
<pre class="moz-quote-pre" wrap="">_______________________________________________
amd-gfx mailing list
<a class="moz-txt-link-abbreviated" href="mailto:amd-gfx@lists.freedesktop.org">amd-gfx@lists.freedesktop.org</a>
<a class="moz-txt-link-freetext" href="https://lists.freedesktop.org/mailman/listinfo/amd-gfx">https://lists.freedesktop.org/mailman/listinfo/amd-gfx</a></pre>
</blockquote>
<br>
</body>
</html>