<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<div class="moz-cite-prefix">On 29.03.2018 at 10:37, zhoucm1 wrote:<br>
</div>
<blockquote type="cite"
cite="mid:14672700-82d7-c9d8-1086-84a4d8a711bc@amd.com">
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<p><br>
</p>
<br>
<div class="moz-cite-prefix">On 2018-03-28 16:13, zhoucm1 wrote:<br>
</div>
<blockquote type="cite"
cite="mid:192b6077-88e7-f0f0-6923-b91e3ad5e97b@amd.com">
<p><br>
</p>
<br>
<div class="moz-cite-prefix">On 2018-03-27 21:44, Christian
König wrote:<font size="2"><span style="font-size:11pt;"><br>
<br>
</span></font></div>
<blockquote type="cite"
cite="mid:a9d42a8a-b9b5-1d89-f95e-e678829a8260@amd.com">
<div class="moz-cite-prefix">How about we update the LRU only
when we need to re-validate at least one BO?<br>
</div>
</blockquote>
I tried this just now, but performance still isn't stable; it
sometimes drops to 28fps unexpectedly.<br>
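<br>
A minimal sketch of the "update the LRU only when at least one BO needs re-validation" idea, for reference. It is plain userspace C, not the amdgpu code: sketch_bo, sketch_submit and the tick counter are hypothetical stand-ins, and the returned move count is only a rough proxy for per-submission overhead.<br>
<br>
```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a buffer object; not the amdgpu structure. */
struct sketch_bo {
    bool needs_validation;  /* true if the BO was evicted and must be re-validated */
    unsigned lru_stamp;     /* larger value = closer to the LRU tail */
};

/*
 * Re-validate every BO that needs it, then bump all BOs to the LRU tail
 * only if at least one re-validation happened. Returns the number of
 * LRU moves performed for this submission.
 */
size_t sketch_submit(struct sketch_bo *bos, size_t count, unsigned *tick)
{
    bool revalidated = false;
    size_t moves = 0;

    assert(bos != NULL || count == 0);
    assert(tick != NULL);

    for (size_t i = 0; i < count; i++) {
        if (bos[i].needs_validation) {
            bos[i].needs_validation = false;
            revalidated = true;
        }
    }

    if (!revalidated)
        return 0;   /* nothing was evicted: no LRU traffic at all */

    /* Bump in array order so the relative LRU order is deterministic. */
    for (size_t i = 0; i < count; i++) {
        bos[i].lru_stamp = ++(*tick);
        moves++;
    }
    return moves;
}
```
<br>
In the common case where nothing was evicted, the function returns before touching the LRU at all, so the extra cost is only paid on submissions that actually re-validate something.<br>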
</blockquote>
</blockquote>
<br>
Can you give me the code for that? I probably can't work this week
on that, but I can take a look next week.<br>
<br>
<blockquote type="cite"
cite="mid:14672700-82d7-c9d8-1086-84a4d8a711bc@amd.com">
<blockquote type="cite"
cite="mid:192b6077-88e7-f0f0-6923-b91e3ad5e97b@amd.com"> <br>
I also tried checking num_evictions and updating the LRU only when
an eviction happens, but it still sometimes drops to 28fps unexpectedly.<br>
<br>
When BOs change, we need to keep not only the LRU order but also the
validation order in the vm->evicted list. Any other ideas for
keeping both orders without increasing submission overhead?<br>
</blockquote>
<br>
After more thought, I think we need a new LRU design for per-VM BOs,
one that guarantees the order in which they are added to the LRU. How
about the following idea:<br>
0. Separate the traditional BO list LRU from the per-VM-BO LRU. The
traditional LRU keeps working the old way; the per-VM LRU follows the design below.<br>
1. The TTM bdev maintains a vm/process list.<br>
2. Every vm_list node contains its own per-VM-BO LRU[priority].<br>
3. To manage the vm_list LRU in a specific driver, we need to add
a callback for it.<br>
4. We assign an order to every per-VM BO in that vm/process.<br>
5. To speed up sorting the per-VM LRU, we introduce an RB tree for it
in the callback. The RB tree key is the order.<br>
<br>
This way, we will be able to keep the per-vm-bo LRU order.<br>
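<br>
A rough userspace sketch of steps 4 and 5, only to illustrate the ordering property. The names sketch_vm_bo and sketch_vm_lru_refresh are hypothetical, and a qsort() over an array stands in for the proposed RB tree, so this shows the resulting order, not the real data structure or its cost.<br>
<br>
```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical per-VM BO record; not a TTM or amdgpu structure. */
struct sketch_vm_bo {
    unsigned order;     /* fixed key assigned when the BO joins the VM */
    unsigned lru_pos;   /* position in the per-VM LRU after a refresh */
};

/* Compare two BOs by their order key (ascending). */
static int sketch_cmp_order(const void *a, const void *b)
{
    const struct sketch_vm_bo *x = a;
    const struct sketch_vm_bo *y = b;

    return (x->order > y->order) - (x->order < y->order);
}

/*
 * Rebuild the per-VM LRU so that BOs always appear in ascending order
 * key, regardless of the sequence in which they were last touched.
 * An RB tree keyed on order would keep this sorted incrementally;
 * sorting the whole array here is a simplification.
 */
void sketch_vm_lru_refresh(struct sketch_vm_bo *bos, size_t count)
{
    if (count == 0)
        return;
    assert(bos != NULL);

    qsort(bos, count, sizeof(*bos), sketch_cmp_order);
    for (size_t i = 0; i < count; i++)
        bos[i].lru_pos = (unsigned)i;
}
```
<br>
With unique order keys, two refreshes over the same set of BOs always produce the same LRU sequence, which is the stability property the list above is after.<br>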
<br>
What do you think of it?<br>
</blockquote>
<br>
No, we need a single LRU shared by per-VM BOs and non-per-VM BOs to
maintain eviction fairness, so we don't really gain anything from that.<br>
<br>
Regards,<br>
Christian.<br>
<br>
<blockquote type="cite"
cite="mid:14672700-82d7-c9d8-1086-84a4d8a711bc@amd.com"> <br>
Regards,<br>
David Zhou<br>
<blockquote type="cite"
cite="mid:192b6077-88e7-f0f0-6923-b91e3ad5e97b@amd.com"> <br>
Regards,<br>
David Zhou<br>
<blockquote type="cite"
cite="mid:a9d42a8a-b9b5-1d89-f95e-e678829a8260@amd.com">
<div class="moz-cite-prefix"> <br>
BTW: We can easily walk all BOs which belong to a VM;
skipping over the few which aren't per-VM BOs should be
trivial.<br>
<br>
Christian.<br>
<br>
On 27.03.2018 at 13:56, Zhou, David(ChunMing) wrote:<br>
</div>
<blockquote type="cite" cite="mid:smartisan1522151762126">
<div> <font color="#333333">Then how do we keep a unique LRU
order? Any ideas?<br>
</font><br>
To get stable performance, we have to keep a unique LRU order.
Otherwise we get the issue I am looking into: the F1 game sometimes
runs at 40fps and sometimes at 28fps, even when only re-validating
BOs within their allowed domains.<br>
<br>
The remaining root cause is that the moved BOs are not the same each time.<br>
<br>
<div class="x_quote">
<div style="margin:0 0px; font-size:105%"><font
style="line-height:1.4" color="#629140"><span>Christian
König <a class="moz-txt-link-rfc2396E"
href="mailto:ckoenig.leichtzumerken@gmail.com"
moz-do-not-send="true"><ckoenig.leichtzumerken@gmail.com></a>
wrote on 2018-03-27 at 6:50 PM:</span></font></div>
<br type="attribution">
</div>
</div>
<font size="2"><span style="font-size:11pt;">
<div class="PlainText">NAK, we already tried that and it
is really not a good idea because it <br>
massively increases the per submission overhead.<br>
<br>
Christian.<br>
<br>
On 27.03.2018 at 12:16, Chunming Zhou wrote:<br>
> Change-Id:
Ibad84ed585b0746867a5f4cd1eadc2273e7cf596<br>
> Signed-off-by: Chunming Zhou <a
class="moz-txt-link-rfc2396E"
href="mailto:david1.zhou@amd.com"
moz-do-not-send="true"><david1.zhou@amd.com></a><br>
> ---<br>
> drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 ++<br>
> drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 15
+++++++++++++++<br>
> drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h | 1 +<br>
> 3 files changed, 18 insertions(+)<br>
><br>
> diff --git
a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c<br>
> index 383bf2d31c92..414e61799236 100644<br>
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c<br>
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c<br>
> @@ -919,6 +919,8 @@ static int
amdgpu_bo_vm_update_pte(struct amdgpu_cs_parser *p)<br>
> }<br>
> }<br>
> <br>
> + amdgpu_vm_refresh_lru(adev, vm);<br>
> +<br>
> return r;<br>
> }<br>
> <br>
> diff --git
a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c<br>
> index 5e35e23511cf..8ad2bb705765 100644<br>
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c<br>
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c<br>
> @@ -1902,6 +1902,21 @@ struct amdgpu_bo_va
*amdgpu_vm_bo_add(struct amdgpu_device *adev,<br>
> return bo_va;<br>
> }<br>
> <br>
> +void amdgpu_vm_refresh_lru(struct amdgpu_device
*adev, struct amdgpu_vm *vm)<br>
> +{<br>
> + struct ttm_bo_global *glob =
adev->mman.bdev.glob;<br>
> + struct amdgpu_vm_bo_base *bo_base;<br>
> +<br>
> + spin_lock(&vm->status_lock);<br>
> + list_for_each_entry(bo_base,
&vm->vm_bo_list, vm_bo) {<br>
> + spin_lock(&glob->lru_lock);<br>
> +
ttm_bo_move_to_lru_tail(&bo_base->bo->tbo);<br>
> + if (bo_base->bo->shadow)<br>
> +
ttm_bo_move_to_lru_tail(&bo_base->bo->shadow->tbo);<br>
> +
spin_unlock(&glob->lru_lock);<br>
> + }<br>
> + spin_unlock(&vm->status_lock);<br>
> +}<br>
> <br>
> /**<br>
> * amdgpu_vm_bo_insert_mapping - insert a new
mapping<br>
> diff --git
a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h<br>
> index 1886a561c84e..e01895581489 100644<br>
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h<br>
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h<br>
> @@ -285,6 +285,7 @@ int
amdgpu_vm_clear_freed(struct amdgpu_device *adev,<br>
> struct dma_fence
**fence);<br>
> int amdgpu_vm_handle_moved(struct amdgpu_device
*adev,<br>
> struct amdgpu_vm *vm);<br>
> +void amdgpu_vm_refresh_lru(struct amdgpu_device
*adev, struct amdgpu_vm *vm);<br>
> int amdgpu_vm_bo_update(struct amdgpu_device
*adev,<br>
> struct amdgpu_bo_va
*bo_va,<br>
> bool clear);<br>
<br>
</div>
</span></font> </blockquote>
<br>
</blockquote>
<br>
</blockquote>
<br>
</blockquote>
<br>
</body>
</html>