<html> <head> <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> </head> <body text="#000000" bgcolor="#FFFFFF"> <div class="moz-cite-prefix">On 2018-03-28 16:13, zhoucm1 wrote: </div> <blockquote type="cite" cite="mid:192b6077-88e7-f0f0-6923-b91e3ad5e97b@amd.com"> <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> <div class="moz-cite-prefix">On 2018-03-27 21:44, Christian König wrote: </div> <blockquote type="cite" cite="mid:a9d42a8a-b9b5-1d89-f95e-e678829a8260@amd.com"> <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> <div class="moz-cite-prefix">How about we update the LRU only when we need to re-validate at least one BO? </div> </blockquote> I tried this just now; performance still isn't stable and sometimes drops to 28fps unexpectedly. I also tried checking num_evictions and updating the LRU only when an eviction happens, but it still sometimes drops to 28fps. When the BOs change, we need to keep not only the LRU order but also the validation order in the vm->evicted list. Any other ideas that can keep these orders without increasing the submission overhead? </blockquote> After more thought, I think we need a new LRU design for per-VM BOs that guarantees the order in which they are added to the LRU. How about the idea below:
0. Separate the traditional BO list LRU from the per-VM-BO LRU. The traditional LRU keeps the old behavior; the per-VM LRU follows the design below.
1. The TTM bdev maintains a vm/process list.
2. Every vm_list node contains its own per-VM-BO LRU[priority].
3. To manage the vm_list LRU in a specific driver, we will need to add a callback for it.
4. We will assign an order to every per-VM BO in that vm/process.
5. To speed up sorting of the per-VM LRU, we will introduce an RB tree for it in the callback. The RB tree key is the order.
This way, we will be able to keep the per-VM-BO LRU order. What do you think of it?
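To make points 2, 4 and 5 a bit more concrete, here is a rough userspace sketch, not actual TTM or amdgpu code. All names in it (pvm_node, pvm_bo, pvm_lru_insert, PVM_MAX_PRIORITY) are made up for illustration, and it uses a sorted linked list in place of the RB tree just to keep it short; the RB tree keyed by order would replace the linear walk.

```c
#include <assert.h>
#include <stddef.h>

#define PVM_MAX_PRIORITY 2	/* hypothetical number of LRU priorities */

/* Hypothetical per-VM BO: 'order' is assigned once, when it joins the VM. */
struct pvm_bo {
	unsigned long order;
	int id;
	struct pvm_bo *next;
};

/* Hypothetical node of the vm/process list kept by the bdev (points 1, 2). */
struct pvm_node {
	unsigned long next_order;		/* next order value to hand out */
	struct pvm_bo *lru[PVM_MAX_PRIORITY];	/* per-VM-BO LRU[priority] */
	struct pvm_node *next_vm;		/* link in the bdev's vm list */
};

static void pvm_node_init(struct pvm_node *vm)
{
	size_t i;

	vm->next_order = 0;
	vm->next_vm = NULL;
	for (i = 0; i < PVM_MAX_PRIORITY; i++)
		vm->lru[i] = NULL;
}

/* Point 4: every per-VM BO gets a stable order within its VM. */
static void pvm_bo_add(struct pvm_node *vm, struct pvm_bo *bo, int id)
{
	bo->order = vm->next_order++;
	bo->id = id;
	bo->next = NULL;
}

/*
 * Point 5, simplified: insert so that the LRU stays sorted by 'order',
 * no matter in which sequence the BOs arrive. This walk is O(n); an
 * RB tree keyed by 'order' would make it O(log n).
 */
static void pvm_lru_insert(struct pvm_node *vm, struct pvm_bo *bo, size_t prio)
{
	struct pvm_bo **pos;

	assert(prio < PVM_MAX_PRIORITY);
	pos = &vm->lru[prio];
	while (*pos && (*pos)->order < bo->order)
		pos = &(*pos)->next;
	bo->next = *pos;
	*pos = bo;
}
```

With this, adding three BOs to a VM and then inserting them into LRU[0] in a different sequence still leaves the list in the order they were added, which is the property we need for stable eviction and validation order.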
Regards, David Zhou <blockquote type="cite" cite="mid:192b6077-88e7-f0f0-6923-b91e3ad5e97b@amd.com"> Regards, David Zhou <blockquote type="cite" cite="mid:a9d42a8a-b9b5-1d89-f95e-e678829a8260@amd.com"> <div class="moz-cite-prefix"> BTW: We can easily walk all BOs which belong to a VM; skipping over the few which aren't per-VM BOs should be trivial. Christian. On 27.03.2018 at 13:56, Zhou, David(ChunMing) wrote: </div> <blockquote type="cite" cite="mid:smartisan1522151762126"> <meta name="Generator" content="Microsoft Exchange Server">  <style></style> <div> Then how do we keep a unique LRU order? Any ideas? For stable performance we have to keep a unique LRU order; otherwise, as in the issue I am looking into, the F1 game sometimes runs at 40fps and sometimes at 28fps, even when re-validating BOs into their allowed domains. The remaining root cause is that the moved BOs are not the same each time. Sent from Smartisan Pro <style type="text/css">  </style> <div class="x_quote"> <div style="margin:0 0px; font-size:105%">Christian König <a class="moz-txt-link-rfc2396E" href="mailto:ckoenig.leichtzumerken@gmail.com" moz-do-not-send="true"><ckoenig.leichtzumerken@gmail.com></a> wrote on 2018-03-27 at 6:50 PM:</div> </div> </div> <div class="PlainText">NAK, we already tried that and it is really not a good idea because it massively increases the per-submission overhead. Christian.
On 27.03.2018 at 12:16, Chunming Zhou wrote:
> Change-Id: Ibad84ed585b0746867a5f4cd1eadc2273e7cf596
> Signed-off-by: Chunming Zhou <a class="moz-txt-link-rfc2396E" href="mailto:david1.zhou@amd.com" moz-do-not-send="true"><david1.zhou@amd.com></a>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c |  2 ++
>  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 15 +++++++++++++++
>  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h |  1 +
>  3 files changed, 18 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> index 383bf2d31c92..414e61799236 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> @@ -919,6 +919,8 @@ static int amdgpu_bo_vm_update_pte(struct amdgpu_cs_parser *p)
>  		}
>  	}
>
> +	amdgpu_vm_refresh_lru(adev, vm);
> +
>  	return r;
>  }
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> index 5e35e23511cf..8ad2bb705765 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> @@ -1902,6 +1902,21 @@ struct amdgpu_bo_va *amdgpu_vm_bo_add(struct amdgpu_device *adev,
>  	return bo_va;
>  }
>
> +void amdgpu_vm_refresh_lru(struct amdgpu_device *adev, struct amdgpu_vm *vm)
> +{
> +	struct ttm_bo_global *glob = adev->mman.bdev.glob;
> +	struct amdgpu_vm_bo_base *bo_base;
> +
> +	spin_lock(&vm->status_lock);
> +	list_for_each_entry(bo_base, &vm->vm_bo_list, vm_bo) {
> +		spin_lock(&glob->lru_lock);
> +		ttm_bo_move_to_lru_tail(&bo_base->bo->tbo);
> +		if (bo_base->bo->shadow)
> +			ttm_bo_move_to_lru_tail(&bo_base->bo->shadow->tbo);
> +		spin_unlock(&glob->lru_lock);
> +	}
> +	spin_unlock(&vm->status_lock);
> +}
>
>  /**
>   * amdgpu_vm_bo_insert_mapping - insert a new mapping
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> index 1886a561c84e..e01895581489 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> @@ -285,6 +285,7 @@ int amdgpu_vm_clear_freed(struct amdgpu_device *adev,
>  			  struct dma_fence **fence);
>  int amdgpu_vm_handle_moved(struct amdgpu_device *adev,
>  			   struct amdgpu_vm *vm);
> +void amdgpu_vm_refresh_lru(struct amdgpu_device *adev, struct amdgpu_vm *vm);
>  int amdgpu_vm_bo_update(struct amdgpu_device *adev,
>  			struct amdgpu_bo_va *bo_va,
>  			bool clear);
</div> </blockquote> </blockquote> </blockquote> </body> </html>