<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
  </head>
  <body text="#000000" bgcolor="#FFFFFF">
    <p><br>
    </p>
    <br>
    <div class="moz-cite-prefix">On 2018-03-28 16:13, zhoucm1 wrote:<br>
    </div>
    <blockquote type="cite"
      cite="mid:192b6077-88e7-f0f0-6923-b91e3ad5e97b@amd.com">
      <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
      <p><br>
      </p>
      <br>
      <div class="moz-cite-prefix">On 2018-03-27 21:44, Christian König
        wrote:<font size="2"><span style="font-size:11pt;"><br>
            <br>
          </span></font></div>
      <blockquote type="cite"
        cite="mid:a9d42a8a-b9b5-1d89-f95e-e678829a8260@amd.com">
        <meta http-equiv="Content-Type" content="text/html;
          charset=utf-8">
        <div class="moz-cite-prefix">How about we update the LRU only
          when we need to re-validate at least one BO?<br>
        </div>
      </blockquote>
      I tried this just now, but performance still isn't stable; it
      sometimes drops to 28fps at random.<br>
      <br>
      I also tried checking num_evictions and updating the LRU only when
      an eviction happened, but it also sometimes drops to 28fps at
      random.<br>
      <br>
      When BOs change, we need to keep not only the LRU order but also
      the validation order in the vm->evicted list. Any other ideas that
      can keep these orders without increasing submission overhead?<br>
    </blockquote>
    <br>
    After more thought, I think we need a new LRU design for per-vm BOs
    that guarantees their order when they are added to the LRU. How
    about the idea below:<br>
    0. Separate the traditional BO list LRU from the per-vm-bo LRU. The
    traditional LRU keeps working the old way; the per-vm LRU follows
    the design below.<br>
    1. The TTM bdev maintains a vm/process list.<br>
    2. Every vm_list node contains its own per-vm-bo LRU[priority].<br>
    3. To manage the vm_list LRU in a specific driver, we will need to
    add a callback for it.<br>
    4. We will assign an order to every per-vm BO in that vm/process.<br>
    5. To speed up sorting the per-vm LRU, we will introduce an RB tree
    for it in the callback. The RB tree key is the order.<br>
    <br>
    This way, we will be able to keep the per-vm-bo LRU order.<br>
    <br>
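    To make the steps above concrete, here is a rough userspace sketch in
    C, not actual TTM code. All names (vm_node, vm_bo, vm_lru_add,
    NUM_PRIO) are made up for illustration, and a linear sorted insert
    stands in for the RB tree from step 5 just to keep it short:<br>
    <pre>
```c
#include <assert.h>
#include <stddef.h>

#define NUM_PRIO 2 /* hypothetical number of priority levels */

/* Hypothetical per-vm BO; 'order' is fixed when the BO joins the VM (step 4). */
struct vm_bo {
	unsigned long order;
	struct vm_bo *prev, *next;
};

/* Hypothetical node of the bdev-wide vm list (steps 1 and 2). */
struct vm_node {
	unsigned long next_order;   /* next order value to hand out */
	struct vm_bo lru[NUM_PRIO]; /* sentinel heads of circular lists */
	struct vm_node *next;       /* link in the bdev's vm list */
};

static void vm_node_init(struct vm_node *vm)
{
	vm->next_order = 0;
	vm->next = NULL;
	for (int i = 0; i < NUM_PRIO; i++)
		vm->lru[i].prev = vm->lru[i].next = &vm->lru[i];
}

/* Assign the BO its order within the VM; it starts on no LRU. */
static void vm_bo_init(struct vm_node *vm, struct vm_bo *bo)
{
	bo->order = vm->next_order++;
	bo->prev = bo->next = bo;
}

/* Unlink a BO from its LRU; harmless if it is on none. */
static void vm_bo_del(struct vm_bo *bo)
{
	bo->prev->next = bo->next;
	bo->next->prev = bo->prev;
	bo->prev = bo->next = bo;
}

/*
 * (Re)insert a BO so that the per-vm LRU stays sorted by 'order'.
 * Assumption: a linear scan from the tail replaces the RB tree lookup.
 */
static void vm_lru_add(struct vm_node *vm, int prio, struct vm_bo *bo)
{
	struct vm_bo *head, *pos;

	assert(prio >= 0 && prio < NUM_PRIO);
	head = &vm->lru[prio];

	vm_bo_del(bo);
	pos = head->prev;
	while (pos != head && pos->order > bo->order)
		pos = pos->prev;

	/* Link bo right after pos. */
	bo->prev = pos;
	bo->next = pos->next;
	pos->next->prev = bo;
	pos->next = bo;
}
```
    </pre>
    Because a BO's order never changes, re-adding it always puts it back
    into the same slot, regardless of the order in which submissions
    touch the BOs.<br>
    <br>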
    What do you think of it?<br>
    <br>
    Regards,<br>
    David Zhou<br>
    <blockquote type="cite"
      cite="mid:192b6077-88e7-f0f0-6923-b91e3ad5e97b@amd.com"> <br>
      Regards,<br>
      David Zhou<br>
      <blockquote type="cite"
        cite="mid:a9d42a8a-b9b5-1d89-f95e-e678829a8260@amd.com">
        <div class="moz-cite-prefix"> <br>
          BTW: We can easily walk all BOs which belong to a VM; skipping
          over the few which aren't per-VM BOs should be trivial.<br>
          <br>
          Christian.<br>
          <br>
          On 27.03.2018 at 13:56, Zhou, David(ChunMing) wrote:<br>
        </div>
        <blockquote type="cite" cite="mid:smartisan1522151762126">
          <meta name="Generator" content="Microsoft Exchange Server">
          <!-- converted from text -->
          <style><!-- .EmailQuote { margin-left: 1pt; padding-left: 4pt; border-left: #800000 2px solid; } --></style>
          <div> <font color="#333333">Then how do we keep a unique LRU
              order? Any ideas?<br>
            </font><br>
            To get stable performance, we have to keep a unique LRU
            order. Otherwise we get the issue I am looking into: the F1
            game sometimes runs at 40fps and sometimes at 28fps, even
            when re-validating allowed-domains BOs.<br>
            <br>
            The remaining root cause is that the moved BOs are not the
            same.<br>
            <br>
            <span id="x_smartisan_signature" style="font-size:0.8em;
              display:inline; color:#888888">
              <p dir="ltr">Sent from Smartisan Pro</p>
            </span>
            <style type="text/css">
<!--
* body
        {padding:0 16px 30px!important;
        margin:0!important;
        background-color:#ffffff;
        line-height:1.4;
        word-wrap:break-word;
        word-break:normal}
div
        {word-wrap:break-word;
        word-break:normal}
p
        {word-wrap:break-word;
        word-break:normal;
        text-indent:0pt!important}
span
        {word-wrap:break-word;
        word-break:normal}
a
        {word-wrap:break-word;
        word-break:normal}
td
        {word-wrap:break-word;
        word-break:break-all}
-->
</style>
            <div class="x_quote">
              <div style="margin:0 0px; font-size:105%"><font
                  style="line-height:1.4" color="#629140"><span>Christian
                    König <a class="moz-txt-link-rfc2396E"
                      href="mailto:ckoenig.leichtzumerken@gmail.com"
                      moz-do-not-send="true">&lt;ckoenig.leichtzumerken@gmail.com&gt;</a>
                    wrote on 2018-03-27 at 6:50 PM:</span></font></div>
              <br type="attribution">
            </div>
          </div>
          <font size="2"><span style="font-size:11pt;">
              <div class="PlainText">NAK, we already tried that and it
                is really not a good idea because it <br>
                massively increases the per-submission overhead.<br>
                <br>
                Christian.<br>
                <br>
                On 27.03.2018 at 12:16, Chunming Zhou wrote:<br>
                > Change-Id:
                Ibad84ed585b0746867a5f4cd1eadc2273e7cf596<br>
                > Signed-off-by: Chunming Zhou <a
                  class="moz-txt-link-rfc2396E"
                  href="mailto:david1.zhou@amd.com"
                  moz-do-not-send="true"><david1.zhou@amd.com></a><br>
                > ---<br>
                >   drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c |  2 ++<br>
                >   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 15
                +++++++++++++++<br>
                >   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h |  1 +<br>
                >   3 files changed, 18 insertions(+)<br>
                ><br>
                > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
                b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c<br>
                > index 383bf2d31c92..414e61799236 100644<br>
                > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c<br>
                > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c<br>
                > @@ -919,6 +919,8 @@ static int
                amdgpu_bo_vm_update_pte(struct amdgpu_cs_parser *p)<br>
                >                }<br>
                >        }<br>
                >   <br>
                > +     amdgpu_vm_refresh_lru(adev, vm);<br>
                > +<br>
                >        return r;<br>
                >   }<br>
                >   <br>
                > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
                b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c<br>
                > index 5e35e23511cf..8ad2bb705765 100644<br>
                > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c<br>
                > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c<br>
                > @@ -1902,6 +1902,21 @@ struct amdgpu_bo_va
                *amdgpu_vm_bo_add(struct amdgpu_device *adev,<br>
                >        return bo_va;<br>
                >   }<br>
                >   <br>
                > +void amdgpu_vm_refresh_lru(struct amdgpu_device
                *adev, struct amdgpu_vm *vm)<br>
                > +{<br>
                > +     struct ttm_bo_global *glob =
                adev->mman.bdev.glob;<br>
                > +     struct amdgpu_vm_bo_base *bo_base;<br>
                > +<br>
                > +     spin_lock(&vm->status_lock);<br>
                > +     list_for_each_entry(bo_base,
                &vm->vm_bo_list, vm_bo) {<br>
                > +             spin_lock(&glob->lru_lock);<br>
                > +            
                ttm_bo_move_to_lru_tail(&bo_base->bo->tbo);<br>
                > +             if (bo_base->bo->shadow)<br>
                > +                    
                ttm_bo_move_to_lru_tail(&bo_base->bo->shadow->tbo);<br>
                > +             spin_unlock(&glob->lru_lock);<br>
                > +     }<br>
                > +     spin_unlock(&vm->status_lock);<br>
                > +}<br>
                >   <br>
                >   /**<br>
                >    * amdgpu_vm_bo_insert_mapping - insert a new
                mapping<br>
                > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
                b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h<br>
                > index 1886a561c84e..e01895581489 100644<br>
                > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h<br>
                > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h<br>
                > @@ -285,6 +285,7 @@ int
                amdgpu_vm_clear_freed(struct amdgpu_device *adev,<br>
                >                          struct dma_fence **fence);<br>
                >   int amdgpu_vm_handle_moved(struct amdgpu_device
                *adev,<br>
                >                           struct amdgpu_vm *vm);<br>
                > +void amdgpu_vm_refresh_lru(struct amdgpu_device
                *adev, struct amdgpu_vm *vm);<br>
                >   int amdgpu_vm_bo_update(struct amdgpu_device
                *adev,<br>
                >                        struct amdgpu_bo_va *bo_va,<br>
                >                        bool clear);<br>
                <br>
              </div>
            </span></font> </blockquote>
        <br>
      </blockquote>
      <br>
    </blockquote>
    <br>
  </body>
</html>