<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
  </head>
  <body text="#000000" bgcolor="#FFFFFF">
    <div class="moz-cite-prefix">Yes, exactly, that's the problem.<br>
      <br>
      See, when you want to prevent a process B from allocating the
      memory that process A has evicted, you need to prevent all
      concurrent allocation.<br>
      <br>
      And we don't do that because it causes a major performance drop.<br>
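      <br>
      (Illustration: a minimal, self-contained C sketch of what blocking all
      concurrent allocation would look like; the hyp_* names are hypothetical
      stand-ins, not TTM API. The point is that the whole evict-then-allocate
      loop would have to run under one manager-wide lock, so every other
      allocator stalls on that lock.)<br>
      <pre>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

struct hyp_bo;                                     /* stand-in for a buffer object */
struct hyp_manager { pthread_mutex_t big_lock; };  /* one lock for the whole domain */

int hyp_get_node(struct hyp_manager *m, struct hyp_bo *bo, size_t size); /* try to place the BO */
int hyp_evict_first(struct hyp_manager *m);        /* blocks until one victim BO is idle and freed */

/* To guarantee that the space a process has just evicted cannot be grabbed
 * by a concurrent allocator, the whole evict-then-allocate loop must run
 * under one manager-wide lock -- which serializes every other allocation. */
int hyp_alloc_serialized(struct hyp_manager *m, struct hyp_bo *bo, size_t size)
{
	int ret;

	pthread_mutex_lock(&m->big_lock);
	do {
		ret = hyp_get_node(m, bo, size);
		if (ret != -ENOSPC)
			break;                  /* placed, or a real error */
		ret = hyp_evict_first(m);       /* wait for a victim to go idle */
	} while (ret == 0);
	pthread_mutex_unlock(&m->big_lock);     /* everyone else was stalled here */
	return ret;
}
      </pre>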
      <br>
      Regards,<br>
      Christian.<br>
      <br>
      On 26.01.2018 at 14:21, Zhou, David(ChunMing) wrote:<br>
    </div>
    <blockquote type="cite"
      cite="mid:5a6b2b4c.c441650a.c40de.5db1SMTPIN_ADDED_BROKEN@mx.google.com">
      <div name="smartisanmessageid" id="smartisan1516972869817"><font
          color="#333333">You patch will prevent concurrent allocation,
          and will result in allocation performance drop much.</font><br>
        <br>
        <span id="smartisan_signature" style="font-size:0.8em;
          display:inline; color:#888888">
          <p dir="ltr">发自坚果 Pro</p>
        </span>
        <div class="quote">
          <div style="margin:0 0px; font-size:105%"><font
              style="line-height:1.4" color="#629140"><span>Christian
                König <a class="moz-txt-link-rfc2396E" href="mailto:ckoenig.leichtzumerken@gmail.com"><ckoenig.leichtzumerken@gmail.com></a> wrote on
                2018-01-26 at 21:04:</span></font></div>
          <br type="attribution">
        </div>
      </div>
      <div>
        <div class="moz-cite-prefix">Attached is what you actually want
          to do cleanly implemented. But as I said this is a NO-GO.<br>
          <br>
          Regards,<br>
          Christian.<br>
          <br>
          On 26.01.2018 at 13:43, Christian König wrote:<br>
        </div>
        <blockquote type="cite">
          <div class="moz-cite-prefix">
            <blockquote type="cite">After my investigation, this issue
              should be detect of TTM design self, which breaks
              scheduling balance.<br>
            </blockquote>
            Yeah, but again. This is indented design we can't change
            easily.<br>
            <br>
            Regards,<br>
            Christian.<br>
            <br>
            On 26.01.2018 at 13:36, Zhou, David(ChunMing) wrote:<br>
          </div>
          <blockquote type="cite">
            <meta name="Generator" content="Microsoft Exchange Server">
            <div>
              <div name="x_smartisanmessageid"
                id="x_smartisan1516970199102"><font color="#333333">I am
                  off work, so reply mail by phone, the format could not
                  be text.<br>
                </font><br>
                back to topic itself:<br>
                the problem indeed happen on amdgpu driver, someone
                reports me that application runs with two instances, the
                performance are different.<br>
                I also reproduced the issue with unit
                test(bo_eviction_test). They always think our scheduler
                isn't working as expected.<br>
                <br>
                After my investigation, this issue should be detect of
                TTM design self, which breaks scheduling balance.<br>
                <br>
                Further, if we run containers for our gpu, container A
                could run high score, container B runs low score with
                same benchmark.<br>
                <br>
                So this is bug that we need fix.<br>
                <br>
                Regards,<br>
                David Zhou<br>
                <br>
                <span id="x_smartisan_signature" style="font-size:0.8em;
                  display:inline; color:#888888">
                  <p dir="ltr">发自坚果 Pro</p>
                </span>
                <div class="x_quote">
                  <div style="margin:0 0px; font-size:105%"><font
                      style="line-height:1.4" color="#629140"><span>Christian
                        K鰊ig
                        <a class="moz-txt-link-rfc2396E"
                          href="mailto:ckoenig.leichtzumerken@gmail.com"
                          moz-do-not-send="true"><ckoenig.leichtzumerken@gmail.com></a>
                        于 2018年1月26日 下午6:31写道:</span></font></div>
                  <br type="attribution">
                </div>
              </div>
            </div>
            <font size="2"><span style="font-size:11pt">
                <div class="PlainText">Am 26.01.2018 um 11:22 schrieb
                  Chunming Zhou:<br>
                  > There is a scheduling balance issue around get_node,
                  like this:<br>
                  > a. Process A allocates all of memory and uses it for
                  submission.<br>
                  > b. Process B tries to allocate memory and has to wait
                  in eviction for process A's BO to become idle.<br>
                  > c. Process A completes its job, and process B's
                  eviction puts process A's BO node,<br>
                  > but in the meantime process C comes in to allocate a
                  BO, gets a node directly and successfully, and does its
                  submission;<br>
                  > process B again has to wait for process C's BO to become idle.<br>
                  > d. Repeat the above steps and process B can be
                  delayed much more.<br>
                  ><br>
                  > A later allocation must not get ahead of an earlier
                  one for the same place.<br>
                  <br>
                  Again, NAK to the whole approach.<br>
                  <br>
                  At least with amdgpu the problem you described above
                  never occurs, because evictions are pipelined operations.
                  We could only ever block waiting for deleted regions to
                  become free.<br>
                  <br>
                  But independent of that, incoming memory requests that
                  arrive while we make room through eviction are intended
                  to be served first.<br>
                  <br>
                  Changing that is certainly a no-go, because it would
                  favor memory-hungry applications over small clients.<br>
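                  <br>
                  (Illustration: a minimal sketch of the pipelined eviction
                  described above, with hypothetical hyp_* names rather than
                  the real amdgpu/TTM API. The eviction copy is queued on the
                  GPU and only a fence dependency is attached to the next
                  user, so nobody blocks on the CPU for the victim BO to go
                  idle.)<br>
                  <pre>
#include <stddef.h>

struct hyp_fence;                                   /* completion of the eviction copy */
struct hyp_node { struct hyp_fence *move_fence; };  /* a range in the memory domain */

/* Queue the victim's move on the GPU and return its fence; no CPU wait. */
struct hyp_fence *hyp_pipeline_evict(struct hyp_node *victim);
/* Make the caller's next submission depend on the fence (GPU-side sync). */
void hyp_sync_to_fence(struct hyp_fence *f);

/* Pipelined eviction: the freed range is handed to the next allocator
 * immediately, and that allocator only attaches the move fence to its own
 * work instead of blocking for the victim BO to go idle. */
void hyp_alloc_over_evicted(struct hyp_node *victim, struct hyp_node *mine)
{
	struct hyp_fence *moved = hyp_pipeline_evict(victim);

	mine->move_fence = moved;   /* remember what must complete first */
	hyp_sync_to_fence(moved);   /* dependency resolved on the GPU, not here */
}
                  </pre>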
                  <br>
                  Regards,<br>
                  Christian.<br>
                  <br>
                  <pre>
>
> Change-Id: I3daa892e50f82226c552cc008a29e55894a98f18
> Signed-off-by: Chunming Zhou <david1.zhou@amd.com>
> ---
>   drivers/gpu/drm/ttm/ttm_bo.c    | 69 +++++++++++++++++++++++++++++++++++++++--
>   include/drm/ttm/ttm_bo_api.h    |  7 +++++
>   include/drm/ttm/ttm_bo_driver.h |  7 +++++
>   3 files changed, 80 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
> index d33a6bb742a1..558ec2cf465d 100644
> --- a/drivers/gpu/drm/ttm/ttm_bo.c
> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
> @@ -841,6 +841,58 @@ static int ttm_bo_add_move_fence(struct ttm_buffer_object *bo,
>        return 0;
>   }
>   
> +static void ttm_man_init_waiter(struct ttm_bo_waiter *waiter,
> +                             struct ttm_buffer_object *bo,
> +                             const struct ttm_place *place)
> +{
> +     waiter->tbo = bo;
> +     memcpy((void *)&waiter->place, (void *)place, sizeof(*place));
> +     INIT_LIST_HEAD(&waiter->list);
> +}
> +
> +static void ttm_man_add_waiter(struct ttm_mem_type_manager *man,
> +                            struct ttm_bo_waiter *waiter)
> +{
> +     if (!waiter)
> +             return;
> +     spin_lock(&man->wait_lock);
> +     list_add_tail(&waiter->list, &man->waiter_list);
> +     spin_unlock(&man->wait_lock);
> +}
> +
> +static void ttm_man_del_waiter(struct ttm_mem_type_manager *man,
> +                            struct ttm_bo_waiter *waiter)
> +{
> +     if (!waiter)
> +             return;
> +     spin_lock(&man->wait_lock);
> +     if (!list_empty(&waiter->list))
> +             list_del(&waiter->list);
> +     spin_unlock(&man->wait_lock);
> +     kfree(waiter);
> +}
> +
> +int ttm_man_check_bo(struct ttm_mem_type_manager *man,
> +                  struct ttm_buffer_object *bo,
> +                  const struct ttm_place *place)
> +{
> +     struct ttm_bo_waiter *waiter, *tmp;
> +
> +     spin_lock(&man->wait_lock);
> +     list_for_each_entry_safe(waiter, tmp, &man->waiter_list, list) {
> +             if ((bo != waiter->tbo) &&
> +                 ((place->fpfn >= waiter->place.fpfn &&
> +                   place->fpfn <= waiter->place.lpfn) ||
> +                  (place->lpfn <= waiter->place.lpfn && place->lpfn >=
> +                   waiter->place.fpfn)))
> +                 goto later_bo;
> +     }
> +     spin_unlock(&man->wait_lock);
> +     return true;
> +later_bo:
> +     spin_unlock(&man->wait_lock);
> +     return false;
> +}
>   /**
>    * Repeatedly evict memory from the LRU for @mem_type until we create enough
>    * space, or we've evicted everything and there isn't enough space.
> @@ -853,17 +905,26 @@ static int ttm_bo_mem_force_space(struct ttm_buffer_object *bo,
>   {
>        struct ttm_bo_device *bdev = bo->bdev;
>        struct ttm_mem_type_manager *man = &bdev->man[mem_type];
> +     struct ttm_bo_waiter waiter;
>        int ret;
>   
> +     ttm_man_init_waiter(&waiter, bo, place);
> +     ttm_man_add_waiter(man, &waiter);
>        do {
>                ret = (*man->func->get_node)(man, bo, place, mem);
> -             if (unlikely(ret != 0))
> +             if (unlikely(ret != 0)) {
> +                     ttm_man_del_waiter(man, &waiter);
>                        return ret;
> -             if (mem->mm_node)
> +             }
> +             if (mem->mm_node) {
> +                     ttm_man_del_waiter(man, &waiter);
>                        break;
> +             }
>                ret = ttm_mem_evict_first(bdev, mem_type, place, ctx);
> -             if (unlikely(ret != 0))
> +             if (unlikely(ret != 0)) {
> +                     ttm_man_del_waiter(man, &waiter);
>                        return ret;
> +             }
>        } while (1);
>        mem->mem_type = mem_type;
>        return ttm_bo_add_move_fence(bo, man, mem);
> @@ -1450,6 +1511,8 @@ int ttm_bo_init_mm(struct ttm_bo_device *bdev, unsigned type,
>        man->use_io_reserve_lru = false;
>        mutex_init(&man->io_reserve_mutex);
>        spin_lock_init(&man->move_lock);
> +     spin_lock_init(&man->wait_lock);
> +     INIT_LIST_HEAD(&man->waiter_list);
>        INIT_LIST_HEAD(&man->io_reserve_lru);
>   
>        ret = bdev->driver->init_mem_type(bdev, type, man);
> diff --git a/include/drm/ttm/ttm_bo_api.h b/include/drm/ttm/ttm_bo_api.h
> index 2cd025c2abe7..0fce4dbd02e7 100644
> --- a/include/drm/ttm/ttm_bo_api.h
> +++ b/include/drm/ttm/ttm_bo_api.h
> @@ -40,6 +40,7 @@
>   #include <linux/mm.h>
>   #include <linux/bitmap.h>
>   #include <linux/reservation.h>
> +#include <drm/ttm/ttm_placement.h>
>   
>   struct ttm_bo_device;
>   
> @@ -232,6 +233,12 @@ struct ttm_buffer_object {
>        struct mutex wu_mutex;
>   };
>   
> +struct ttm_bo_waiter {
> +     struct ttm_buffer_object *tbo;
> +     struct ttm_place place;
> +     struct list_head list;
> +};
> +
>   /**
>    * struct ttm_bo_kmap_obj
>    *
> diff --git a/include/drm/ttm/ttm_bo_driver.h b/include/drm/ttm/ttm_bo_driver.h
> index 9b417eb2df20..dc6b8b4c9e06 100644
> --- a/include/drm/ttm/ttm_bo_driver.h
> +++ b/include/drm/ttm/ttm_bo_driver.h
> @@ -293,6 +293,10 @@ struct ttm_mem_type_manager {
>        bool io_reserve_fastpath;
>        spinlock_t move_lock;
>   
> +     /* waiters in list */
> +     spinlock_t wait_lock;
> +     struct list_head waiter_list;
> +
>        /*
>         * Protected by @io_reserve_mutex:
>         */
> @@ -748,6 +752,9 @@ int ttm_bo_mem_space(struct ttm_buffer_object *bo,
>                     struct ttm_mem_reg *mem,
>                     struct ttm_operation_ctx *ctx);
>   
> +int ttm_man_check_bo(struct ttm_mem_type_manager *man,
> +                  struct ttm_buffer_object *bo,
> +                  const struct ttm_place *place);
>   void ttm_bo_mem_put(struct ttm_buffer_object *bo, struct ttm_mem_reg *mem);
>   void ttm_bo_mem_put_locked(struct ttm_buffer_object *bo,
>                           struct ttm_mem_reg *mem);
                  </pre>
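                  <br>
                  (For reference: the overlap test that ttm_man_check_bo()
                  above applies, restated as a standalone helper with
                  illustrative names, pfn_range and pfn_ranges_overlap are
                  not TTM types. A later request whose range overlaps an
                  earlier waiter's range is told to back off, so it cannot
                  jump ahead of the earlier one in the same place.)<br>
                  <pre>
#include <stdbool.h>
#include <stdint.h>

/* Standalone restatement of the back-off test the patch adds: a new request
 * for [fpfn, lpfn] counts as overlapping an earlier waiter when either of
 * its endpoints falls inside the waiter's range. */
struct pfn_range { uint32_t fpfn, lpfn; };

bool pfn_ranges_overlap(struct pfn_range place, struct pfn_range waiter)
{
	return (place.fpfn >= waiter.fpfn && place.fpfn <= waiter.lpfn) ||
	       (place.lpfn <= waiter.lpfn && place.lpfn >= waiter.fpfn);
}
                  </pre>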
                  <br>
                </div>
              </span></font><br>
            <fieldset class="mimeAttachmentHeader"></fieldset>
            <br>
            <pre>_______________________________________________
amd-gfx mailing list
<a class="moz-txt-link-abbreviated" href="mailto:amd-gfx@lists.freedesktop.org" moz-do-not-send="true">amd-gfx@lists.freedesktop.org</a>
<a class="moz-txt-link-freetext" href="https://lists.freedesktop.org/mailman/listinfo/amd-gfx" moz-do-not-send="true">https://lists.freedesktop.org/mailman/listinfo/amd-gfx</a>
</pre>
          </blockquote>
          <br>
        </blockquote>
        <br>
      </div>
      <br>
      <fieldset class="mimeAttachmentHeader"></fieldset>
      <br>
      <pre wrap="">_______________________________________________
dri-devel mailing list
<a class="moz-txt-link-abbreviated" href="mailto:dri-devel@lists.freedesktop.org">dri-devel@lists.freedesktop.org</a>
<a class="moz-txt-link-freetext" href="https://lists.freedesktop.org/mailman/listinfo/dri-devel">https://lists.freedesktop.org/mailman/listinfo/dri-devel</a>
</pre>
    </blockquote>
    <br>
  </body>
</html>