<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html;
      charset=windows-1252">
  </head>
  <body text="#000000" bgcolor="#FFFFFF">
    <div class="moz-cite-prefix">For B path, we take mm->mmap_sem,
      then call hmm_vma_get_pfns() or get_user_pages(). This is obvious.<br>
      <br>
      For A path, mmu notifier
mmu_notifier_invalidate_range_start()/mmu_notifier_invalidate_range_end()
      is called in many places, and the calling path is quit complicated
      inside mm, it's not obvious. I checked many of the them, for
      example:<br>
      <br>
      do_munmap()   <br>
        down_write(&mm->mmap_sem)<br>
        arch_unmap()<br>
          mpx_notify_unmap()...<br>
             zap_bt_entries_mapping()<br>
               zap_page_range()<br>
       up_write(&mm->mmap_sem)<br>
      <br>
      void zap_page_range(struct vm_area_struct *vma, unsigned long
      start,<br>
              unsigned long size)<br>
      {<br>
          struct mm_struct *mm = vma->vm_mm;<br>
          struct mmu_gather tlb;<br>
          unsigned long end = start + size;<br>
      <br>
          lru_add_drain();<br>
          tlb_gather_mmu(&tlb, mm, start, end);<br>
          update_hiwater_rss(mm);<br>
          mmu_notifier_invalidate_range_start(mm, start, end);<br>
          for ( ; vma && vma->vm_start < end; vma =
      vma->vm_next)<br>
              unmap_single_vma(&tlb, vma, start, end, NULL);<br>
          mmu_notifier_invalidate_range_end(mm, start, end);<br>
          tlb_finish_mmu(&tlb, start, end);<br>
      }<br>
       <br>
      So AFAIK it's okay without invalidate_range_end() callback. <br>
               <br>
      Regards,<br>
      Philip<br>
      <br>
      On 2018-09-28 01:25 AM, Koenig, Christian wrote:<br>
    </div>
    <blockquote type="cite"
      cite="mid:8f9f5703-214f-488d-9cfe-ccc64e8cd009@email.android.com">
      <meta http-equiv="Content-Type" content="text/html;
        charset=windows-1252">
      <meta content="text/html; charset=utf-8">
      <div dir="auto">No, that is incorrect as well :)
        <div dir="auto"><br>
        </div>
        <div dir="auto">The mmap_sem isn't necessary taken during page
          table updates.</div>
        <div dir="auto"><br>
        </div>
        <div dir="auto">What you could do is replace get_user_pages()
          directly with HMM. If I'm not completely mistaken that should
          work as expected.</div>
        <div dir="auto"><br>
        </div>
        <div dir="auto">Christian.</div>
      </div>
      <div class="gmail_extra"><br>
        <div class="gmail_quote">Am 27.09.2018 22:18 schrieb "Yang,
          Philip" <a class="moz-txt-link-rfc2396E" href="mailto:Philip.Yang@amd.com"><Philip.Yang@amd.com></a>:<br type="attribution">
        </div>
      </div>
      <div>
        <div class="moz-cite-prefix">I was trying to understand the way
          how HMM handle this concurrent issue and how we handle it in
          amdgpu_ttm_tt_userptr_needs_pages() and 
          amdgpu_ttm_tt_affect_userptr(). HMM uses range->valid flag,
          we use gtt->mmu_invalidations and gtt->last_set_pages.
          Both use the same lock plus flag idea actually.<br>
          <br>
          Thanks for the information, now I understand fence
          ttm_eu_fence_buffer_objects() put to BOs will block CPU page
          table update. This is another side of this concurrent issue I
          didn't know.<br>
          <br>
          I had same worry that it has issue without
          invalidate_range_end() callback as the calling sequence Felix
          lists. Now I think it's fine after taking a look again today
          because of mm->mmap_sem usage, this is my understanding:<br>
          <br>
          A path:<br>
          <br>
          down_write(&mm->mmap_sem);<br>
          mmu_notifier_invalidate_range_start()<br>
              take_lock()<br>
              release_lock()<br>
          CPU page table update<br>
          mmu_notifier_invalidate_range_end()<br>
          up_write(&mm->mmap_sem);<br>
          <br>
          B path:<br>
          <br>
          again: <br>
          down_read(&mm->mmap_sem);<br>
          hmm_vma_get_pfns()<br>
          up_read(&mm->mmap_sem);<br>
          ....<br>
          ....<br>
          take_lock()<br>
          if (!hmm_vma_range_done()) {<br>
             release_lock()<br>
             goto again<br>
          }<br>
          submit command job...<br>
          release_lock()<br>
          <br>
          If you agree, I will submit patch v5 with some minor changes,
          and submit another patch to replace get_user_page() with HMM.<br>
           <br>
          Regards,<br>
          Philip  <br>
          <br>
          On 2018-09-27 11:36 AM, Christian König wrote:<br>
        </div>
        <blockquote type="cite">
          <div class="moz-cite-prefix">Yeah, I've read that as well.<br>
            <br>
            My best guess is that we just need to add a call to
            hmm_vma_range_done() after taking the lock and also replace
            get_user_pages() with hmm_vma_get_pfns().<br>
            <br>
            But I'm still not 100% sure how all of that is supposed to
            work together.<br>
            <br>
            Regards,<br>
            Christian.<br>
            <br>
            Am 27.09.2018 um 16:50 schrieb Kuehling, Felix:<br>
          </div>
          <blockquote type="cite">
            <meta name="Generator" content="Microsoft Word 15 (filtered
              medium)">
            <style>
<!--
@font-face
        {font-family:"Cambria Math"}
@font-face
        {font-family:Calibri}
@font-face
        {font-family:Consolas}
p.MsoNormal, li.MsoNormal, div.MsoNormal
        {margin:0in;
        margin-bottom:.0001pt;
        font-size:11.0pt;
        font-family:"Calibri",sans-serif;
        color:black}
a:link, span.MsoHyperlink
        {color:blue;
        text-decoration:underline}
a:visited, span.MsoHyperlinkFollowed
        {color:purple;
        text-decoration:underline}
p.MsoPlainText, li.MsoPlainText, div.MsoPlainText
        {margin:0in;
        margin-bottom:.0001pt;
        font-size:11.0pt;
        font-family:"Calibri",sans-serif;
        color:black}
p.MsoListParagraph, li.MsoListParagraph, div.MsoListParagraph
        {margin-top:0in;
        margin-right:0in;
        margin-bottom:0in;
        margin-left:.5in;
        margin-bottom:.0001pt;
        font-size:11.0pt;
        font-family:"Calibri",sans-serif;
        color:black}
p.msonormal0, li.msonormal0, div.msonormal0
        {margin-right:0in;
        margin-left:0in;
        font-size:11.0pt;
        font-family:"Calibri",sans-serif}
span.PlainTextChar
        {font-family:Consolas;
        color:black}
p.msonormal00, li.msonormal00, div.msonormal00
        {margin-right:0in;
        margin-left:0in;
        font-size:11.0pt;
        font-family:"Calibri",sans-serif}
p.msonormal000, li.msonormal000, div.msonormal000
        {margin-right:0in;
        margin-left:0in;
        font-size:11.0pt;
        font-family:"Calibri",sans-serif;
        color:black}
p.emailquote, li.emailquote, div.emailquote
        {margin-right:0in;
        margin-left:1.0pt;
        font-size:11.0pt;
        font-family:"Calibri",sans-serif;
        color:black}
p.msochpdefault, li.msochpdefault, div.msochpdefault
        {margin-right:0in;
        margin-left:0in;
        font-size:10.0pt;
        font-family:"Calibri",sans-serif}
span.plaintextchar0
        {font-family:Consolas;
        color:black}
span.plaintextchar00
        {font-family:"Calibri",sans-serif}
span.emailstyle22
        {font-family:"Calibri",sans-serif;
        color:windowtext}
span.emailstyle23
        {font-family:"Calibri",sans-serif;
        color:windowtext}
span.emailstyle24
        {font-family:"Calibri",sans-serif;
        color:windowtext}
span.emailstyle25
        {font-family:"Calibri",sans-serif;
        color:windowtext}
span.emailstyle30
        {font-family:"Calibri",sans-serif;
        color:windowtext}
span.EmailStyle33
        {font-family:"Calibri",sans-serif;
        color:windowtext}
.MsoChpDefault
        {font-size:10.0pt}
@page WordSection1
        {margin:1.0in 1.0in 1.0in 1.0in}
div.WordSection1
        {}
ol
        {margin-bottom:0in}
ul
        {margin-bottom:0in}
-->
</style>
            <div class="WordSection1">
              <p class="MsoNormal"><span style="color:windowtext">I
                  think the answer is here: <a
href="https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/vm/hmm.rst#n216"
                    moz-do-not-send="true">
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/vm/hmm.rst#n216</a></span></p>
              <p class="MsoNormal"><span style="color:windowtext"> </span></p>
              <p class="MsoNormal"><span style="color:windowtext">Regards,</span></p>
              <p class="MsoNormal"><span style="color:windowtext"> 
                  Felix</span></p>
              <p class="MsoNormal"><span style="color:windowtext"> </span></p>
              <div>
                <div style="border:none; border-top:solid #E1E1E1 1.0pt;
                  padding:3.0pt 0in 0in 0in">
                  <p class="MsoNormal"><b><span style="color:windowtext">From:</span></b><span
                      style="color:windowtext"> Koenig, Christian
                      <br>
                      <b>Sent:</b> Thursday, September 27, 2018 10:30 AM<br>
                      <b>To:</b> Kuehling, Felix <a
                        class="moz-txt-link-rfc2396E"
                        href="mailto:Felix.Kuehling@amd.com"
                        moz-do-not-send="true">
                        <Felix.Kuehling@amd.com></a><br>
                      <b>Cc:</b> <a class="moz-txt-link-abbreviated"
                        href="mailto:j.glisse@gmail.com"
                        moz-do-not-send="true">j.glisse@gmail.com</a>;
                      Yang, Philip
                      <a class="moz-txt-link-rfc2396E"
                        href="mailto:Philip.Yang@amd.com"
                        moz-do-not-send="true"><Philip.Yang@amd.com></a>;
                      <a class="moz-txt-link-abbreviated"
                        href="mailto:amd-gfx@lists.freedesktop.org"
                        moz-do-not-send="true">amd-gfx@lists.freedesktop.org</a><br>
                      <b>Subject:</b> RE: [PATCH] drm/amdgpu: use HMM
                      mirror callback to replace mmu notifier v4</span></p>
                </div>
              </div>
              <p class="MsoNormal"> </p>
              <div>
                <p class="MsoNormal"><span style="color:windowtext">At
                    least with get_user_pages() that is perfectly
                    possible.
                  </span></p>
                <div>
                  <p class="MsoNormal"><span style="color:windowtext"> </span></p>
                </div>
                <div>
                  <p class="MsoNormal"><span style="color:windowtext">For
                      HMM it could be that this is prevented somehow.</span></p>
                </div>
                <div>
                  <p class="MsoNormal"><span style="color:windowtext"> </span></p>
                </div>
                <div>
                  <p class="MsoNormal"><span style="color:windowtext">Christian.</span></p>
                </div>
              </div>
              <div>
                <p class="MsoNormal"><span style="color:windowtext"> </span></p>
                <div>
                  <p class="MsoNormal"><span style="color:windowtext">Am
                      27.09.2018 16:27 schrieb "Kuehling, Felix" <<a
                        href="mailto:Felix.Kuehling@amd.com"
                        moz-do-not-send="true">Felix.Kuehling@amd.com</a>>:</span></p>
                </div>
              </div>
              <div>
                <div>
                  <p class="MsoNormal"><span style="color:windowtext">>
                      In this case you can end up accessing pages which
                      are invalidated while get_user_pages is in
                      process.</span></p>
                  <p class="MsoNormal"><span style="color:windowtext"> </span></p>
                  <p class="MsoNormal"><span style="color:windowtext">What’s
                      the sequence of events you have in mind? Something
                      like this?</span></p>
                  <p class="MsoNormal"><span style="color:windowtext"> </span></p>
                  <ul style="margin-top:0in" type="disc">
                    <li class="MsoListParagraph"
                      style="color:windowtext; margin-left:0in">Page
                      table is updated and triggers MMU notifier
                    </li>
                    <li class="MsoListParagraph"
                      style="color:windowtext; margin-left:0in">amdgpu
                      MMU notifier runs and waits for pending CS to
                      finish while holding the read lock
                    </li>
                    <li class="MsoListParagraph"
                      style="color:windowtext; margin-left:0in">New CS
                      starts just after invalidate_range_start MMU
                      notifier finishes but before the page table update
                      is done
                    </li>
                    <li class="MsoListParagraph"
                      style="color:windowtext; margin-left:0in">get_user_pages
                      returns outdated physical addresses
                    </li>
                  </ul>
                  <p class="MsoNormal"><span style="color:windowtext"> </span></p>
                  <p class="MsoNormal"><span style="color:windowtext">I
                      hope that’s not actually possible and that
                      get_user_pages or hmm_vma_fault would block until
                      the page table update is done. That is,
                      invalidate_range_start marks the start of a page
                      table update, and while that update is in
                      progress, get_user_pages or hmm_vma_fault block.
                      Jerome, can you comment on that?</span></p>
                  <p class="MsoNormal"><span style="color:windowtext"> </span></p>
                  <p class="MsoNormal"><span style="color:windowtext">Thanks,<br>
                        Felix</span></p>
                  <p class="MsoNormal"><span style="color:windowtext"> </span></p>
                  <div>
                    <div style="border:none; border-top:solid #E1E1E1
                      1.0pt; padding:3.0pt 0in 0in 0in">
                      <p class="MsoNormal"><b><span
                            style="color:windowtext">From:</span></b><span
                          style="color:windowtext"> Koenig, Christian
                          <br>
                          <b>Sent:</b> Thursday, September 27, 2018 9:59
                          AM<br>
                          <b>To:</b> Kuehling, Felix <<a
                            href="mailto:Felix.Kuehling@amd.com"
                            moz-do-not-send="true">Felix.Kuehling@amd.com</a>><br>
                          <b>Cc:</b> Yang, Philip <<a
                            href="mailto:Philip.Yang@amd.com"
                            moz-do-not-send="true">Philip.Yang@amd.com</a>>;
                          <a href="mailto:amd-gfx@lists.freedesktop.org"
                            moz-do-not-send="true">amd-gfx@lists.freedesktop.org</a>;
                          Jerome Glisse <<a
                            href="mailto:j.glisse@gmail.com"
                            moz-do-not-send="true">j.glisse@gmail.com</a>><br>
                          <b>Subject:</b> RE: [PATCH] drm/amdgpu: use
                          HMM mirror callback to replace mmu notifier v4</span></p>
                    </div>
                  </div>
                  <p class="MsoNormal"> </p>
                  <div>
                    <p class="MsoNormal"><span style="color:windowtext">Yeah
                        I understand that, but again that won't work.
                      </span></p>
                    <div>
                      <p class="MsoNormal"><span
                          style="color:windowtext"> </span></p>
                    </div>
                    <div>
                      <p class="MsoNormal"><span
                          style="color:windowtext">In this case you can
                          end up accessing pages which are invalidated
                          while get_user_pages is in process.</span></p>
                    </div>
                    <div>
                      <p class="MsoNormal"><span
                          style="color:windowtext"> </span></p>
                    </div>
                    <div>
                      <p class="MsoNormal"><span
                          style="color:windowtext">Christian.</span></p>
                    </div>
                  </div>
                  <div>
                    <p class="MsoNormal"><span style="color:windowtext"> </span></p>
                    <div>
                      <p class="MsoNormal"><span
                          style="color:windowtext">Am 27.09.2018 15:41
                          schrieb "Kuehling, Felix" <<a
                            href="mailto:Felix.Kuehling@amd.com"
                            moz-do-not-send="true">Felix.Kuehling@amd.com</a>>:</span></p>
                    </div>
                  </div>
                  <div>
                    <div>
                      <p class="MsoNormal"><span
                          style="color:windowtext">> I’m not planning
                          to change that. I don’t think there is any
                          need to change it.</span></p>
                      <p class="MsoNormal">><br>
                        > Yeah, but when HMM doesn't provide both the
                        start and the end hock of the invalidation this
                        way won't work any more.<br>
                        ><br>
                        > So we need to find a solution for this,<br>
                        > Christian.</p>
                      <p class="MsoNormal"> </p>
                      <p class="MsoNormal">My whole argument is that you
                        don’t need to hold the read lock until the
                        invalidate_range_end. Just read_lock and
                        read_unlock in the invalidate_range_start
                        function.</p>
                      <p class="MsoNormal"> </p>
                      <p class="MsoNormal">Regards,</p>
                      <p class="MsoNormal" style="margin-bottom:12.0pt"> 
                        Felix</p>
                      <p class="MsoNormal"><span
                          style="color:windowtext"> </span></p>
                      <div>
                        <div style="border:none; border-top:solid
                          #E1E1E1 1.0pt; padding:3.0pt 0in 0in 0in">
                          <p class="MsoNormal"><b><span
                                style="color:windowtext">From:</span></b><span
                              style="color:windowtext"> Koenig,
                              Christian
                              <br>
                              <b>Sent:</b> Thursday, September 27, 2018
                              9:22 AM<br>
                              <b>To:</b> Kuehling, Felix <<a
                                href="mailto:Felix.Kuehling@amd.com"
                                moz-do-not-send="true">Felix.Kuehling@amd.com</a>><br>
                              <b>Cc:</b> Yang, Philip <<a
                                href="mailto:Philip.Yang@amd.com"
                                moz-do-not-send="true">Philip.Yang@amd.com</a>>;
                              <a
                                href="mailto:amd-gfx@lists.freedesktop.org"
                                moz-do-not-send="true">amd-gfx@lists.freedesktop.org</a>;
                              Jerome Glisse <<a
                                href="mailto:j.glisse@gmail.com"
                                moz-do-not-send="true">j.glisse@gmail.com</a>><br>
                              <b>Subject:</b> Re: [PATCH] drm/amdgpu:
                              use HMM mirror callback to replace mmu
                              notifier v4</span></p>
                        </div>
                      </div>
                      <p class="MsoNormal"> </p>
                      <div>
                        <p class="MsoNormal">Am 27.09.2018 um 15:18
                          schrieb Kuehling, Felix:</p>
                      </div>
                      <blockquote style="margin-top:5.0pt;
                        margin-bottom:5.0pt">
                        <p class="MsoNormal">> The problem is here:<br>
                          ></p>
                        <p class="MsoNormal">>
                          ttm_eu_fence_buffer_objects(&p->ticket,
                          &p->validated, p->fence);</p>
                        <p class="MsoNormal">>
                          amdgpu_mn_unlock(p->mn); </p>
                        <p class="MsoNormal"
                          style="margin-bottom:12.0pt">><br>
                          > We need to hold the lock until the fence
                          is added to the reservation object.<br>
                          ><br>
                          > Otherwise somebody could have changed the
                          page tables just in the moment between the
                          check of amdgpu_ttm_tt_userptr_needs_pages()
                          and adding the fence to the reservation
                          object.</p>
                        <p class="MsoNormal"><span
                            style="color:windowtext">I’m not planning to
                            change that. I don’t think there is any need
                            to change it.</span></p>
                      </blockquote>
                      <p class="MsoNormal" style="margin-bottom:12.0pt"><br>
                        Yeah, but when HMM doesn't provide both the
                        start and the end hock of the invalidation this
                        way won't work any more.<br>
                        <br>
                        So we need to find a solution for this,<br>
                        Christian.</p>
                      <blockquote style="margin-top:5.0pt;
                        margin-bottom:5.0pt">
                        <p class="MsoNormal"><span
                            style="color:windowtext"> </span></p>
                        <p class="MsoNormal"><span
                            style="color:windowtext">Regards,</span></p>
                        <p class="MsoNormal"><span
                            style="color:windowtext">  Felix</span></p>
                        <p class="MsoNormal"><span
                            style="color:windowtext"> </span></p>
                        <div>
                          <div style="border:none; border-top:solid
                            #E1E1E1 1.0pt; padding:3.0pt 0in 0in 0in">
                            <p class="MsoNormal"><b><span
                                  style="color:windowtext">From:</span></b><span
                                style="color:windowtext"> Koenig,
                                Christian
                                <br>
                                <b>Sent:</b> Thursday, September 27,
                                2018 7:24 AM<br>
                                <b>To:</b> Kuehling, Felix <a
                                  href="mailto:Felix.Kuehling@amd.com"
                                  moz-do-not-send="true"><Felix.Kuehling@amd.com></a><br>
                                <b>Cc:</b> Yang, Philip <a
                                  href="mailto:Philip.Yang@amd.com"
                                  moz-do-not-send="true"><Philip.Yang@amd.com></a>;
                                <a
                                  href="mailto:amd-gfx@lists.freedesktop.org"
                                  moz-do-not-send="true">amd-gfx@lists.freedesktop.org</a>;
                                Jerome Glisse
                                <a href="mailto:j.glisse@gmail.com"
                                  moz-do-not-send="true"><j.glisse@gmail.com></a><br>
                                <b>Subject:</b> Re: [PATCH] drm/amdgpu:
                                use HMM mirror callback to replace mmu
                                notifier v4</span></p>
                          </div>
                        </div>
                        <p class="MsoNormal"> </p>
                        <div>
                          <p class="MsoNormal">Am 27.09.2018 um 13:08
                            schrieb Kuehling, Felix:</p>
                        </div>
                        <blockquote style="margin-top:5.0pt;
                          margin-bottom:5.0pt">
                          <p class="MsoNormal">> We double check that
                            there wasn't any page table modification
                            while we prepared the submission and restart
                            the whole process when there actually was
                            some update.<br>
                            ><br>
                            > The reason why we need to do this is
                            here:<br>
                            ></p>
                          <p class="MsoNormal">>       
                            ttm_eu_fence_buffer_objects(&p->ticket,
                            &p->validated, p->fence);<br>
                            >        amdgpu_mn_unlock(p->mn);</p>
                          <p class="MsoNormal"
                            style="margin-bottom:12.0pt">><br>
                            > Only after the new fence is added to
                            the buffer object we can release the lock so
                            that any invalidation will now block on our
                            command submission to finish before it
                            modifies the page table.<br>
                            <br>
                          </p>
                          <p class="MsoNormal"><span
                              style="color:windowtext">I don’t see why
                              this requires holding the read-lock until
                              invalidate_range_end.
                              amdgpu_ttm_tt_affect_userptr gets called
                              while the mn read-lock is held in
                              invalidate_range_start notifier.</span></p>
                        </blockquote>
                        <p class="MsoNormal"
                          style="margin-bottom:12.0pt"><br>
                          That's not related to
                          amdgpu_ttm_tt_affect_userptr(), this function
                          could actually be called outside the lock.<br>
                          <br>
                          The problem is here:</p>
                        <blockquote style="margin-top:5.0pt;
                          margin-bottom:5.0pt">
                          <p class="MsoNormal">ttm_eu_fence_buffer_objects(&p->ticket,
                            &p->validated, p->fence);</p>
                        </blockquote>
                        <blockquote style="margin-top:5.0pt;
                          margin-bottom:5.0pt">
                          <p class="MsoNormal">amdgpu_mn_unlock(p->mn);
                          </p>
                        </blockquote>
                        <p class="MsoNormal"
                          style="margin-bottom:12.0pt"><br>
                          We need to hold the lock until the fence is
                          added to the reservation object.<br>
                          <br>
                          Otherwise somebody could have changed the page
                          tables just in the moment between the check of
                          amdgpu_ttm_tt_userptr_needs_pages() and adding
                          the fence to the reservation object.<br>
                          <br>
                          Regards,<br>
                          Christian.<br>
                          <br>
                          <br>
                        </p>
                        <blockquote style="margin-top:5.0pt;
                          margin-bottom:5.0pt">
                          <p class="MsoNormal"><span
                              style="color:windowtext"> </span></p>
                          <p class="MsoNormal"><span
                              style="color:windowtext">Regards,</span></p>
                          <p class="MsoNormal"><span
                              style="color:windowtext">  Felix</span></p>
                          <p class="MsoNormal"><span
                              style="color:windowtext"> </span></p>
                          <div>
                            <div style="border:none; border-top:solid
                              #E1E1E1 1.0pt; padding:3.0pt 0in 0in 0in">
                              <p class="MsoNormal"><b><span
                                    style="color:windowtext">From:</span></b><span
                                  style="color:windowtext"> Koenig,
                                  Christian
                                  <br>
                                  <b>Sent:</b> Thursday, September 27,
                                  2018 5:27 AM<br>
                                  <b>To:</b> Kuehling, Felix <a
                                    href="mailto:Felix.Kuehling@amd.com"
                                    moz-do-not-send="true"><Felix.Kuehling@amd.com></a><br>
                                  <b>Cc:</b> Yang, Philip <a
                                    href="mailto:Philip.Yang@amd.com"
                                    moz-do-not-send="true"><Philip.Yang@amd.com></a>;
                                  <a
                                    href="mailto:amd-gfx@lists.freedesktop.org"
                                    moz-do-not-send="true">amd-gfx@lists.freedesktop.org</a>;
                                  Jerome Glisse
                                  <a href="mailto:j.glisse@gmail.com"
                                    moz-do-not-send="true"><j.glisse@gmail.com></a><br>
                                  <b>Subject:</b> Re: [PATCH]
                                  drm/amdgpu: use HMM mirror callback to
                                  replace mmu notifier v4</span></p>
                            </div>
                          </div>
                          <p class="MsoNormal"> </p>
                          <div>
                            <p class="MsoNormal"
                              style="margin-bottom:12.0pt">That is
                              correct, but take a look what we do when
                              after calling the amdgpu_mn_read_lock():<br>
                              <br>
                              <br>
                            </p>
                            <blockquote style="margin-top:5.0pt;
                              margin-bottom:5.0pt">
                              <p class="MsoNormal">        /* No memory
                                allocation is allowed while holding the
                                mn lock */<br>
                                        amdgpu_mn_lock(p->mn);<br>
                                       
                                amdgpu_bo_list_for_each_userptr_entry(e,
                                p->bo_list) {<br>
                                                struct amdgpu_bo *bo =
                                ttm_to_amdgpu_bo(e->tv.bo);<br>
                                <br>
                                                if
                                (amdgpu_ttm_tt_userptr_needs_pages(bo->tbo.ttm))
                                {<br>
                                                        r =
                                -ERESTARTSYS;<br>
                                                        goto
                                error_abort;<br>
                                                }<br>
                                        }</p>
                            </blockquote>
                            <p class="MsoNormal"
                              style="margin-bottom:12.0pt"><br>
                              We double check that there wasn't any page
                              table modification while we prepared the
                              submission and restart the whole process
                              when there actually was some update.<br>
                              <br>
                              The reason why we need to do this is here:<br>
                              <br>
                            </p>
                            <blockquote style="margin-top:5.0pt;
                              margin-bottom:5.0pt">
                              <p class="MsoNormal">       
                                ttm_eu_fence_buffer_objects(&p->ticket,
                                &p->validated, p->fence);<br>
                                        amdgpu_mn_unlock(p->mn);</p>
                            </blockquote>
                            <p class="MsoNormal"><br>
                              Only after the new fence is added to the
                              buffer object we can release the lock so
                              that any invalidation will now block on
                              our command submission to finish before it
                              modifies the page table.<br>
                              <br>
                              The only other option would be to add the
                              fence first and then check if there was
                              any update to the page tables.<br>
                              <br>
                              The issue with that approach is that
                              adding a fence can't be made undone, so if
                              we find that there actually was an update
                              to the page tables we would need to
                              somehow turn the CS into a dummy (e.g.
                              overwrite all IBs with NOPs or something
                              like that) and still submit it.<br>
                              <br>
                              Not sure if that is actually possible.<br>
                              <br>
                              Regards,<br>
                              Christian.<br>
                              <br>
                              Am 27.09.2018 um 10:47 schrieb Kuehling,
                              Felix:</p>
                          </div>
                          <blockquote style="margin-top:5.0pt;
                            margin-bottom:5.0pt">
                            <p class="MsoNormal">So back to my previous
                              question:</p>
                            <p class="MsoPlainText"> </p>
                            <p class="MsoPlainText">>> But do we
                              really need another lock for this?
                              Wouldn't the
                            </p>
                            <p class="MsoPlainText">>>
                              re-validation of userptr BOs (currently
                              calling get_user_pages) force
                            </p>
                            <p class="MsoPlainText">>>
                              synchronization with the ongoing page
                              table invalidation through the
                            </p>
                            <p class="MsoPlainText">>> mmap_sem or
                              other MM locks?</p>
                            <p class="MsoPlainText">> </p>
                            <p class="MsoPlainText">> No and yes. We
                              don't hold any other locks while doing
                              command submission, but I expect that HMM
                              has its own mechanism to prevent that.</p>
                            <p class="MsoPlainText">> </p>
                            <p class="MsoPlainText">> Since we don't
                              modify amdgpu_mn_lock()/amdgpu_mn_unlock()
                              we are certainly not using this mechanism
                              correctly.</p>
                            <p class="MsoNormal"> </p>
                            <p class="MsoNormal">The existing
                              amdgpu_mn_lock/unlock should block the MMU
                              notifier while a command submission is in
                              progress. It should also block command
                              submission while an MMU notifier is in
                              progress.</p>
                            <p class="MsoNormal"> </p>
                            <p class="MsoNormal">What we lose with HMM
                              is the ability to hold a read-lock for the
                              entire duration of the
                              invalidate_range_start until
                              invalidate_range_end. As I understand it,
                              that lock is meant to prevent new command
                              submissions while the page tables are
                              being updated by the kernel. But my point
                              is, that get_user_pages or hmm_vma_fault
                              should do the same kind of thing. Before
                              the command submission can go ahead, it
                              needs to update the userptr addresses. If
                              the page tables are still being updated,
                              it will block there even without holding
                              the amdgpu_mn_read_lock.</p>
                            <p class="MsoNormal"> </p>
                            <p class="MsoNormal">Regards,</p>
                            <p class="MsoNormal">  Felix</p>
                            <p class="MsoNormal"> </p>
                            <div>
                              <div style="border:none; border-top:solid
                                #E1E1E1 1.0pt; padding:3.0pt 0in 0in
                                0in">
                                <p class="MsoNormal"><b>From:</b>
                                  Koenig, Christian <br>
                                  <b>Sent:</b> Thursday, September 27,
                                  2018 3:00 AM<br>
                                  <b>To:</b> Kuehling, Felix <a
                                    href="mailto:Felix.Kuehling@amd.com"
                                    moz-do-not-send="true"><Felix.Kuehling@amd.com></a><br>
                                  <b>Cc:</b> Yang, Philip <a
                                    href="mailto:Philip.Yang@amd.com"
                                    moz-do-not-send="true"><Philip.Yang@amd.com></a>;
                                  <a
                                    href="mailto:amd-gfx@lists.freedesktop.org"
                                    moz-do-not-send="true">amd-gfx@lists.freedesktop.org</a>;
                                  Jerome Glisse
                                  <a href="mailto:j.glisse@gmail.com"
                                    moz-do-not-send="true"><j.glisse@gmail.com></a><br>
                                  <b>Subject:</b> RE: [PATCH]
                                  drm/amdgpu: use HMM mirror callback to
                                  replace mmu notifier v4</p>
                              </div>
                            </div>
                            <p class="MsoNormal"> </p>
                            <div>
                              <div>
                                <p class="MsoNormal">No, that won't
                                  work. We would still run into lock
                                  inversion problems.</p>
                                <div>
                                  <p class="MsoNormal"> </p>
                                </div>
                                <div>
                                  <p class="MsoNormal">What we could do
                                    with the scheduler is to turn
                                    submissions into dummies if we find
                                    that the page tables are now
                                    outdated.</p>
                                </div>
                                <div>
                                  <p class="MsoNormal"> </p>
                                </div>
                                <div>
                                  <p class="MsoNormal">But that would be
                                    really hacky and I'm not sure if
                                    that would really work in all cases.</p>
                                </div>
                                <div>
                                  <p class="MsoNormal"> </p>
                                </div>
                                <div>
                                  <p class="MsoNormal">Christian.</p>
                                </div>
                              </div>
                              <div>
                                <p class="MsoNormal"> </p>
                                <div>
                                  <p class="MsoNormal">Am 27.09.2018
                                    08:53 schrieb "Kuehling, Felix" <<a
href="mailto:Felix.Kuehling@amd.com" moz-do-not-send="true">Felix.Kuehling@amd.com</a>>:</p>
                                </div>
                              </div>
                            </div>
                            <div>
                              <p class="MsoNormal">I had a chat with
                                Jerome yesterday. He pointed out that
                                the new blockable parameter can be used
                                to infer whether the MMU notifier is
                                being called  in a reclaim operation. So
                                if blockable==true, it should even be
                                safe to take the BO reservation lock
                                without problems. I think with that we
                                should be able to remove the read-write
                                locking completely and go back to
                                locking (or try-locking for
                                blockable==false) the reservation locks
                                in the MMU notifier?<br>
                                <br>
                                Regards,<br>
                                  Felix<br>
                                <br>
                                -----Original Message-----<br>
                                From: amd-gfx <<a
                                  href="mailto:amd-gfx-bounces@lists.freedesktop.org"
                                  moz-do-not-send="true">amd-gfx-bounces@lists.freedesktop.org</a>>
                                On Behalf Of Christian König<br>
                                Sent: Saturday, September 15, 2018 3:47
                                AM<br>
                                To: Kuehling, Felix <<a
                                  href="mailto:Felix.Kuehling@amd.com"
                                  moz-do-not-send="true">Felix.Kuehling@amd.com</a>>;
                                Yang, Philip <<a
                                  href="mailto:Philip.Yang@amd.com"
                                  moz-do-not-send="true">Philip.Yang@amd.com</a>>;
                                <a
                                  href="mailto:amd-gfx@lists.freedesktop.org"
                                  moz-do-not-send="true">amd-gfx@lists.freedesktop.org</a>;
                                Jerome Glisse <<a
                                  href="mailto:j.glisse@gmail.com"
                                  moz-do-not-send="true">j.glisse@gmail.com</a>><br>
                                Subject: Re: [PATCH] drm/amdgpu: use HMM
                                mirror callback to replace mmu notifier
                                v4<br>
                                <br>
                                Am 14.09.2018 um 22:21 schrieb Felix
                                Kuehling:<br>
                                > On 2018-09-14 01:52 PM, Christian
                                König wrote:<br>
                                >> Am 14.09.2018 um 19:47 schrieb
                                Philip Yang:<br>
                                >>> On 2018-09-14 03:51 AM,
                                Christian König wrote:<br>
                                >>>> Am 13.09.2018 um 23:51
                                schrieb Felix Kuehling:<br>
                                >>>>> On 2018-09-13 04:52
                                PM, Philip Yang wrote:<br>
                                >>>>> [SNIP]<br>
                                >>>>>>    +   
                                amdgpu_mn_read_unlock(amn);<br>
                                >>>>>> +<br>
                                >>>>>
                                amdgpu_mn_read_lock/unlock support
                                recursive locking for multiple <br>
                                >>>>> overlapping or
                                nested invalidation ranges. But if you'r
                                locking <br>
                                >>>>> and unlocking in
                                the same function. Is that still a
                                concern?<br>
                                >>> I don't understand the
                                possible recursive case, but<br>
                                >>> amdgpu_mn_read_lock() still
                                support recursive locking.<br>
                                >>>> Well the real problem
                                is that unlocking them here won't work.<br>
                                >>>><br>
                                >>>> We need to hold the
                                lock until we are sure that the
                                operation which <br>
                                >>>> updates the page tables
                                is completed.<br>
                                >>>><br>
                                >>> The reason for this change
                                is because hmm mirror has <br>
                                >>> invalidate_start callback,
                                no invalidate_end callback<br>
                                >>><br>
                                >>> Check mmu_notifier.c and
                                hmm.c again, below is entire logic to <br>
                                >>> update CPU page tables and
                                callback:<br>
                                >>><br>
                                >>> mn lock amn->lock is
                                used to protect interval tree access
                                because <br>
                                >>> user may submit/register
                                new userptr anytime.<br>
                                >>> This is same for old and
                                new way.<br>
                                >>><br>
                                >>> step 2 guarantee the GPU
                                operation is done before updating CPU
                                page <br>
                                >>> table.<br>
                                >>><br>
                                >>> So I think the change is
                                safe. We don't need hold mn lock until
                                the <br>
                                >>> CPU page tables update is
                                completed.<br>
                                >> No, that isn't even remotely
                                correct. The lock doesn't protects the <br>
                                >> interval tree.<br>
                                >><br>
                                >>> Old:<br>
                                >>>     1.
                                down_read_non_owner(&amn->lock)<br>
                                >>>     2. loop to handle BOs
                                from node->bos through interval tree<br>
                                >>> amn->object nodes<br>
                                >>>         gfx: wait for
                                pending BOs fence operation done, mark
                                user <br>
                                >>> pages dirty<br>
                                >>>         kfd: evict user
                                queues of the process, wait for queue <br>
                                >>> unmap/map operation done<br>
                                >>>     3. update CPU page
                                tables<br>
                                >>>     4.
                                up_read(&amn->lock)<br>
                                >>><br>
                                >>> New, switch step 3 and 4<br>
                                >>>     1.
                                down_read_non_owner(&amn->lock)<br>
                                >>>     2. loop to handle BOs
                                from node->bos through interval tree<br>
                                >>> amn->object nodes<br>
                                >>>         gfx: wait for
                                pending BOs fence operation done, mark
                                user <br>
                                >>> pages dirty<br>
                                >>>         kfd: evict user
                                queues of the process, wait for queue <br>
                                >>> unmap/map operation done<br>
                                >>>     3.
                                up_read(&amn->lock)<br>
                                >>>     4. update CPU page
                                tables<br>
                                >> The lock is there to make sure
                                that we serialize page table updates <br>
                                >> with command submission.<br>
                                > As I understand it, the idea is to
                                prevent command submission (adding <br>
                                > new fences to BOs) while a page
                                table invalidation is in progress.<br>
                                <br>
                                Yes, exactly.<br>
                                <br>
                                > But do we really need another lock
                                for this? Wouldn't the <br>
                                > re-validation of userptr BOs
                                (currently calling get_user_pages) force
                                <br>
                                > synchronization with the ongoing
                                page table invalidation through the <br>
                                > mmap_sem or other MM locks?<br>
                                <br>
                                No and yes. We don't hold any other
                                locks while doing command submission,
                                but I expect that HMM has its own
                                mechanism to prevent that.<br>
                                <br>
                                Since we don't modify
                                amdgpu_mn_lock()/amdgpu_mn_unlock() we
                                are certainly not using this mechanism
                                correctly.<br>
                                <br>
                                Regards,<br>
                                Christian.<br>
_______________________________________________<br>
                                amd-gfx mailing list<br>
                                <a
                                  href="mailto:amd-gfx@lists.freedesktop.org"
                                  moz-do-not-send="true">amd-gfx@lists.freedesktop.org</a><br>
                                <a
                                  href="https://lists.freedesktop.org/mailman/listinfo/amd-gfx"
                                  moz-do-not-send="true">https://lists.freedesktop.org/mailman/listinfo/amd-gfx</a></p>
                            </div>
                          </blockquote>
                          <p class="MsoNormal"> </p>
                        </blockquote>
                        <p class="MsoNormal"> </p>
                      </blockquote>
                      <p class="MsoNormal"> </p>
                    </div>
                  </div>
                </div>
              </div>
            </div>
          </blockquote>
          <br>
        </blockquote>
        <br>
      </div>
    </blockquote>
    <br>
  </body>
</html>