<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html;
      charset=windows-1252">
  </head>
  <body text="#000000" bgcolor="#FFFFFF">
    <div class="moz-cite-prefix">Hi Christian,<br>
      <br>
      Yes, I agree. I am working on patch 2 to replace get_user_page
      with HMM. One problem is in current gfx path, we check if
      mmu_invalidation multiple times in amdgpu_cs_ioctl() path after
      get_user_page(), amdgpu_cs_parser_bos(),
      amdgpu_cs_list_validate(), and amdgpu_cs_submit(). For HMM,
      hmm_vma_range_done() has to be called once and only once after
      hmm_vma_get_pfns()/hmm_vma_fault(), so I will call
      hmm_vma_range_done() inside amdgpu_cs_submit after holding the mn
      lock. Is my understanding correct?<br>
      <br>
      Philip<br>
      <br>
      On 2018-10-02 11:05 AM, Christian König wrote:<br>
    </div>
    <blockquote type="cite"
      cite="mid:09916f9a-3f5f-27ab-01e6-6d77303cf052@gmail.com">
      <meta http-equiv="Content-Type" content="text/html;
        charset=windows-1252">
      <div class="moz-cite-prefix">Checking more code and documentation
        and thinking about it over my vacation I think I have some new
        conclusions here.<br>
        <br>
        Currently we are using get_user_pages() together with an MMU
        notifier to guarantee coherent address space view, because
        get_user_pages() works by grabbing a reference to the pages and
        ignoring concurrent page table updates.<br>
        <br>
        But HMM uses a different approach by checking the address space
        for modifications using hmm_vma_range_done() and re-trying when
        the address space has changed.<br>
        <br>
        Now what you are trying to do is to change that into
        get_user_pages() and HMM callback and this is what won't work.
        We can either use get_user_pages() with MMU notifier or we can
        use HMM for the work, but we can't mix and match.<br>
        <br>
        So my initial guess was correct that we just need to change both
        sides of the implementation at the same time.<br>
        <br>
        Regards,<br>
        Christian.<br>
        <br>
        Am 28.09.2018 um 17:13 schrieb Koenig, Christian:<br>
      </div>
      <blockquote type="cite"
        cite="mid:b8686e6b-0c3e-4feb-afbd-80397aac31a0@email.android.com">
        <meta content="text/html; charset=Windows-1252">
        <div dir="auto">No it definitely isn't.
          <div dir="auto"><br>
          </div>
          <div dir="auto">We have literally worked month on this with
            the core MM developers.</div>
          <div dir="auto"><br>
          </div>
          <div dir="auto">Making sure that we have a consistent page
            array is absolutely vital for correct operation.</div>
          <div dir="auto"><br>
          </div>
          <div dir="auto">Please also check Jerome's presentation from
            XDC it also perfectly explains why this approach won't work
            correctly.</div>
          <div dir="auto"><br>
          </div>
          <div dir="auto">Christian.</div>
        </div>
        <div class="gmail_extra"><br>
          <div class="gmail_quote">Am 28.09.2018 17:07 schrieb "Yang,
            Philip" <a class="moz-txt-link-rfc2396E"
              href="mailto:Philip.Yang@amd.com" moz-do-not-send="true"><Philip.Yang@amd.com></a>:<br
              type="attribution">
          </div>
        </div>
        <div>
          <div class="moz-cite-prefix">For B path, we take
            mm->mmap_sem, then call hmm_vma_get_pfns() or
            get_user_pages(). This is obvious.<br>
            <br>
            For A path, mmu notifier
mmu_notifier_invalidate_range_start()/mmu_notifier_invalidate_range_end()
            is called in many places, and the calling path is quit
            complicated inside mm, it's not obvious. I checked many of
            the them, for example:<br>
            <br>
            do_munmap()   <br>
              down_write(&mm->mmap_sem)<br>
              arch_unmap()<br>
                mpx_notify_unmap()...<br>
                   zap_bt_entries_mapping()<br>
                     zap_page_range()<br>
             up_write(&mm->mmap_sem)<br>
            <br>
            void zap_page_range(struct vm_area_struct *vma, unsigned
            long start,<br>
                    unsigned long size)<br>
            {<br>
                struct mm_struct *mm = vma->vm_mm;<br>
                struct mmu_gather tlb;<br>
                unsigned long end = start + size;<br>
            <br>
                lru_add_drain();<br>
                tlb_gather_mmu(&tlb, mm, start, end);<br>
                update_hiwater_rss(mm);<br>
                mmu_notifier_invalidate_range_start(mm, start, end);<br>
                for ( ; vma && vma->vm_start < end; vma =
            vma->vm_next)<br>
                    unmap_single_vma(&tlb, vma, start, end, NULL);<br>
                mmu_notifier_invalidate_range_end(mm, start, end);<br>
                tlb_finish_mmu(&tlb, start, end);<br>
            }<br>
             <br>
            So AFAIK it's okay without invalidate_range_end() callback.
            <br>
                     <br>
            Regards,<br>
            Philip<br>
            <br>
            On 2018-09-28 01:25 AM, Koenig, Christian wrote:<br>
          </div>
          <blockquote type="cite">
            <meta content="text/html; charset=utf-8">
            <div dir="auto">No, that is incorrect as well :)
              <div dir="auto"><br>
              </div>
              <div dir="auto">The mmap_sem isn't necessary taken during
                page table updates.</div>
              <div dir="auto"><br>
              </div>
              <div dir="auto">What you could do is replace
                get_user_pages() directly with HMM. If I'm not
                completely mistaken that should work as expected.</div>
              <div dir="auto"><br>
              </div>
              <div dir="auto">Christian.</div>
            </div>
            <div class="gmail_extra"><br>
              <div class="gmail_quote">Am 27.09.2018 22:18 schrieb
                "Yang, Philip" <a class="moz-txt-link-rfc2396E"
                  href="mailto:Philip.Yang@amd.com"
                  moz-do-not-send="true"> <Philip.Yang@amd.com></a>:<br
                  type="attribution">
              </div>
            </div>
            <div>
              <div class="moz-cite-prefix">I was trying to understand
                the way how HMM handle this concurrent issue and how we
                handle it in amdgpu_ttm_tt_userptr_needs_pages() and 
                amdgpu_ttm_tt_affect_userptr(). HMM uses range->valid
                flag, we use gtt->mmu_invalidations and
                gtt->last_set_pages. Both use the same lock plus flag
                idea actually.<br>
                <br>
                Thanks for the information, now I understand fence
                ttm_eu_fence_buffer_objects() put to BOs will block CPU
                page table update. This is another side of this
                concurrent issue I didn't know.<br>
                <br>
                I had same worry that it has issue without
                invalidate_range_end() callback as the calling sequence
                Felix lists. Now I think it's fine after taking a look
                again today because of mm->mmap_sem usage, this is my
                understanding:<br>
                <br>
                A path:<br>
                <br>
                down_write(&mm->mmap_sem);<br>
                mmu_notifier_invalidate_range_start()<br>
                    take_lock()<br>
                    release_lock()<br>
                CPU page table update<br>
                mmu_notifier_invalidate_range_end()<br>
                up_write(&mm->mmap_sem);<br>
                <br>
                B path:<br>
                <br>
                again: <br>
                down_read(&mm->mmap_sem);<br>
                hmm_vma_get_pfns()<br>
                up_read(&mm->mmap_sem);<br>
                ....<br>
                ....<br>
                take_lock()<br>
                if (!hmm_vma_range_done()) {<br>
                   release_lock()<br>
                   goto again<br>
                }<br>
                submit command job...<br>
                release_lock()<br>
                <br>
                If you agree, I will submit patch v5 with some minor
                changes, and submit another patch to replace
                get_user_page() with HMM.<br>
                 <br>
                Regards,<br>
                Philip  <br>
                <br>
                On 2018-09-27 11:36 AM, Christian König wrote:<br>
              </div>
              <blockquote type="cite">
                <div class="moz-cite-prefix">Yeah, I've read that as
                  well.<br>
                  <br>
                  My best guess is that we just need to add a call to
                  hmm_vma_range_done() after taking the lock and also
                  replace get_user_pages() with hmm_vma_get_pfns().<br>
                  <br>
                  But I'm still not 100% sure how all of that is
                  supposed to work together.<br>
                  <br>
                  Regards,<br>
                  Christian.<br>
                  <br>
                  Am 27.09.2018 um 16:50 schrieb Kuehling, Felix:<br>
                </div>
                <blockquote type="cite">
                  <meta name="Generator" content="Microsoft Word 15
                    (filtered medium)">
                  <style>
<!--
@font-face
        {font-family:"Cambria Math"}
@font-face
        {font-family:Calibri}
@font-face
        {font-family:Consolas}
p.MsoNormal, li.MsoNormal, div.MsoNormal
        {margin:0in;
        margin-bottom:.0001pt;
        font-size:11.0pt;
        font-family:"Calibri",sans-serif;
        color:black}
a:link, span.MsoHyperlink
        {color:blue;
        text-decoration:underline}
a:visited, span.MsoHyperlinkFollowed
        {color:purple;
        text-decoration:underline}
p.MsoPlainText, li.MsoPlainText, div.MsoPlainText
        {margin:0in;
        margin-bottom:.0001pt;
        font-size:11.0pt;
        font-family:"Calibri",sans-serif;
        color:black}
p.MsoListParagraph, li.MsoListParagraph, div.MsoListParagraph
        {margin-top:0in;
        margin-right:0in;
        margin-bottom:0in;
        margin-left:.5in;
        margin-bottom:.0001pt;
        font-size:11.0pt;
        font-family:"Calibri",sans-serif;
        color:black}
p.msonormal0, li.msonormal0, div.msonormal0
        {margin-right:0in;
        margin-left:0in;
        font-size:11.0pt;
        font-family:"Calibri",sans-serif}
span.PlainTextChar
        {font-family:Consolas;
        color:black}
p.msonormal00, li.msonormal00, div.msonormal00
        {margin-right:0in;
        margin-left:0in;
        font-size:11.0pt;
        font-family:"Calibri",sans-serif}
p.msonormal000, li.msonormal000, div.msonormal000
        {margin-right:0in;
        margin-left:0in;
        font-size:11.0pt;
        font-family:"Calibri",sans-serif;
        color:black}
p.emailquote, li.emailquote, div.emailquote
        {margin-right:0in;
        margin-left:1.0pt;
        font-size:11.0pt;
        font-family:"Calibri",sans-serif;
        color:black}
p.msochpdefault, li.msochpdefault, div.msochpdefault
        {margin-right:0in;
        margin-left:0in;
        font-size:10.0pt;
        font-family:"Calibri",sans-serif}
span.plaintextchar0
        {font-family:Consolas;
        color:black}
span.plaintextchar00
        {font-family:"Calibri",sans-serif}
span.emailstyle22
        {font-family:"Calibri",sans-serif;
        color:windowtext}
span.emailstyle23
        {font-family:"Calibri",sans-serif;
        color:windowtext}
span.emailstyle24
        {font-family:"Calibri",sans-serif;
        color:windowtext}
span.emailstyle25
        {font-family:"Calibri",sans-serif;
        color:windowtext}
span.emailstyle30
        {font-family:"Calibri",sans-serif;
        color:windowtext}
span.EmailStyle33
        {font-family:"Calibri",sans-serif;
        color:windowtext}
.MsoChpDefault
        {font-size:10.0pt}
@page WordSection1
        {margin:1.0in 1.0in 1.0in 1.0in}
ol
        {margin-bottom:0in}
ul
        {margin-bottom:0in}
-->
</style>
                  <div class="WordSection1">
                    <p class="MsoNormal"><span style="color:windowtext">I
                        think the answer is here: <a
href="https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/vm/hmm.rst#n216"
                          moz-do-not-send="true">
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/vm/hmm.rst#n216</a></span></p>
                    <p class="MsoNormal"><span style="color:windowtext"> </span></p>
                    <p class="MsoNormal"><span style="color:windowtext">Regards,</span></p>
                    <p class="MsoNormal"><span style="color:windowtext"> 
                        Felix</span></p>
                    <p class="MsoNormal"><span style="color:windowtext"> </span></p>
                    <div>
                      <div style="border:none; border-top:solid #E1E1E1
                        1.0pt; padding:3.0pt 0in 0in 0in">
                        <p class="MsoNormal"><b><span
                              style="color:windowtext">From:</span></b><span
                            style="color:windowtext"> Koenig, Christian
                            <br>
                            <b>Sent:</b> Thursday, September 27, 2018
                            10:30 AM<br>
                            <b>To:</b> Kuehling, Felix <a
                              class="moz-txt-link-rfc2396E"
                              href="mailto:Felix.Kuehling@amd.com"
                              moz-do-not-send="true">
                              <Felix.Kuehling@amd.com></a><br>
                            <b>Cc:</b> <a
                              class="moz-txt-link-abbreviated"
                              href="mailto:j.glisse@gmail.com"
                              moz-do-not-send="true">j.glisse@gmail.com</a>;
                            Yang, Philip <a
                              class="moz-txt-link-rfc2396E"
                              href="mailto:Philip.Yang@amd.com"
                              moz-do-not-send="true"><Philip.Yang@amd.com></a>;
                            <a class="moz-txt-link-abbreviated"
                              href="mailto:amd-gfx@lists.freedesktop.org"
                              moz-do-not-send="true">amd-gfx@lists.freedesktop.org</a><br>
                            <b>Subject:</b> RE: [PATCH] drm/amdgpu: use
                            HMM mirror callback to replace mmu notifier
                            v4</span></p>
                      </div>
                    </div>
                    <p class="MsoNormal"> </p>
                    <div>
                      <p class="MsoNormal"><span
                          style="color:windowtext">At least with
                          get_user_pages() that is perfectly possible. </span></p>
                      <div>
                        <p class="MsoNormal"><span
                            style="color:windowtext"> </span></p>
                      </div>
                      <div>
                        <p class="MsoNormal"><span
                            style="color:windowtext">For HMM it could be
                            that this is prevented somehow.</span></p>
                      </div>
                      <div>
                        <p class="MsoNormal"><span
                            style="color:windowtext"> </span></p>
                      </div>
                      <div>
                        <p class="MsoNormal"><span
                            style="color:windowtext">Christian.</span></p>
                      </div>
                    </div>
                    <div>
                      <p class="MsoNormal"><span
                          style="color:windowtext"> </span></p>
                      <div>
                        <p class="MsoNormal"><span
                            style="color:windowtext">Am 27.09.2018 16:27
                            schrieb "Kuehling, Felix" <<a
                              href="mailto:Felix.Kuehling@amd.com"
                              moz-do-not-send="true">Felix.Kuehling@amd.com</a>>:</span></p>
                      </div>
                    </div>
                    <div>
                      <div>
                        <p class="MsoNormal"><span
                            style="color:windowtext">> In this case
                            you can end up accessing pages which are
                            invalidated while get_user_pages is in
                            process.</span></p>
                        <p class="MsoNormal"><span
                            style="color:windowtext"> </span></p>
                        <p class="MsoNormal"><span
                            style="color:windowtext">What’s the sequence
                            of events you have in mind? Something like
                            this?</span></p>
                        <p class="MsoNormal"><span
                            style="color:windowtext"> </span></p>
                        <ul style="margin-top:0in" type="disc">
                          <li class="MsoListParagraph"
                            style="color:windowtext; margin-left:0in">Page
                            table is updated and triggers MMU notifier </li>
                          <li class="MsoListParagraph"
                            style="color:windowtext; margin-left:0in">amdgpu
                            MMU notifier runs and waits for pending CS
                            to finish while holding the read lock </li>
                          <li class="MsoListParagraph"
                            style="color:windowtext; margin-left:0in">New
                            CS starts just after invalidate_range_start
                            MMU notifier finishes but before the page
                            table update is done </li>
                          <li class="MsoListParagraph"
                            style="color:windowtext; margin-left:0in">get_user_pages
                            returns outdated physical addresses </li>
                        </ul>
                        <p class="MsoNormal"><span
                            style="color:windowtext"> </span></p>
                        <p class="MsoNormal"><span
                            style="color:windowtext">I hope that’s not
                            actually possible and that get_user_pages or
                            hmm_vma_fault would block until the page
                            table update is done. That is,
                            invalidate_range_start marks the start of a
                            page table update, and while that update is
                            in progress, get_user_pages or hmm_vma_fault
                            block. Jerome, can you comment on that?</span></p>
                        <p class="MsoNormal"><span
                            style="color:windowtext"> </span></p>
                        <p class="MsoNormal"><span
                            style="color:windowtext">Thanks,<br>
                              Felix</span></p>
                        <p class="MsoNormal"><span
                            style="color:windowtext"> </span></p>
                        <div>
                          <div style="border:none; border-top:solid
                            #E1E1E1 1.0pt; padding:3.0pt 0in 0in 0in">
                            <p class="MsoNormal"><b><span
                                  style="color:windowtext">From:</span></b><span
                                style="color:windowtext"> Koenig,
                                Christian <br>
                                <b>Sent:</b> Thursday, September 27,
                                2018 9:59 AM<br>
                                <b>To:</b> Kuehling, Felix <<a
                                  href="mailto:Felix.Kuehling@amd.com"
                                  moz-do-not-send="true">Felix.Kuehling@amd.com</a>><br>
                                <b>Cc:</b> Yang, Philip <<a
                                  href="mailto:Philip.Yang@amd.com"
                                  moz-do-not-send="true">Philip.Yang@amd.com</a>>;
                                <a
                                  href="mailto:amd-gfx@lists.freedesktop.org"
                                  moz-do-not-send="true">amd-gfx@lists.freedesktop.org</a>;
                                Jerome Glisse <<a
                                  href="mailto:j.glisse@gmail.com"
                                  moz-do-not-send="true">j.glisse@gmail.com</a>><br>
                                <b>Subject:</b> RE: [PATCH] drm/amdgpu:
                                use HMM mirror callback to replace mmu
                                notifier v4</span></p>
                          </div>
                        </div>
                        <p class="MsoNormal"> </p>
                        <div>
                          <p class="MsoNormal"><span
                              style="color:windowtext">Yeah I understand
                              that, but again that won't work. </span></p>
                          <div>
                            <p class="MsoNormal"><span
                                style="color:windowtext"> </span></p>
                          </div>
                          <div>
                            <p class="MsoNormal"><span
                                style="color:windowtext">In this case
                                you can end up accessing pages which are
                                invalidated while get_user_pages is in
                                process.</span></p>
                          </div>
                          <div>
                            <p class="MsoNormal"><span
                                style="color:windowtext"> </span></p>
                          </div>
                          <div>
                            <p class="MsoNormal"><span
                                style="color:windowtext">Christian.</span></p>
                          </div>
                        </div>
                        <div>
                          <p class="MsoNormal"><span
                              style="color:windowtext"> </span></p>
                          <div>
                            <p class="MsoNormal"><span
                                style="color:windowtext">Am 27.09.2018
                                15:41 schrieb "Kuehling, Felix" <<a
                                  href="mailto:Felix.Kuehling@amd.com"
                                  moz-do-not-send="true">Felix.Kuehling@amd.com</a>>:</span></p>
                          </div>
                        </div>
                        <div>
                          <div>
                            <p class="MsoNormal"><span
                                style="color:windowtext">> I’m not
                                planning to change that. I don’t think
                                there is any need to change it.</span></p>
                            <p class="MsoNormal">><br>
                              > Yeah, but when HMM doesn't provide
                              both the start and the end hock of the
                              invalidation this way won't work any more.<br>
                              ><br>
                              > So we need to find a solution for
                              this,<br>
                              > Christian.</p>
                            <p class="MsoNormal"> </p>
                            <p class="MsoNormal">My whole argument is
                              that you don’t need to hold the read lock
                              until the invalidate_range_end. Just
                              read_lock and read_unlock in the
                              invalidate_range_start function.</p>
                            <p class="MsoNormal"> </p>
                            <p class="MsoNormal">Regards,</p>
                            <p class="MsoNormal"
                              style="margin-bottom:12.0pt">  Felix</p>
                            <p class="MsoNormal"><span
                                style="color:windowtext"> </span></p>
                            <div>
                              <div style="border:none; border-top:solid
                                #E1E1E1 1.0pt; padding:3.0pt 0in 0in
                                0in">
                                <p class="MsoNormal"><b><span
                                      style="color:windowtext">From:</span></b><span
                                    style="color:windowtext"> Koenig,
                                    Christian <br>
                                    <b>Sent:</b> Thursday, September 27,
                                    2018 9:22 AM<br>
                                    <b>To:</b> Kuehling, Felix <<a
                                      href="mailto:Felix.Kuehling@amd.com"
                                      moz-do-not-send="true">Felix.Kuehling@amd.com</a>><br>
                                    <b>Cc:</b> Yang, Philip <<a
                                      href="mailto:Philip.Yang@amd.com"
                                      moz-do-not-send="true">Philip.Yang@amd.com</a>>;
                                    <a
                                      href="mailto:amd-gfx@lists.freedesktop.org"
                                      moz-do-not-send="true">amd-gfx@lists.freedesktop.org</a>;
                                    Jerome Glisse <<a
                                      href="mailto:j.glisse@gmail.com"
                                      moz-do-not-send="true">j.glisse@gmail.com</a>><br>
                                    <b>Subject:</b> Re: [PATCH]
                                    drm/amdgpu: use HMM mirror callback
                                    to replace mmu notifier v4</span></p>
                              </div>
                            </div>
                            <p class="MsoNormal"> </p>
                            <div>
                              <p class="MsoNormal">Am 27.09.2018 um
                                15:18 schrieb Kuehling, Felix:</p>
                            </div>
                            <blockquote style="margin-top:5.0pt;
                              margin-bottom:5.0pt">
                              <p class="MsoNormal">> The problem is
                                here:<br>
                                ></p>
                              <p class="MsoNormal">>
                                ttm_eu_fence_buffer_objects(&p->ticket,
                                &p->validated, p->fence);</p>
                              <p class="MsoNormal">>
                                amdgpu_mn_unlock(p->mn); </p>
                              <p class="MsoNormal"
                                style="margin-bottom:12.0pt">><br>
                                > We need to hold the lock until the
                                fence is added to the reservation
                                object.<br>
                                ><br>
                                > Otherwise somebody could have
                                changed the page tables just in the
                                moment between the check of
                                amdgpu_ttm_tt_userptr_needs_pages() and
                                adding the fence to the reservation
                                object.</p>
                              <p class="MsoNormal"><span
                                  style="color:windowtext">I’m not
                                  planning to change that. I don’t think
                                  there is any need to change it.</span></p>
                            </blockquote>
                            <p class="MsoNormal"
                              style="margin-bottom:12.0pt"><br>
                              Yeah, but when HMM doesn't provide both
                              the start and the end hock of the
                              invalidation this way won't work any more.<br>
                              <br>
                              So we need to find a solution for this,<br>
                              Christian.</p>
                            <blockquote style="margin-top:5.0pt;
                              margin-bottom:5.0pt">
                              <p class="MsoNormal"><span
                                  style="color:windowtext"> </span></p>
                              <p class="MsoNormal"><span
                                  style="color:windowtext">Regards,</span></p>
                              <p class="MsoNormal"><span
                                  style="color:windowtext">  Felix</span></p>
                              <p class="MsoNormal"><span
                                  style="color:windowtext"> </span></p>
                              <div>
                                <div style="border:none;
                                  border-top:solid #E1E1E1 1.0pt;
                                  padding:3.0pt 0in 0in 0in">
                                  <p class="MsoNormal"><b><span
                                        style="color:windowtext">From:</span></b><span
                                      style="color:windowtext"> Koenig,
                                      Christian <br>
                                      <b>Sent:</b> Thursday, September
                                      27, 2018 7:24 AM<br>
                                      <b>To:</b> Kuehling, Felix <a
                                        href="mailto:Felix.Kuehling@amd.com"
                                        moz-do-not-send="true"><Felix.Kuehling@amd.com></a><br>
                                      <b>Cc:</b> Yang, Philip <a
                                        href="mailto:Philip.Yang@amd.com"
                                        moz-do-not-send="true"><Philip.Yang@amd.com></a>;
                                      <a
                                        href="mailto:amd-gfx@lists.freedesktop.org"
                                        moz-do-not-send="true">amd-gfx@lists.freedesktop.org</a>;
                                      Jerome Glisse <a
                                        href="mailto:j.glisse@gmail.com"
                                        moz-do-not-send="true"><j.glisse@gmail.com></a><br>
                                      <b>Subject:</b> Re: [PATCH]
                                      drm/amdgpu: use HMM mirror
                                      callback to replace mmu notifier
                                      v4</span></p>
                                </div>
                              </div>
                              <p class="MsoNormal"> </p>
                              <div>
                                <p class="MsoNormal">Am 27.09.2018 um
                                  13:08 schrieb Kuehling, Felix:</p>
                              </div>
                              <blockquote style="margin-top:5.0pt;
                                margin-bottom:5.0pt">
                                <p class="MsoNormal">> We double
                                  check that there wasn't any page table
                                  modification while we prepared the
                                  submission and restart the whole
                                  process when there actually was some
                                  update.<br>
                                  ><br>
                                  > The reason why we need to do this
                                  is here:<br>
                                  ></p>
                                <p class="MsoNormal">>       
                                  ttm_eu_fence_buffer_objects(&p->ticket,
                                  &p->validated, p->fence);<br>
                                  >       
                                  amdgpu_mn_unlock(p->mn);</p>
                                <p class="MsoNormal"
                                  style="margin-bottom:12.0pt">><br>
                                  > Only after the new fence is added
                                  to the buffer object we can release
                                  the lock so that any invalidation will
                                  now block on our command submission to
                                  finish before it modifies the page
                                  table.<br>
                                  <br>
                                </p>
                                <p class="MsoNormal"><span
                                    style="color:windowtext">I don’t see
                                    why this requires holding the
                                    read-lock until
                                    invalidate_range_end.
                                    amdgpu_ttm_tt_affect_userptr gets
                                    called while the mn read-lock is
                                    held in invalidate_range_start
                                    notifier.</span></p>
                              </blockquote>
                              <p class="MsoNormal"
                                style="margin-bottom:12.0pt"><br>
                                That's not related to
                                amdgpu_ttm_tt_affect_userptr(), this
                                function could actually be called
                                outside the lock.<br>
                                <br>
                                The problem is here:</p>
                              <blockquote style="margin-top:5.0pt;
                                margin-bottom:5.0pt">
                                <p class="MsoNormal">ttm_eu_fence_buffer_objects(&p->ticket,
                                  &p->validated, p->fence);</p>
                              </blockquote>
                              <blockquote style="margin-top:5.0pt;
                                margin-bottom:5.0pt">
                                <p class="MsoNormal">amdgpu_mn_unlock(p->mn);
                                </p>
                              </blockquote>
                              <p class="MsoNormal"
                                style="margin-bottom:12.0pt"><br>
                                We need to hold the lock until the fence
                                is added to the reservation object.<br>
                                <br>
                                Otherwise somebody could have changed
                                the page tables just in the moment
                                between the check of
                                amdgpu_ttm_tt_userptr_needs_pages() and
                                adding the fence to the reservation
                                object.<br>
                                <br>
                                Regards,<br>
                                Christian.<br>
                                <br>
                                <br>
                              </p>
                              <blockquote style="margin-top:5.0pt;
                                margin-bottom:5.0pt">
                                <p class="MsoNormal"><span
                                    style="color:windowtext"> </span></p>
                                <p class="MsoNormal"><span
                                    style="color:windowtext">Regards,</span></p>
                                <p class="MsoNormal"><span
                                    style="color:windowtext">  Felix</span></p>
                                <p class="MsoNormal"><span
                                    style="color:windowtext"> </span></p>
                                <div>
                                  <div style="border:none;
                                    border-top:solid #E1E1E1 1.0pt;
                                    padding:3.0pt 0in 0in 0in">
                                    <p class="MsoNormal"><b><span
                                          style="color:windowtext">From:</span></b><span
                                        style="color:windowtext">
                                        Koenig, Christian <br>
                                        <b>Sent:</b> Thursday, September
                                        27, 2018 5:27 AM<br>
                                        <b>To:</b> Kuehling, Felix <a
                                          href="mailto:Felix.Kuehling@amd.com"
                                          moz-do-not-send="true"><Felix.Kuehling@amd.com></a><br>
                                        <b>Cc:</b> Yang, Philip <a
                                          href="mailto:Philip.Yang@amd.com"
                                          moz-do-not-send="true"><Philip.Yang@amd.com></a>;
                                        <a
                                          href="mailto:amd-gfx@lists.freedesktop.org"
                                          moz-do-not-send="true">amd-gfx@lists.freedesktop.org</a>;
                                        Jerome Glisse <a
                                          href="mailto:j.glisse@gmail.com"
                                          moz-do-not-send="true"><j.glisse@gmail.com></a><br>
                                        <b>Subject:</b> Re: [PATCH]
                                        drm/amdgpu: use HMM mirror
                                        callback to replace mmu notifier
                                        v4</span></p>
                                  </div>
                                </div>
                                <p class="MsoNormal"> </p>
                                <div>
                                  <p class="MsoNormal"
                                    style="margin-bottom:12.0pt">That is
                                    correct, but take a look what we do
                                    when after calling the
                                    amdgpu_mn_read_lock():<br>
                                    <br>
                                    <br>
                                  </p>
                                  <blockquote style="margin-top:5.0pt;
                                    margin-bottom:5.0pt">
                                    <p class="MsoNormal">        /* No
                                      memory allocation is allowed while
                                      holding the mn lock */<br>
                                              amdgpu_mn_lock(p->mn);<br>
                                             
                                      amdgpu_bo_list_for_each_userptr_entry(e,
                                      p->bo_list) {<br>
                                                      struct amdgpu_bo
                                      *bo =
                                      ttm_to_amdgpu_bo(e->tv.bo);<br>
                                      <br>
                                                      if
                                      (amdgpu_ttm_tt_userptr_needs_pages(bo->tbo.ttm))
                                      {<br>
                                                              r =
                                      -ERESTARTSYS;<br>
                                                              goto
                                      error_abort;<br>
                                                      }<br>
                                              }</p>
                                  </blockquote>
                                  <p class="MsoNormal"
                                    style="margin-bottom:12.0pt"><br>
                                    We double check that there wasn't
                                    any page table modification while we
                                    prepared the submission and restart
                                    the whole process when there
                                    actually was some update.<br>
                                    <br>
                                    The reason why we need to do this is
                                    here:<br>
                                    <br>
                                  </p>
                                  <blockquote style="margin-top:5.0pt;
                                    margin-bottom:5.0pt">
                                    <p class="MsoNormal">       
                                      ttm_eu_fence_buffer_objects(&p->ticket,
                                      &p->validated,
                                      p->fence);<br>
                                             
                                      amdgpu_mn_unlock(p->mn);</p>
                                  </blockquote>
                                  <p class="MsoNormal"><br>
                                    Only after the new fence is added to
                                    the buffer object we can release the
                                    lock so that any invalidation will
                                    now block on our command submission
                                    to finish before it modifies the
                                    page table.<br>
                                    <br>
                                    The only other option would be to
                                    add the fence first and then check
                                    if there was any update to the page
                                    tables.<br>
                                    <br>
                                    The issue with that approach is that
                                    adding a fence can't be made undone,
                                    so if we find that there actually
                                    was an update to the page tables we
                                    would need to somehow turn the CS
                                    into a dummy (e.g. overwrite all IBs
                                    with NOPs or something like that)
                                    and still submit it.<br>
                                    <br>
                                    Not sure if that is actually
                                    possible.<br>
                                    <br>
                                    Regards,<br>
                                    Christian.<br>
                                    <br>
                                    Am 27.09.2018 um 10:47 schrieb
                                    Kuehling, Felix:</p>
                                </div>
                                <blockquote style="margin-top:5.0pt;
                                  margin-bottom:5.0pt">
                                  <p class="MsoNormal">So back to my
                                    previous question:</p>
                                  <p class="MsoPlainText"> </p>
                                  <p class="MsoPlainText">>> But
                                    do we really need another lock for
                                    this? Wouldn't the </p>
                                  <p class="MsoPlainText">>>
                                    re-validation of userptr BOs
                                    (currently calling get_user_pages)
                                    force </p>
                                  <p class="MsoPlainText">>>
                                    synchronization with the ongoing
                                    page table invalidation through the
                                  </p>
                                  <p class="MsoPlainText">>>
                                    mmap_sem or other MM locks?</p>
                                  <p class="MsoPlainText">> </p>
                                  <p class="MsoPlainText">> No and
                                    yes. We don't hold any other locks
                                    while doing command submission, but
                                    I expect that HMM has its own
                                    mechanism to prevent that.</p>
                                  <p class="MsoPlainText">> </p>
                                  <p class="MsoPlainText">> Since we
                                    don't modify
                                    amdgpu_mn_lock()/amdgpu_mn_unlock()
                                    we are certainly not using this
                                    mechanism correctly.</p>
                                  <p class="MsoNormal"> </p>
                                  <p class="MsoNormal">The existing
                                    amdgpu_mn_lock/unlock should block
                                    the MMU notifier while a command
                                    submission is in progress. It should
                                    also block command submission while
                                    an MMU notifier is in progress.</p>
                                  <p class="MsoNormal"> </p>
                                  <p class="MsoNormal">What we lose with
                                    HMM is the ability to hold a
                                    read-lock for the entire duration of
                                    the invalidate_range_start until
                                    invalidate_range_end. As I
                                    understand it, that lock is meant to
                                    prevent new command submissions
                                    while the page tables are being
                                    updated by the kernel. But my point
                                    is, that get_user_pages or
                                    hmm_vma_fault should do the same
                                    kind of thing. Before the command
                                    submission can go ahead, it needs to
                                    update the userptr addresses. If the
                                    page tables are still being updated,
                                    it will block there even without
                                    holding the amdgpu_mn_read_lock.</p>
                                  <p class="MsoNormal"> </p>
                                  <p class="MsoNormal">Regards,</p>
                                  <p class="MsoNormal">  Felix</p>
                                  <p class="MsoNormal"> </p>
                                  <div>
                                    <div style="border:none;
                                      border-top:solid #E1E1E1 1.0pt;
                                      padding:3.0pt 0in 0in 0in">
                                      <p class="MsoNormal"><b>From:</b>
                                        Koenig, Christian <br>
                                        <b>Sent:</b> Thursday, September
                                        27, 2018 3:00 AM<br>
                                        <b>To:</b> Kuehling, Felix <a
                                          href="mailto:Felix.Kuehling@amd.com"
                                          moz-do-not-send="true"><Felix.Kuehling@amd.com></a><br>
                                        <b>Cc:</b> Yang, Philip <a
                                          href="mailto:Philip.Yang@amd.com"
                                          moz-do-not-send="true"><Philip.Yang@amd.com></a>;
                                        <a
                                          href="mailto:amd-gfx@lists.freedesktop.org"
                                          moz-do-not-send="true">amd-gfx@lists.freedesktop.org</a>;
                                        Jerome Glisse <a
                                          href="mailto:j.glisse@gmail.com"
                                          moz-do-not-send="true"><j.glisse@gmail.com></a><br>
                                        <b>Subject:</b> RE: [PATCH]
                                        drm/amdgpu: use HMM mirror
                                        callback to replace mmu notifier
                                        v4</p>
                                    </div>
                                  </div>
                                  <p class="MsoNormal"> </p>
                                  <div>
                                    <div>
                                      <p class="MsoNormal">No, that
                                        won't work. We would still run
                                        into lock inversion problems.</p>
                                      <div>
                                        <p class="MsoNormal"> </p>
                                      </div>
                                      <div>
                                        <p class="MsoNormal">What we
                                          could do with the scheduler is
                                          to turn submissions into
                                          dummies if we find that the
                                          page tables are now outdated.</p>
                                      </div>
                                      <div>
                                        <p class="MsoNormal"> </p>
                                      </div>
                                      <div>
                                        <p class="MsoNormal">But that
                                          would be really hacky and I'm
                                          not sure if that would really
                                          work in all cases.</p>
                                      </div>
                                      <div>
                                        <p class="MsoNormal"> </p>
                                      </div>
                                      <div>
                                        <p class="MsoNormal">Christian.</p>
                                      </div>
                                    </div>
                                    <div>
                                      <p class="MsoNormal"> </p>
                                      <div>
                                        <p class="MsoNormal">Am
                                          27.09.2018 08:53 schrieb
                                          "Kuehling, Felix" <<a
                                            href="mailto:Felix.Kuehling@amd.com"
                                            moz-do-not-send="true">Felix.Kuehling@amd.com</a>>:</p>
                                      </div>
                                    </div>
                                  </div>
                                  <div>
                                    <p class="MsoNormal">I had a chat
                                      with Jerome yesterday. He pointed
                                      out that the new blockable
                                      parameter can be used to infer
                                      whether the MMU notifier is being
                                      called  in a reclaim operation. So
                                      if blockable==true, it should even
                                      be safe to take the BO reservation
                                      lock without problems. I think
                                      with that we should be able to
                                      remove the read-write locking
                                      completely and go back to locking
                                      (or try-locking for
                                      blockable==false) the reservation
                                      locks in the MMU notifier?<br>
                                      <br>
                                      Regards,<br>
                                        Felix<br>
                                      <br>
                                      -----Original Message-----<br>
                                      From: amd-gfx <<a
                                        href="mailto:amd-gfx-bounces@lists.freedesktop.org"
                                        moz-do-not-send="true">amd-gfx-bounces@lists.freedesktop.org</a>>
                                      On Behalf Of Christian König<br>
                                      Sent: Saturday, September 15, 2018
                                      3:47 AM<br>
                                      To: Kuehling, Felix <<a
                                        href="mailto:Felix.Kuehling@amd.com"
                                        moz-do-not-send="true">Felix.Kuehling@amd.com</a>>;
                                      Yang, Philip <<a
                                        href="mailto:Philip.Yang@amd.com"
                                        moz-do-not-send="true">Philip.Yang@amd.com</a>>;
                                      <a
                                        href="mailto:amd-gfx@lists.freedesktop.org"
                                        moz-do-not-send="true">amd-gfx@lists.freedesktop.org</a>;
                                      Jerome Glisse <<a
                                        href="mailto:j.glisse@gmail.com"
                                        moz-do-not-send="true">j.glisse@gmail.com</a>><br>
                                      Subject: Re: [PATCH] drm/amdgpu:
                                      use HMM mirror callback to replace
                                      mmu notifier v4<br>
                                      <br>
                                      Am 14.09.2018 um 22:21 schrieb
                                      Felix Kuehling:<br>
                                      > On 2018-09-14 01:52 PM,
                                      Christian König wrote:<br>
                                      >> Am 14.09.2018 um 19:47
                                      schrieb Philip Yang:<br>
                                      >>> On 2018-09-14 03:51
                                      AM, Christian König wrote:<br>
                                      >>>> Am 13.09.2018 um
                                      23:51 schrieb Felix Kuehling:<br>
                                      >>>>> On 2018-09-13
                                      04:52 PM, Philip Yang wrote:<br>
                                      >>>>> [SNIP]<br>
                                      >>>>>>    +   
                                      amdgpu_mn_read_unlock(amn);<br>
                                      >>>>>> +<br>
                                      >>>>>
                                      amdgpu_mn_read_lock/unlock support
                                      recursive locking for multiple <br>
                                      >>>>> overlapping
                                      or nested invalidation ranges. But
                                      if you'r locking <br>
                                      >>>>> and unlocking
                                      in the same function. Is that
                                      still a concern?<br>
                                      >>> I don't understand
                                      the possible recursive case, but<br>
                                      >>> amdgpu_mn_read_lock()
                                      still support recursive locking.<br>
                                      >>>> Well the real
                                      problem is that unlocking them
                                      here won't work.<br>
                                      >>>><br>
                                      >>>> We need to hold
                                      the lock until we are sure that
                                      the operation which <br>
                                      >>>> updates the page
                                      tables is completed.<br>
                                      >>>><br>
                                      >>> The reason for this
                                      change is because hmm mirror has <br>
                                      >>> invalidate_start
                                      callback, no invalidate_end
                                      callback<br>
                                      >>><br>
                                      >>> Check mmu_notifier.c
                                      and hmm.c again, below is entire
                                      logic to <br>
                                      >>> update CPU page
                                      tables and callback:<br>
                                      >>><br>
                                      >>> mn lock amn->lock
                                      is used to protect interval tree
                                      access because <br>
                                      >>> user may
                                      submit/register new userptr
                                      anytime.<br>
                                      >>> This is same for old
                                      and new way.<br>
                                      >>><br>
                                      >>> step 2 guarantee the
                                      GPU operation is done before
                                      updating CPU page <br>
                                      >>> table.<br>
                                      >>><br>
                                      >>> So I think the change
                                      is safe. We don't need hold mn
                                      lock until the <br>
                                      >>> CPU page tables
                                      update is completed.<br>
                                      >> No, that isn't even
                                      remotely correct. The lock doesn't
                                      protects the <br>
                                      >> interval tree.<br>
                                      >><br>
                                      >>> Old:<br>
                                      >>>     1.
                                      down_read_non_owner(&amn->lock)<br>
                                      >>>     2. loop to handle
                                      BOs from node->bos through
                                      interval tree<br>
                                      >>> amn->object nodes<br>
                                      >>>         gfx: wait for
                                      pending BOs fence operation done,
                                      mark user <br>
                                      >>> pages dirty<br>
                                      >>>         kfd: evict
                                      user queues of the process, wait
                                      for queue <br>
                                      >>> unmap/map operation
                                      done<br>
                                      >>>     3. update CPU
                                      page tables<br>
                                      >>>     4.
                                      up_read(&amn->lock)<br>
                                      >>><br>
                                      >>> New, switch step 3
                                      and 4<br>
                                      >>>     1.
                                      down_read_non_owner(&amn->lock)<br>
                                      >>>     2. loop to handle
                                      BOs from node->bos through
                                      interval tree<br>
                                      >>> amn->object nodes<br>
                                      >>>         gfx: wait for
                                      pending BOs fence operation done,
                                      mark user <br>
                                      >>> pages dirty<br>
                                      >>>         kfd: evict
                                      user queues of the process, wait
                                      for queue <br>
                                      >>> unmap/map operation
                                      done<br>
                                      >>>     3.
                                      up_read(&amn->lock)<br>
                                      >>>     4. update CPU
                                      page tables<br>
                                      >> The lock is there to make
                                      sure that we serialize page table
                                      updates <br>
                                      >> with command submission.<br>
                                      > As I understand it, the idea
                                      is to prevent command submission
                                      (adding <br>
                                      > new fences to BOs) while a
                                      page table invalidation is in
                                      progress.<br>
                                      <br>
                                      Yes, exactly.<br>
                                      <br>
                                      > But do we really need another
                                      lock for this? Wouldn't the <br>
                                      > re-validation of userptr BOs
                                      (currently calling get_user_pages)
                                      force <br>
                                      > synchronization with the
                                      ongoing page table invalidation
                                      through the <br>
                                      > mmap_sem or other MM locks?<br>
                                      <br>
                                      No and yes. We don't hold any
                                      other locks while doing command
                                      submission, but I expect that HMM
                                      has its own mechanism to prevent
                                      that.<br>
                                      <br>
                                      Since we don't modify
                                      amdgpu_mn_lock()/amdgpu_mn_unlock()
                                      we are certainly not using this
                                      mechanism correctly.<br>
                                      <br>
                                      Regards,<br>
                                      Christian.<br>
_______________________________________________<br>
                                      amd-gfx mailing list<br>
                                      <a
                                        href="mailto:amd-gfx@lists.freedesktop.org"
                                        moz-do-not-send="true">amd-gfx@lists.freedesktop.org</a><br>
                                      <a
                                        href="https://lists.freedesktop.org/mailman/listinfo/amd-gfx"
                                        moz-do-not-send="true">https://lists.freedesktop.org/mailman/listinfo/amd-gfx</a></p>
                                  </div>
                                </blockquote>
                                <p class="MsoNormal"> </p>
                              </blockquote>
                              <p class="MsoNormal"> </p>
                            </blockquote>
                            <p class="MsoNormal"> </p>
                          </div>
                        </div>
                      </div>
                    </div>
                  </div>
                </blockquote>
                <br>
              </blockquote>
              <br>
            </div>
          </blockquote>
          <br>
        </div>
        <br>
        <fieldset class="mimeAttachmentHeader"></fieldset>
        <br>
        <pre wrap="">_______________________________________________
amd-gfx mailing list
<a class="moz-txt-link-abbreviated" href="mailto:amd-gfx@lists.freedesktop.org" moz-do-not-send="true">amd-gfx@lists.freedesktop.org</a>
<a class="moz-txt-link-freetext" href="https://lists.freedesktop.org/mailman/listinfo/amd-gfx" moz-do-not-send="true">https://lists.freedesktop.org/mailman/listinfo/amd-gfx</a>
</pre>
      </blockquote>
      <br>
    </blockquote>
    <br>
  </body>
</html>