<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
  </head>
  <body text="#000000" bgcolor="#FFFFFF">
    <div class="moz-cite-prefix">I was trying to understand the way how
      HMM handle this concurrent issue and how we handle it in
      amdgpu_ttm_tt_userptr_needs_pages() and 
      amdgpu_ttm_tt_affect_userptr(). HMM uses range->valid flag, we
      use gtt->mmu_invalidations and gtt->last_set_pages. Both use
      the same lock plus flag idea actually.<br>
      <br>
      Thanks for the information, now I understand fence
      ttm_eu_fence_buffer_objects() put to BOs will block CPU page table
      update. This is another side of this concurrent issue I didn't
      know.<br>
      <br>
      I had same worry that it has issue without invalidate_range_end()
      callback as the calling sequence Felix lists. Now I think it's
      fine after taking a look again today because of mm->mmap_sem
      usage, this is my understanding:<br>
      <br>
      A path:<br>
      <br>
      down_write(&mm->mmap_sem);<br>
      mmu_notifier_invalidate_range_start()<br>
          take_lock()<br>
          release_lock()<br>
      CPU page table update<br>
      mmu_notifier_invalidate_range_end()<br>
      up_write(&mm->mmap_sem);<br>
      <br>
      B path:<br>
      <br>
      again: <br>
      down_read(&mm->mmap_sem);<br>
      hmm_vma_get_pfns()<br>
      up_read(&mm->mmap_sem);<br>
      ....<br>
      ....<br>
      take_lock()<br>
      if (!hmm_vma_range_done()) {<br>
         release_lock()<br>
         goto again<br>
      }<br>
      submit command job...<br>
      release_lock()<br>
      <br>
      If you agree, I will submit patch v5 with some minor changes, and
      submit another patch to replace get_user_page() with HMM.<br>
       <br>
      Regards,<br>
      Philip  <br>
      <br>
      On 2018-09-27 11:36 AM, Christian König wrote:<br>
    </div>
    <blockquote type="cite"
      cite="mid:11ba3857-9bb0-648e-2806-0533090d9a0a@amd.com">
      <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
      <div class="moz-cite-prefix">Yeah, I've read that as well.<br>
        <br>
        My best guess is that we just need to add a call to
        hmm_vma_range_done() after taking the lock and also replace
        get_user_pages() with hmm_vma_get_pfns().<br>
        <br>
        But I'm still not 100% sure how all of that is supposed to work
        together.<br>
        <br>
        Regards,<br>
        Christian.<br>
        <br>
        Am 27.09.2018 um 16:50 schrieb Kuehling, Felix:<br>
      </div>
      <blockquote type="cite"
cite="mid:DM5PR12MB1707BCD7BFC10EDD594FE90B92140@DM5PR12MB1707.namprd12.prod.outlook.com">
        <meta name="Generator" content="Microsoft Word 15 (filtered
          medium)">
        <style><!--
/* Font Definitions */
@font-face
        {font-family:"Cambria Math";
        panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
        {font-family:Calibri;
        panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
        {font-family:Consolas;
        panose-1:2 11 6 9 2 2 4 3 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
        {margin:0in;
        margin-bottom:.0001pt;
        font-size:11.0pt;
        font-family:"Calibri",sans-serif;
        color:black;}
a:link, span.MsoHyperlink
        {mso-style-priority:99;
        color:blue;
        text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
        {mso-style-priority:99;
        color:purple;
        text-decoration:underline;}
p.MsoPlainText, li.MsoPlainText, div.MsoPlainText
        {mso-style-priority:99;
        mso-style-link:"Plain Text Char";
        margin:0in;
        margin-bottom:.0001pt;
        font-size:11.0pt;
        font-family:"Calibri",sans-serif;
        color:black;}
p.MsoListParagraph, li.MsoListParagraph, div.MsoListParagraph
        {mso-style-priority:34;
        margin-top:0in;
        margin-right:0in;
        margin-bottom:0in;
        margin-left:.5in;
        margin-bottom:.0001pt;
        font-size:11.0pt;
        font-family:"Calibri",sans-serif;
        color:black;}
p.msonormal0, li.msonormal0, div.msonormal0
        {mso-style-name:msonormal;
        mso-margin-top-alt:auto;
        margin-right:0in;
        mso-margin-bottom-alt:auto;
        margin-left:0in;
        font-size:11.0pt;
        font-family:"Calibri",sans-serif;}
span.PlainTextChar
        {mso-style-name:"Plain Text Char";
        mso-style-priority:99;
        mso-style-link:"Plain Text";
        font-family:Consolas;
        color:black;}
p.msonormal00, li.msonormal00, div.msonormal00
        {mso-style-name:msonormal0;
        mso-margin-top-alt:auto;
        margin-right:0in;
        mso-margin-bottom-alt:auto;
        margin-left:0in;
        font-size:11.0pt;
        font-family:"Calibri",sans-serif;}
p.msonormal000, li.msonormal000, div.msonormal000
        {mso-style-name:msonormal00;
        mso-margin-top-alt:auto;
        margin-right:0in;
        mso-margin-bottom-alt:auto;
        margin-left:0in;
        font-size:11.0pt;
        font-family:"Calibri",sans-serif;
        color:black;}
p.emailquote, li.emailquote, div.emailquote
        {mso-style-name:emailquote;
        mso-margin-top-alt:auto;
        margin-right:0in;
        mso-margin-bottom-alt:auto;
        margin-left:1.0pt;
        font-size:11.0pt;
        font-family:"Calibri",sans-serif;
        color:black;}
p.msochpdefault, li.msochpdefault, div.msochpdefault
        {mso-style-name:msochpdefault;
        mso-margin-top-alt:auto;
        margin-right:0in;
        mso-margin-bottom-alt:auto;
        margin-left:0in;
        font-size:10.0pt;
        font-family:"Calibri",sans-serif;}
span.plaintextchar0
        {mso-style-name:plaintextchar;
        font-family:Consolas;
        color:black;}
span.plaintextchar00
        {mso-style-name:plaintextchar0;
        font-family:"Calibri",sans-serif;}
span.emailstyle22
        {mso-style-name:emailstyle22;
        font-family:"Calibri",sans-serif;
        color:windowtext;}
span.emailstyle23
        {mso-style-name:emailstyle23;
        font-family:"Calibri",sans-serif;
        color:windowtext;}
span.emailstyle24
        {mso-style-name:emailstyle24;
        font-family:"Calibri",sans-serif;
        color:windowtext;}
span.emailstyle25
        {mso-style-name:emailstyle25;
        font-family:"Calibri",sans-serif;
        color:windowtext;}
span.emailstyle30
        {mso-style-name:emailstyle30;
        font-family:"Calibri",sans-serif;
        color:windowtext;}
span.EmailStyle33
        {mso-style-type:personal-reply;
        font-family:"Calibri",sans-serif;
        color:windowtext;}
.MsoChpDefault
        {mso-style-type:export-only;
        font-size:10.0pt;}
@page WordSection1
        {size:8.5in 11.0in;
        margin:1.0in 1.0in 1.0in 1.0in;}
div.WordSection1
        {page:WordSection1;}
/* List Definitions */
@list l0
        {mso-list-id:378866647;
        mso-list-template-ids:-897658016;}
@list l0:level1
        {mso-level-number-format:bullet;
        mso-level-text:\F0B7;
        mso-level-tab-stop:.5in;
        mso-level-number-position:left;
        text-indent:-.25in;
        mso-ansi-font-size:10.0pt;
        font-family:Symbol;}
@list l0:level2
        {mso-level-number-format:bullet;
        mso-level-text:\F0B7;
        mso-level-tab-stop:1.0in;
        mso-level-number-position:left;
        text-indent:-.25in;
        mso-ansi-font-size:10.0pt;
        font-family:Symbol;}
@list l0:level3
        {mso-level-number-format:bullet;
        mso-level-text:\F0B7;
        mso-level-tab-stop:1.5in;
        mso-level-number-position:left;
        text-indent:-.25in;
        mso-ansi-font-size:10.0pt;
        font-family:Symbol;}
@list l0:level4
        {mso-level-number-format:bullet;
        mso-level-text:\F0B7;
        mso-level-tab-stop:2.0in;
        mso-level-number-position:left;
        text-indent:-.25in;
        mso-ansi-font-size:10.0pt;
        font-family:Symbol;}
@list l0:level5
        {mso-level-number-format:bullet;
        mso-level-text:\F0B7;
        mso-level-tab-stop:2.5in;
        mso-level-number-position:left;
        text-indent:-.25in;
        mso-ansi-font-size:10.0pt;
        font-family:Symbol;}
@list l0:level6
        {mso-level-number-format:bullet;
        mso-level-text:\F0B7;
        mso-level-tab-stop:3.0in;
        mso-level-number-position:left;
        text-indent:-.25in;
        mso-ansi-font-size:10.0pt;
        font-family:Symbol;}
@list l0:level7
        {mso-level-number-format:bullet;
        mso-level-text:\F0B7;
        mso-level-tab-stop:3.5in;
        mso-level-number-position:left;
        text-indent:-.25in;
        mso-ansi-font-size:10.0pt;
        font-family:Symbol;}
@list l0:level8
        {mso-level-number-format:bullet;
        mso-level-text:\F0B7;
        mso-level-tab-stop:4.0in;
        mso-level-number-position:left;
        text-indent:-.25in;
        mso-ansi-font-size:10.0pt;
        font-family:Symbol;}
@list l0:level9
        {mso-level-number-format:bullet;
        mso-level-text:\F0B7;
        mso-level-tab-stop:4.5in;
        mso-level-number-position:left;
        text-indent:-.25in;
        mso-ansi-font-size:10.0pt;
        font-family:Symbol;}
ol
        {margin-bottom:0in;}
ul
        {margin-bottom:0in;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]-->
        <div class="WordSection1">
          <p class="MsoNormal"><span style="color:windowtext">I think
              the answer is here: <a
href="https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/vm/hmm.rst#n216"
                moz-do-not-send="true">
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/vm/hmm.rst#n216</a><o:p></o:p></span></p>
          <p class="MsoNormal"><span style="color:windowtext"><o:p> </o:p></span></p>
          <p class="MsoNormal"><span style="color:windowtext">Regards,<o:p></o:p></span></p>
          <p class="MsoNormal"><span style="color:windowtext">  Felix<o:p></o:p></span></p>
          <p class="MsoNormal"><span style="color:windowtext"><o:p> </o:p></span></p>
          <div>
            <div style="border:none;border-top:solid #E1E1E1
              1.0pt;padding:3.0pt 0in 0in 0in">
              <p class="MsoNormal"><b><span style="color:windowtext">From:</span></b><span
                  style="color:windowtext"> Koenig, Christian <br>
                  <b>Sent:</b> Thursday, September 27, 2018 10:30 AM<br>
                  <b>To:</b> Kuehling, Felix <a
                    class="moz-txt-link-rfc2396E"
                    href="mailto:Felix.Kuehling@amd.com"
                    moz-do-not-send="true"><Felix.Kuehling@amd.com></a><br>
                  <b>Cc:</b> <a class="moz-txt-link-abbreviated"
                    href="mailto:j.glisse@gmail.com"
                    moz-do-not-send="true">j.glisse@gmail.com</a>; Yang,
                  Philip <a class="moz-txt-link-rfc2396E"
                    href="mailto:Philip.Yang@amd.com"
                    moz-do-not-send="true"><Philip.Yang@amd.com></a>;
                  <a class="moz-txt-link-abbreviated"
                    href="mailto:amd-gfx@lists.freedesktop.org"
                    moz-do-not-send="true">amd-gfx@lists.freedesktop.org</a><br>
                  <b>Subject:</b> RE: [PATCH] drm/amdgpu: use HMM mirror
                  callback to replace mmu notifier v4<o:p></o:p></span></p>
            </div>
          </div>
          <p class="MsoNormal"><o:p> </o:p></p>
          <div>
            <p class="MsoNormal"><span style="color:windowtext">At least
                with get_user_pages() that is perfectly possible. <o:p></o:p></span></p>
            <div>
              <p class="MsoNormal"><span style="color:windowtext"><o:p> </o:p></span></p>
            </div>
            <div>
              <p class="MsoNormal"><span style="color:windowtext">For
                  HMM it could be that this is prevented somehow.<o:p></o:p></span></p>
            </div>
            <div>
              <p class="MsoNormal"><span style="color:windowtext"><o:p> </o:p></span></p>
            </div>
            <div>
              <p class="MsoNormal"><span style="color:windowtext">Christian.<o:p></o:p></span></p>
            </div>
          </div>
          <div>
            <p class="MsoNormal"><span style="color:windowtext"><o:p> </o:p></span></p>
            <div>
              <p class="MsoNormal"><span style="color:windowtext">Am
                  27.09.2018 16:27 schrieb "Kuehling, Felix" <<a
                    href="mailto:Felix.Kuehling@amd.com"
                    moz-do-not-send="true">Felix.Kuehling@amd.com</a>>:<o:p></o:p></span></p>
            </div>
          </div>
          <div>
            <div>
              <p class="MsoNormal"><span style="color:windowtext">>
                  In this case you can end up accessing pages which are
                  invalidated while get_user_pages is in process.</span><o:p></o:p></p>
              <p class="MsoNormal"><span style="color:windowtext"> </span><o:p></o:p></p>
              <p class="MsoNormal"><span style="color:windowtext">What’s
                  the sequence of events you have in mind? Something
                  like this?</span><o:p></o:p></p>
              <p class="MsoNormal"><span style="color:windowtext"> </span><o:p></o:p></p>
              <ul style="margin-top:0in" type="disc">
                <li class="MsoListParagraph"
                  style="color:windowtext;margin-left:0in;mso-list:l0
                  level1 lfo1"> Page table is updated and triggers MMU
                  notifier<o:p></o:p></li>
                <li class="MsoListParagraph"
                  style="color:windowtext;margin-left:0in;mso-list:l0
                  level1 lfo1"> amdgpu MMU notifier runs and waits for
                  pending CS to finish while holding the read lock<o:p></o:p></li>
                <li class="MsoListParagraph"
                  style="color:windowtext;margin-left:0in;mso-list:l0
                  level1 lfo1"> New CS starts just after
                  invalidate_range_start MMU notifier finishes but
                  before the page table update is done<o:p></o:p></li>
                <li class="MsoListParagraph"
                  style="color:windowtext;margin-left:0in;mso-list:l0
                  level1 lfo1"> get_user_pages returns outdated physical
                  addresses<o:p></o:p></li>
              </ul>
              <p class="MsoNormal"><span style="color:windowtext"> </span><o:p></o:p></p>
              <p class="MsoNormal"><span style="color:windowtext">I hope
                  that’s not actually possible and that get_user_pages
                  or hmm_vma_fault would block until the page table
                  update is done. That is, invalidate_range_start marks
                  the start of a page table update, and while that
                  update is in progress, get_user_pages or hmm_vma_fault
                  block. Jerome, can you comment on that?</span><o:p></o:p></p>
              <p class="MsoNormal"><span style="color:windowtext"> </span><o:p></o:p></p>
              <p class="MsoNormal"><span style="color:windowtext">Thanks,<br>
                    Felix</span><o:p></o:p></p>
              <p class="MsoNormal"><span style="color:windowtext"> </span><o:p></o:p></p>
              <div>
                <div style="border:none;border-top:solid #E1E1E1
                  1.0pt;padding:3.0pt 0in 0in 0in">
                  <p class="MsoNormal"><b><span style="color:windowtext">From:</span></b><span
                      style="color:windowtext"> Koenig, Christian <br>
                      <b>Sent:</b> Thursday, September 27, 2018 9:59 AM<br>
                      <b>To:</b> Kuehling, Felix <<a
                        href="mailto:Felix.Kuehling@amd.com"
                        moz-do-not-send="true">Felix.Kuehling@amd.com</a>><br>
                      <b>Cc:</b> Yang, Philip <<a
                        href="mailto:Philip.Yang@amd.com"
                        moz-do-not-send="true">Philip.Yang@amd.com</a>>;
                      <a href="mailto:amd-gfx@lists.freedesktop.org"
                        moz-do-not-send="true">amd-gfx@lists.freedesktop.org</a>;
                      Jerome Glisse <<a
                        href="mailto:j.glisse@gmail.com"
                        moz-do-not-send="true">j.glisse@gmail.com</a>><br>
                      <b>Subject:</b> RE: [PATCH] drm/amdgpu: use HMM
                      mirror callback to replace mmu notifier v4</span><o:p></o:p></p>
                </div>
              </div>
              <p class="MsoNormal"> <o:p></o:p></p>
              <div>
                <p class="MsoNormal"><span style="color:windowtext">Yeah
                    I understand that, but again that won't work. </span><o:p></o:p></p>
                <div>
                  <p class="MsoNormal"><span style="color:windowtext"> </span><o:p></o:p></p>
                </div>
                <div>
                  <p class="MsoNormal"><span style="color:windowtext">In
                      this case you can end up accessing pages which are
                      invalidated while get_user_pages is in process.</span><o:p></o:p></p>
                </div>
                <div>
                  <p class="MsoNormal"><span style="color:windowtext"> </span><o:p></o:p></p>
                </div>
                <div>
                  <p class="MsoNormal"><span style="color:windowtext">Christian.</span><o:p></o:p></p>
                </div>
              </div>
              <div>
                <p class="MsoNormal"><span style="color:windowtext"> </span><o:p></o:p></p>
                <div>
                  <p class="MsoNormal"><span style="color:windowtext">Am
                      27.09.2018 15:41 schrieb "Kuehling, Felix" <<a
                        href="mailto:Felix.Kuehling@amd.com"
                        moz-do-not-send="true">Felix.Kuehling@amd.com</a>>:</span><o:p></o:p></p>
                </div>
              </div>
              <div>
                <div>
                  <p class="MsoNormal"><span style="color:windowtext">>
                      I’m not planning to change that. I don’t think
                      there is any need to change it.</span><o:p></o:p></p>
                  <p class="MsoNormal">><br>
                    > Yeah, but when HMM doesn't provide both the
                    start and the end hock of the invalidation this way
                    won't work any more.<br>
                    ><br>
                    > So we need to find a solution for this,<br>
                    > Christian.<o:p></o:p></p>
                  <p class="MsoNormal"> <o:p></o:p></p>
                  <p class="MsoNormal">My whole argument is that you
                    don’t need to hold the read lock until the
                    invalidate_range_end. Just read_lock and read_unlock
                    in the invalidate_range_start function.<o:p></o:p></p>
                  <p class="MsoNormal"> <o:p></o:p></p>
                  <p class="MsoNormal">Regards,<o:p></o:p></p>
                  <p class="MsoNormal" style="margin-bottom:12.0pt"> 
                    Felix<o:p></o:p></p>
                  <p class="MsoNormal"><span style="color:windowtext"> </span><o:p></o:p></p>
                  <div>
                    <div style="border:none;border-top:solid #E1E1E1
                      1.0pt;padding:3.0pt 0in 0in 0in">
                      <p class="MsoNormal"><b><span
                            style="color:windowtext">From:</span></b><span
                          style="color:windowtext"> Koenig, Christian <br>
                          <b>Sent:</b> Thursday, September 27, 2018 9:22
                          AM<br>
                          <b>To:</b> Kuehling, Felix <<a
                            href="mailto:Felix.Kuehling@amd.com"
                            moz-do-not-send="true">Felix.Kuehling@amd.com</a>><br>
                          <b>Cc:</b> Yang, Philip <<a
                            href="mailto:Philip.Yang@amd.com"
                            moz-do-not-send="true">Philip.Yang@amd.com</a>>;
                          <a href="mailto:amd-gfx@lists.freedesktop.org"
                            moz-do-not-send="true">amd-gfx@lists.freedesktop.org</a>;
                          Jerome Glisse <<a
                            href="mailto:j.glisse@gmail.com"
                            moz-do-not-send="true">j.glisse@gmail.com</a>><br>
                          <b>Subject:</b> Re: [PATCH] drm/amdgpu: use
                          HMM mirror callback to replace mmu notifier v4</span><o:p></o:p></p>
                    </div>
                  </div>
                  <p class="MsoNormal"> <o:p></o:p></p>
                  <div>
                    <p class="MsoNormal">Am 27.09.2018 um 15:18 schrieb
                      Kuehling, Felix:<o:p></o:p></p>
                  </div>
                  <blockquote
                    style="margin-top:5.0pt;margin-bottom:5.0pt">
                    <p class="MsoNormal">> The problem is here:<br>
                      ><o:p></o:p></p>
                    <p class="MsoNormal">>
                      ttm_eu_fence_buffer_objects(&p->ticket,
                      &p->validated, p->fence);<o:p></o:p></p>
                    <p class="MsoNormal">>
                      amdgpu_mn_unlock(p->mn); <o:p></o:p></p>
                    <p class="MsoNormal" style="margin-bottom:12.0pt">><br>
                      > We need to hold the lock until the fence is
                      added to the reservation object.<br>
                      ><br>
                      > Otherwise somebody could have changed the
                      page tables just in the moment between the check
                      of amdgpu_ttm_tt_userptr_needs_pages() and adding
                      the fence to the reservation object.<o:p></o:p></p>
                    <p class="MsoNormal"><span style="color:windowtext">I’m
                        not planning to change that. I don’t think there
                        is any need to change it.</span><o:p></o:p></p>
                  </blockquote>
                  <p class="MsoNormal" style="margin-bottom:12.0pt"><br>
                    Yeah, but when HMM doesn't provide both the start
                    and the end hock of the invalidation this way won't
                    work any more.<br>
                    <br>
                    So we need to find a solution for this,<br>
                    Christian.<o:p></o:p></p>
                  <blockquote
                    style="margin-top:5.0pt;margin-bottom:5.0pt">
                    <p class="MsoNormal"><span style="color:windowtext"> </span><o:p></o:p></p>
                    <p class="MsoNormal"><span style="color:windowtext">Regards,</span><o:p></o:p></p>
                    <p class="MsoNormal"><span style="color:windowtext"> 
                        Felix</span><o:p></o:p></p>
                    <p class="MsoNormal"><span style="color:windowtext"> </span><o:p></o:p></p>
                    <div>
                      <div style="border:none;border-top:solid #E1E1E1
                        1.0pt;padding:3.0pt 0in 0in 0in">
                        <p class="MsoNormal"><b><span
                              style="color:windowtext">From:</span></b><span
                            style="color:windowtext"> Koenig, Christian
                            <br>
                            <b>Sent:</b> Thursday, September 27, 2018
                            7:24 AM<br>
                            <b>To:</b> Kuehling, Felix <a
                              href="mailto:Felix.Kuehling@amd.com"
                              moz-do-not-send="true"><Felix.Kuehling@amd.com></a><br>
                            <b>Cc:</b> Yang, Philip <a
                              href="mailto:Philip.Yang@amd.com"
                              moz-do-not-send="true"><Philip.Yang@amd.com></a>;
                            <a
                              href="mailto:amd-gfx@lists.freedesktop.org"
                              moz-do-not-send="true">amd-gfx@lists.freedesktop.org</a>;
                            Jerome Glisse <a
                              href="mailto:j.glisse@gmail.com"
                              moz-do-not-send="true"><j.glisse@gmail.com></a><br>
                            <b>Subject:</b> Re: [PATCH] drm/amdgpu: use
                            HMM mirror callback to replace mmu notifier
                            v4</span><o:p></o:p></p>
                      </div>
                    </div>
                    <p class="MsoNormal"> <o:p></o:p></p>
                    <div>
                      <p class="MsoNormal">Am 27.09.2018 um 13:08
                        schrieb Kuehling, Felix:<o:p></o:p></p>
                    </div>
                    <blockquote
                      style="margin-top:5.0pt;margin-bottom:5.0pt">
                      <p class="MsoNormal">> We double check that
                        there wasn't any page table modification while
                        we prepared the submission and restart the whole
                        process when there actually was some update.<br>
                        ><br>
                        > The reason why we need to do this is here:<br>
                        ><o:p></o:p></p>
                      <p class="MsoNormal">>       
                        ttm_eu_fence_buffer_objects(&p->ticket,
                        &p->validated, p->fence);<br>
                        >        amdgpu_mn_unlock(p->mn);<o:p></o:p></p>
                      <p class="MsoNormal" style="margin-bottom:12.0pt">><br>
                        > Only after the new fence is added to the
                        buffer object we can release the lock so that
                        any invalidation will now block on our command
                        submission to finish before it modifies the page
                        table.<br>
                        <br>
                        <o:p></o:p></p>
                      <p class="MsoNormal"><span
                          style="color:windowtext">I don’t see why this
                          requires holding the read-lock until
                          invalidate_range_end.
                          amdgpu_ttm_tt_affect_userptr gets called while
                          the mn read-lock is held in
                          invalidate_range_start notifier.</span><o:p></o:p></p>
                    </blockquote>
                    <p class="MsoNormal" style="margin-bottom:12.0pt"><br>
                      That's not related to
                      amdgpu_ttm_tt_affect_userptr(), this function
                      could actually be called outside the lock.<br>
                      <br>
                      The problem is here:<o:p></o:p></p>
                    <blockquote
                      style="margin-top:5.0pt;margin-bottom:5.0pt">
                      <p class="MsoNormal">ttm_eu_fence_buffer_objects(&p->ticket,
                        &p->validated, p->fence);<o:p></o:p></p>
                    </blockquote>
                    <blockquote
                      style="margin-top:5.0pt;margin-bottom:5.0pt">
                      <p class="MsoNormal">amdgpu_mn_unlock(p->mn); <o:p></o:p></p>
                    </blockquote>
                    <p class="MsoNormal" style="margin-bottom:12.0pt"><br>
                      We need to hold the lock until the fence is added
                      to the reservation object.<br>
                      <br>
                      Otherwise somebody could have changed the page
                      tables just in the moment between the check of
                      amdgpu_ttm_tt_userptr_needs_pages() and adding the
                      fence to the reservation object.<br>
                      <br>
                      Regards,<br>
                      Christian.<br>
                      <br>
                      <br>
                      <o:p></o:p></p>
                    <blockquote
                      style="margin-top:5.0pt;margin-bottom:5.0pt">
                      <p class="MsoNormal"><span
                          style="color:windowtext"> </span><o:p></o:p></p>
                      <p class="MsoNormal"><span
                          style="color:windowtext">Regards,</span><o:p></o:p></p>
                      <p class="MsoNormal"><span
                          style="color:windowtext">  Felix</span><o:p></o:p></p>
                      <p class="MsoNormal"><span
                          style="color:windowtext"> </span><o:p></o:p></p>
                      <div>
                        <div style="border:none;border-top:solid #E1E1E1
                          1.0pt;padding:3.0pt 0in 0in 0in">
                          <p class="MsoNormal"><b><span
                                style="color:windowtext">From:</span></b><span
                              style="color:windowtext"> Koenig,
                              Christian <br>
                              <b>Sent:</b> Thursday, September 27, 2018
                              5:27 AM<br>
                              <b>To:</b> Kuehling, Felix <a
                                href="mailto:Felix.Kuehling@amd.com"
                                moz-do-not-send="true"><Felix.Kuehling@amd.com></a><br>
                              <b>Cc:</b> Yang, Philip <a
                                href="mailto:Philip.Yang@amd.com"
                                moz-do-not-send="true"><Philip.Yang@amd.com></a>;
                              <a
                                href="mailto:amd-gfx@lists.freedesktop.org"
                                moz-do-not-send="true">amd-gfx@lists.freedesktop.org</a>;
                              Jerome Glisse <a
                                href="mailto:j.glisse@gmail.com"
                                moz-do-not-send="true"><j.glisse@gmail.com></a><br>
                              <b>Subject:</b> Re: [PATCH] drm/amdgpu:
                              use HMM mirror callback to replace mmu
                              notifier v4</span><o:p></o:p></p>
                        </div>
                      </div>
                      <p class="MsoNormal"> <o:p></o:p></p>
                      <div>
                        <p class="MsoNormal"
                          style="margin-bottom:12.0pt">That is correct,
                          but take a look what we do when after calling
                          the amdgpu_mn_read_lock():<br>
                          <br>
                          <br>
                          <o:p></o:p></p>
                        <blockquote
                          style="margin-top:5.0pt;margin-bottom:5.0pt">
                          <p class="MsoNormal">        /* No memory
                            allocation is allowed while holding the mn
                            lock */<br>
                                    amdgpu_mn_lock(p->mn);<br>
                                   
                            amdgpu_bo_list_for_each_userptr_entry(e,
                            p->bo_list) {<br>
                                            struct amdgpu_bo *bo =
                            ttm_to_amdgpu_bo(e->tv.bo);<br>
                            <br>
                                            if
                            (amdgpu_ttm_tt_userptr_needs_pages(bo->tbo.ttm))
                            {<br>
                                                    r = -ERESTARTSYS;<br>
                                                    goto error_abort;<br>
                                            }<br>
                                    }<o:p></o:p></p>
                        </blockquote>
                        <p class="MsoNormal"
                          style="margin-bottom:12.0pt"><br>
                          We double check that there wasn't any page
                          table modification while we prepared the
                          submission and restart the whole process when
                          there actually was some update.<br>
                          <br>
                          The reason why we need to do this is here:<br>
                          <br>
                          <o:p></o:p></p>
                        <blockquote
                          style="margin-top:5.0pt;margin-bottom:5.0pt">
                          <p class="MsoNormal">       
                            ttm_eu_fence_buffer_objects(&p->ticket,
                            &p->validated, p->fence);<br>
                                    amdgpu_mn_unlock(p->mn);<o:p></o:p></p>
                        </blockquote>
                        <p class="MsoNormal"><br>
                          Only after the new fence is added to the
                          buffer object we can release the lock so that
                          any invalidation will now block on our command
                          submission to finish before it modifies the
                          page table.<br>
                          <br>
                          The only other option would be to add the
                          fence first and then check if there was any
                          update to the page tables.<br>
                          <br>
                          The issue with that approach is that adding a
                          fence can't be made undone, so if we find that
                          there actually was an update to the page
                          tables we would need to somehow turn the CS
                          into a dummy (e.g. overwrite all IBs with NOPs
                          or something like that) and still submit it.<br>
                          <br>
                          Not sure if that is actually possible.<br>
                          <br>
                          Regards,<br>
                          Christian.<br>
                          <br>
                          Am 27.09.2018 um 10:47 schrieb Kuehling,
                          Felix:<o:p></o:p></p>
                      </div>
                      <blockquote
                        style="margin-top:5.0pt;margin-bottom:5.0pt">
                        <p class="MsoNormal">So back to my previous
                          question:<o:p></o:p></p>
                        <p class="MsoPlainText"> <o:p></o:p></p>
                        <p class="MsoPlainText">>> But do we
                          really need another lock for this? Wouldn't
                          the <o:p></o:p></p>
                        <p class="MsoPlainText">>> re-validation
                          of userptr BOs (currently calling
                          get_user_pages) force <o:p></o:p></p>
                        <p class="MsoPlainText">>> synchronization
                          with the ongoing page table invalidation
                          through the <o:p></o:p></p>
                        <p class="MsoPlainText">>> mmap_sem or
                          other MM locks?<o:p></o:p></p>
                        <p class="MsoPlainText">> <o:p></o:p></p>
                        <p class="MsoPlainText">> No and yes. We
                          don't hold any other locks while doing command
                          submission, but I expect that HMM has its own
                          mechanism to prevent that.<o:p></o:p></p>
                        <p class="MsoPlainText">> <o:p></o:p></p>
                        <p class="MsoPlainText">> Since we don't
                          modify amdgpu_mn_lock()/amdgpu_mn_unlock() we
                          are certainly not using this mechanism
                          correctly.<o:p></o:p></p>
                        <p class="MsoNormal"> <o:p></o:p></p>
                        <p class="MsoNormal">The existing
                          amdgpu_mn_lock/unlock should block the MMU
                          notifier while a command submission is in
                          progress. It should also block command
                          submission while an MMU notifier is in
                          progress.<o:p></o:p></p>
                        <p class="MsoNormal"> <o:p></o:p></p>
                        <p class="MsoNormal">What we lose with HMM is
                          the ability to hold a read-lock for the entire
                          duration of the invalidate_range_start until
                          invalidate_range_end. As I understand it, that
                          lock is meant to prevent new command
                          submissions while the page tables are being
                          updated by the kernel. But my point is, that
                          get_user_pages or hmm_vma_fault should do the
                          same kind of thing. Before the command
                          submission can go ahead, it needs to update
                          the userptr addresses. If the page tables are
                          still being updated, it will block there even
                          without holding the amdgpu_mn_read_lock.<o:p></o:p></p>
                        <p class="MsoNormal"> <o:p></o:p></p>
                        <p class="MsoNormal">Regards,<o:p></o:p></p>
                        <p class="MsoNormal">  Felix<o:p></o:p></p>
                        <p class="MsoNormal"> <o:p></o:p></p>
                        <div>
                          <div style="border:none;border-top:solid
                            #E1E1E1 1.0pt;padding:3.0pt 0in 0in 0in">
                            <p class="MsoNormal"><b>From:</b> Koenig,
                              Christian <br>
                              <b>Sent:</b> Thursday, September 27, 2018
                              3:00 AM<br>
                              <b>To:</b> Kuehling, Felix <a
                                href="mailto:Felix.Kuehling@amd.com"
                                moz-do-not-send="true"><Felix.Kuehling@amd.com></a><br>
                              <b>Cc:</b> Yang, Philip <a
                                href="mailto:Philip.Yang@amd.com"
                                moz-do-not-send="true"><Philip.Yang@amd.com></a>;
                              <a
                                href="mailto:amd-gfx@lists.freedesktop.org"
                                moz-do-not-send="true">amd-gfx@lists.freedesktop.org</a>;
                              Jerome Glisse <a
                                href="mailto:j.glisse@gmail.com"
                                moz-do-not-send="true"><j.glisse@gmail.com></a><br>
                              <b>Subject:</b> RE: [PATCH] drm/amdgpu:
                              use HMM mirror callback to replace mmu
                              notifier v4<o:p></o:p></p>
                          </div>
                        </div>
                        <p class="MsoNormal"> <o:p></o:p></p>
                        <div>
                          <div>
                            <p class="MsoNormal">No, that won't work. We
                              would still run into lock inversion
                              problems.<o:p></o:p></p>
                            <div>
                              <p class="MsoNormal"> <o:p></o:p></p>
                            </div>
                            <div>
                              <p class="MsoNormal">What we could do with
                                the scheduler is to turn submissions
                                into dummies if we find that the page
                                tables are now outdated.<o:p></o:p></p>
                            </div>
                            <div>
                              <p class="MsoNormal"> <o:p></o:p></p>
                            </div>
                            <div>
                              <p class="MsoNormal">But that would be
                                really hacky and I'm not sure if that
                                would really work in all cases.<o:p></o:p></p>
                            </div>
                            <div>
                              <p class="MsoNormal"> <o:p></o:p></p>
                            </div>
                            <div>
                              <p class="MsoNormal">Christian.<o:p></o:p></p>
                            </div>
                          </div>
                          <div>
                            <p class="MsoNormal"> <o:p></o:p></p>
                            <div>
                              <p class="MsoNormal">Am 27.09.2018 08:53
                                schrieb "Kuehling, Felix" <<a
                                  href="mailto:Felix.Kuehling@amd.com"
                                  moz-do-not-send="true">Felix.Kuehling@amd.com</a>>:<o:p></o:p></p>
                            </div>
                          </div>
                        </div>
                        <div>
                          <p class="MsoNormal">I had a chat with Jerome
                            yesterday. He pointed out that the new
                            blockable parameter can be used to infer
                            whether the MMU notifier is being called  in
                            a reclaim operation. So if blockable==true,
                            it should even be safe to take the BO
                            reservation lock without problems. I think
                            with that we should be able to remove the
                            read-write locking completely and go back to
                            locking (or try-locking for
                            blockable==false) the reservation locks in
                            the MMU notifier?<br>
                            <br>
                            Regards,<br>
                              Felix<br>
                            <br>
                            -----Original Message-----<br>
                            From: amd-gfx <<a
                              href="mailto:amd-gfx-bounces@lists.freedesktop.org"
                              moz-do-not-send="true">amd-gfx-bounces@lists.freedesktop.org</a>>
                            On Behalf Of Christian König<br>
                            Sent: Saturday, September 15, 2018 3:47 AM<br>
                            To: Kuehling, Felix <<a
                              href="mailto:Felix.Kuehling@amd.com"
                              moz-do-not-send="true">Felix.Kuehling@amd.com</a>>;
                            Yang, Philip <<a
                              href="mailto:Philip.Yang@amd.com"
                              moz-do-not-send="true">Philip.Yang@amd.com</a>>;
                            <a
                              href="mailto:amd-gfx@lists.freedesktop.org"
                              moz-do-not-send="true">amd-gfx@lists.freedesktop.org</a>;
                            Jerome Glisse <<a
                              href="mailto:j.glisse@gmail.com"
                              moz-do-not-send="true">j.glisse@gmail.com</a>><br>
                            Subject: Re: [PATCH] drm/amdgpu: use HMM
                            mirror callback to replace mmu notifier v4<br>
                            <br>
                            Am 14.09.2018 um 22:21 schrieb Felix
                            Kuehling:<br>
                            > On 2018-09-14 01:52 PM, Christian König
                            wrote:<br>
                            >> Am 14.09.2018 um 19:47 schrieb
                            Philip Yang:<br>
                            >>> On 2018-09-14 03:51 AM,
                            Christian König wrote:<br>
                            >>>> Am 13.09.2018 um 23:51
                            schrieb Felix Kuehling:<br>
                            >>>>> On 2018-09-13 04:52 PM,
                            Philip Yang wrote:<br>
                            >>>>> [SNIP]<br>
                            >>>>>>    +   
                            amdgpu_mn_read_unlock(amn);<br>
                            >>>>>> +<br>
                            >>>>>
                            amdgpu_mn_read_lock/unlock support recursive
                            locking for multiple <br>
                            >>>>> overlapping or nested
                            invalidation ranges. But if you'r locking <br>
                            >>>>> and unlocking in the
                            same function. Is that still a concern?<br>
                            >>> I don't understand the possible
                            recursive case, but<br>
                            >>> amdgpu_mn_read_lock() still
                            support recursive locking.<br>
                            >>>> Well the real problem is
                            that unlocking them here won't work.<br>
                            >>>><br>
                            >>>> We need to hold the lock
                            until we are sure that the operation which <br>
                            >>>> updates the page tables is
                            completed.<br>
                            >>>><br>
                            >>> The reason for this change is
                            because hmm mirror has <br>
                            >>> invalidate_start callback, no
                            invalidate_end callback<br>
                            >>><br>
                            >>> Check mmu_notifier.c and hmm.c
                            again, below is entire logic to <br>
                            >>> update CPU page tables and
                            callback:<br>
                            >>><br>
                            >>> mn lock amn->lock is used to
                            protect interval tree access because <br>
                            >>> user may submit/register new
                            userptr anytime.<br>
                            >>> This is same for old and new
                            way.<br>
                            >>><br>
                            >>> step 2 guarantee the GPU
                            operation is done before updating CPU page <br>
                            >>> table.<br>
                            >>><br>
                            >>> So I think the change is safe.
                            We don't need hold mn lock until the <br>
                            >>> CPU page tables update is
                            completed.<br>
                            >> No, that isn't even remotely
                            correct. The lock doesn't protects the <br>
                            >> interval tree.<br>
                            >><br>
                            >>> Old:<br>
                            >>>     1.
                            down_read_non_owner(&amn->lock)<br>
                            >>>     2. loop to handle BOs from
                            node->bos through interval tree<br>
                            >>> amn->object nodes<br>
                            >>>         gfx: wait for pending
                            BOs fence operation done, mark user <br>
                            >>> pages dirty<br>
                            >>>         kfd: evict user queues
                            of the process, wait for queue <br>
                            >>> unmap/map operation done<br>
                            >>>     3. update CPU page tables<br>
                            >>>     4.
                            up_read(&amn->lock)<br>
                            >>><br>
                            >>> New, switch step 3 and 4<br>
                            >>>     1.
                            down_read_non_owner(&amn->lock)<br>
                            >>>     2. loop to handle BOs from
                            node->bos through interval tree<br>
                            >>> amn->object nodes<br>
                            >>>         gfx: wait for pending
                            BOs fence operation done, mark user <br>
                            >>> pages dirty<br>
                            >>>         kfd: evict user queues
                            of the process, wait for queue <br>
                            >>> unmap/map operation done<br>
                            >>>     3.
                            up_read(&amn->lock)<br>
                            >>>     4. update CPU page tables<br>
                            >> The lock is there to make sure that
                            we serialize page table updates <br>
                            >> with command submission.<br>
                            > As I understand it, the idea is to
                            prevent command submission (adding <br>
                            > new fences to BOs) while a page table
                            invalidation is in progress.<br>
                            <br>
                            Yes, exactly.<br>
                            <br>
                            > But do we really need another lock for
                            this? Wouldn't the <br>
                            > re-validation of userptr BOs (currently
                            calling get_user_pages) force <br>
                            > synchronization with the ongoing page
                            table invalidation through the <br>
                            > mmap_sem or other MM locks?<br>
                            <br>
                            No and yes. We don't hold any other locks
                            while doing command submission, but I expect
                            that HMM has its own mechanism to prevent
                            that.<br>
                            <br>
                            Since we don't modify
                            amdgpu_mn_lock()/amdgpu_mn_unlock() we are
                            certainly not using this mechanism
                            correctly.<br>
                            <br>
                            Regards,<br>
                            Christian.<br>
_______________________________________________<br>
                            amd-gfx mailing list<br>
                            <a
                              href="mailto:amd-gfx@lists.freedesktop.org"
                              moz-do-not-send="true">amd-gfx@lists.freedesktop.org</a><br>
                            <a
                              href="https://lists.freedesktop.org/mailman/listinfo/amd-gfx"
                              moz-do-not-send="true">https://lists.freedesktop.org/mailman/listinfo/amd-gfx</a><o:p></o:p></p>
                        </div>
                      </blockquote>
                      <p class="MsoNormal"> <o:p></o:p></p>
                    </blockquote>
                    <p class="MsoNormal"> <o:p></o:p></p>
                  </blockquote>
                  <p class="MsoNormal"> <o:p></o:p></p>
                </div>
              </div>
            </div>
          </div>
        </div>
      </blockquote>
      <br>
    </blockquote>
    <br>
  </body>
</html>