<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
  </head>
  <body text="#000000" bgcolor="#FFFFFF">
    <div class="moz-cite-prefix">Am 27.09.2018 um 15:18 schrieb
      Kuehling, Felix:<br>
    </div>
    <blockquote type="cite"
cite="mid:DM5PR12MB17072AE77EB0DE3AD22B8D7892140@DM5PR12MB1707.namprd12.prod.outlook.com">
      <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
      <meta name="Generator" content="Microsoft Word 15 (filtered
        medium)">
      <style><!--
/* Font Definitions */
@font-face
        {font-family:"Cambria Math";
        panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
        {font-family:Calibri;
        panose-1:2 15 5 2 2 2 4 3 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
        {margin:0in;
        margin-bottom:.0001pt;
        font-size:11.0pt;
        font-family:"Calibri",sans-serif;
        color:black;}
a:link, span.MsoHyperlink
        {mso-style-priority:99;
        color:blue;
        text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
        {mso-style-priority:99;
        color:purple;
        text-decoration:underline;}
p.MsoPlainText, li.MsoPlainText, div.MsoPlainText
        {mso-style-priority:99;
        mso-style-link:"Plain Text Char";
        margin:0in;
        margin-bottom:.0001pt;
        font-size:11.0pt;
        font-family:"Calibri",sans-serif;
        color:black;}
p.MsoListParagraph, li.MsoListParagraph, div.MsoListParagraph
        {mso-style-priority:34;
        margin-top:0in;
        margin-right:0in;
        margin-bottom:0in;
        margin-left:.5in;
        margin-bottom:.0001pt;
        font-size:11.0pt;
        font-family:"Calibri",sans-serif;
        color:black;}
p.msonormal0, li.msonormal0, div.msonormal0
        {mso-style-name:msonormal;
        mso-margin-top-alt:auto;
        margin-right:0in;
        mso-margin-bottom-alt:auto;
        margin-left:0in;
        font-size:11.0pt;
        font-family:"Calibri",sans-serif;
        color:black;}
span.PlainTextChar
        {mso-style-name:"Plain Text Char";
        mso-style-priority:99;
        mso-style-link:"Plain Text";
        font-family:"Calibri",sans-serif;}
p.emailquote, li.emailquote, div.emailquote
        {mso-style-name:emailquote;
        mso-margin-top-alt:auto;
        margin-right:0in;
        mso-margin-bottom-alt:auto;
        margin-left:1.0pt;
        font-size:11.0pt;
        font-family:"Calibri",sans-serif;
        color:black;}
span.EmailStyle22
        {mso-style-type:personal;
        font-family:"Calibri",sans-serif;
        color:windowtext;}
span.EmailStyle23
        {mso-style-type:personal;
        font-family:"Calibri",sans-serif;
        color:windowtext;}
span.EmailStyle24
        {mso-style-type:personal-reply;
        font-family:"Calibri",sans-serif;
        color:windowtext;}
.MsoChpDefault
        {mso-style-type:export-only;
        font-size:10.0pt;}
@page WordSection1
        {size:8.5in 11.0in;
        margin:1.0in 1.0in 1.0in 1.0in;}
div.WordSection1
        {page:WordSection1;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]-->
      <div class="WordSection1">
        <p class="MsoNormal">> The problem is here:<br>
          ><o:p></o:p></p>
        <p class="MsoNormal">>
          ttm_eu_fence_buffer_objects(&p->ticket,
          &p->validated, p->fence);<o:p></o:p></p>
        <p class="MsoNormal">> amdgpu_mn_unlock(p->mn); <o:p></o:p></p>
        <p class="MsoNormal">><br>
          > We need to hold the lock until the fence is added to the
          reservation object.<br>
          ><br>
          > Otherwise somebody could have changed the page tables
          just in the moment between the check of
          amdgpu_ttm_tt_userptr_needs_pages() and adding the fence to
          the reservation object.<br>
          <br>
          <span style="color:windowtext"><o:p></o:p></span></p>
        <p class="MsoNormal"><span style="color:windowtext">I’m not
            planning to change that. I don’t think there is any need to
            change it.</span></p>
      </div>
    </blockquote>
    <br>
    Yeah, but when HMM doesn't provide both the start and the end hock
    of the invalidation this way won't work any more.<br>
    <br>
    So we need to find a solution for this,<br>
    Christian.<br>
    <br>
    <blockquote type="cite"
cite="mid:DM5PR12MB17072AE77EB0DE3AD22B8D7892140@DM5PR12MB1707.namprd12.prod.outlook.com">
      <div class="WordSection1">
        <p class="MsoNormal"><span style="color:windowtext"><o:p></o:p></span></p>
        <p class="MsoNormal"><span style="color:windowtext"><o:p> </o:p></span></p>
        <p class="MsoNormal"><span style="color:windowtext">Regards,<o:p></o:p></span></p>
        <p class="MsoNormal"><span style="color:windowtext">  Felix<o:p></o:p></span></p>
        <p class="MsoNormal"><span style="color:windowtext"><o:p> </o:p></span></p>
        <div>
          <div style="border:none;border-top:solid #E1E1E1
            1.0pt;padding:3.0pt 0in 0in 0in">
            <p class="MsoNormal"><b><span style="color:windowtext">From:</span></b><span
                style="color:windowtext"> Koenig, Christian
                <br>
                <b>Sent:</b> Thursday, September 27, 2018 7:24 AM<br>
                <b>To:</b> Kuehling, Felix
                <a class="moz-txt-link-rfc2396E" href="mailto:Felix.Kuehling@amd.com"><Felix.Kuehling@amd.com></a><br>
                <b>Cc:</b> Yang, Philip <a class="moz-txt-link-rfc2396E" href="mailto:Philip.Yang@amd.com"><Philip.Yang@amd.com></a>;
                <a class="moz-txt-link-abbreviated" href="mailto:amd-gfx@lists.freedesktop.org">amd-gfx@lists.freedesktop.org</a>; Jerome Glisse
                <a class="moz-txt-link-rfc2396E" href="mailto:j.glisse@gmail.com"><j.glisse@gmail.com></a><br>
                <b>Subject:</b> Re: [PATCH] drm/amdgpu: use HMM mirror
                callback to replace mmu notifier v4<o:p></o:p></span></p>
          </div>
        </div>
        <p class="MsoNormal"><o:p> </o:p></p>
        <div>
          <p class="MsoNormal">Am 27.09.2018 um 13:08 schrieb Kuehling,
            Felix:<o:p></o:p></p>
        </div>
        <blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">
          <p class="MsoNormal">> We double check that there wasn't
            any page table modification while we prepared the submission
            and restart the whole process when there actually was some
            update.<br>
            ><br>
            > The reason why we need to do this is here:<br>
            ><o:p></o:p></p>
          <p class="MsoNormal">>       
            ttm_eu_fence_buffer_objects(&p->ticket,
            &p->validated, p->fence);<br>
            >        amdgpu_mn_unlock(p->mn);<o:p></o:p></p>
          <p class="MsoNormal">><br>
            > Only after the new fence is added to the buffer object
            we can release the lock so that any invalidation will now
            block on our command submission to finish before it modifies
            the page table.<br>
            <br>
            <br>
            <o:p></o:p></p>
          <p class="MsoNormal"><span style="color:windowtext">I don’t
              see why this requires holding the read-lock until
              invalidate_range_end. amdgpu_ttm_tt_affect_userptr gets
              called while the mn read-lock is held in
              invalidate_range_start notifier.</span><o:p></o:p></p>
        </blockquote>
        <p class="MsoNormal"><br>
          That's not related to amdgpu_ttm_tt_affect_userptr(), this
          function could actually be called outside the lock.<br>
          <br>
          The problem is here:<br>
          <br>
          <o:p></o:p></p>
        <blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">
          <p class="MsoNormal">ttm_eu_fence_buffer_objects(&p->ticket,
            &p->validated, p->fence);<o:p></o:p></p>
        </blockquote>
        <blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">
          <p class="MsoNormal">amdgpu_mn_unlock(p->mn); <o:p></o:p></p>
        </blockquote>
        <p class="MsoNormal"><br>
          We need to hold the lock until the fence is added to the
          reservation object.<br>
          <br>
          Otherwise somebody could have changed the page tables just in
          the moment between the check of
          amdgpu_ttm_tt_userptr_needs_pages() and adding the fence to
          the reservation object.<br>
          <br>
          Regards,<br>
          Christian.<br>
          <br>
          <br>
          <br>
          <o:p></o:p></p>
        <blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">
          <p class="MsoNormal"><span style="color:windowtext"> </span><o:p></o:p></p>
          <p class="MsoNormal"><span style="color:windowtext">Regards,</span><o:p></o:p></p>
          <p class="MsoNormal"><span style="color:windowtext">  Felix</span><o:p></o:p></p>
          <p class="MsoNormal"><span style="color:windowtext"> </span><o:p></o:p></p>
          <div>
            <div style="border:none;border-top:solid #E1E1E1
              1.0pt;padding:3.0pt 0in 0in 0in">
              <p class="MsoNormal"><b><span style="color:windowtext">From:</span></b><span
                  style="color:windowtext"> Koenig, Christian
                  <br>
                  <b>Sent:</b> Thursday, September 27, 2018 5:27 AM<br>
                  <b>To:</b> Kuehling, Felix <a
                    href="mailto:Felix.Kuehling@amd.com"
                    moz-do-not-send="true"><Felix.Kuehling@amd.com></a><br>
                  <b>Cc:</b> Yang, Philip <a
                    href="mailto:Philip.Yang@amd.com"
                    moz-do-not-send="true"><Philip.Yang@amd.com></a>;
                  <a href="mailto:amd-gfx@lists.freedesktop.org"
                    moz-do-not-send="true">amd-gfx@lists.freedesktop.org</a>;
                  Jerome Glisse
                  <a href="mailto:j.glisse@gmail.com"
                    moz-do-not-send="true"><j.glisse@gmail.com></a><br>
                  <b>Subject:</b> Re: [PATCH] drm/amdgpu: use HMM mirror
                  callback to replace mmu notifier v4</span><o:p></o:p></p>
            </div>
          </div>
          <p class="MsoNormal"> <o:p></o:p></p>
          <div>
            <p class="MsoNormal">That is correct, but take a look what
              we do when after calling the amdgpu_mn_read_lock():<br>
              <br>
              <br>
              <br>
              <o:p></o:p></p>
            <blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">
              <p class="MsoNormal">        /* No memory allocation is
                allowed while holding the mn lock */<br>
                        amdgpu_mn_lock(p->mn);<br>
                        amdgpu_bo_list_for_each_userptr_entry(e,
                p->bo_list) {<br>
                                struct amdgpu_bo *bo =
                ttm_to_amdgpu_bo(e->tv.bo);<br>
                <br>
                                if
                (amdgpu_ttm_tt_userptr_needs_pages(bo->tbo.ttm)) {<br>
                                        r = -ERESTARTSYS;<br>
                                        goto error_abort;<br>
                                }<br>
                        }<o:p></o:p></p>
            </blockquote>
            <p class="MsoNormal"><br>
              We double check that there wasn't any page table
              modification while we prepared the submission and restart
              the whole process when there actually was some update.<br>
              <br>
              The reason why we need to do this is here:<br>
              <br>
              <br>
              <o:p></o:p></p>
            <blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">
              <p class="MsoNormal">       
                ttm_eu_fence_buffer_objects(&p->ticket,
                &p->validated, p->fence);<br>
                        amdgpu_mn_unlock(p->mn);<o:p></o:p></p>
            </blockquote>
            <p class="MsoNormal"><br>
              Only after the new fence is added to the buffer object we
              can release the lock so that any invalidation will now
              block on our command submission to finish before it
              modifies the page table.<br>
              <br>
              The only other option would be to add the fence first and
              then check if there was any update to the page tables.<br>
              <br>
              The issue with that approach is that adding a fence can't
              be made undone, so if we find that there actually was an
              update to the page tables we would need to somehow turn
              the CS into a dummy (e.g. overwrite all IBs with NOPs or
              something like that) and still submit it.<br>
              <br>
              Not sure if that is actually possible.<br>
              <br>
              Regards,<br>
              Christian.<br>
              <br>
              Am 27.09.2018 um 10:47 schrieb Kuehling, Felix:<o:p></o:p></p>
          </div>
          <blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">
            <p class="MsoNormal">So back to my previous question:<o:p></o:p></p>
            <p class="MsoPlainText"> <o:p></o:p></p>
            <p class="MsoPlainText">>> But do we really need
              another lock for this? Wouldn't the
              <o:p></o:p></p>
            <p class="MsoPlainText">>> re-validation of userptr
              BOs (currently calling get_user_pages) force
              <o:p></o:p></p>
            <p class="MsoPlainText">>> synchronization with the
              ongoing page table invalidation through the
              <o:p></o:p></p>
            <p class="MsoPlainText">>> mmap_sem or other MM locks?<o:p></o:p></p>
            <p class="MsoPlainText">> <o:p></o:p></p>
            <p class="MsoPlainText">> No and yes. We don't hold any
              other locks while doing command submission, but I expect
              that HMM has its own mechanism to prevent that.<o:p></o:p></p>
            <p class="MsoPlainText">> <o:p></o:p></p>
            <p class="MsoPlainText">> Since we don't modify
              amdgpu_mn_lock()/amdgpu_mn_unlock() we are certainly not
              using this mechanism correctly.<o:p></o:p></p>
            <p class="MsoNormal"> <o:p></o:p></p>
            <p class="MsoNormal">The existing amdgpu_mn_lock/unlock
              should block the MMU notifier while a command submission
              is in progress. It should also block command submission
              while an MMU notifier is in progress.<o:p></o:p></p>
            <p class="MsoNormal"> <o:p></o:p></p>
            <p class="MsoNormal">What we lose with HMM is the ability to
              hold a read-lock for the entire duration of the
              invalidate_range_start until invalidate_range_end. As I
              understand it, that lock is meant to prevent new command
              submissions while the page tables are being updated by the
              kernel. But my point is, that get_user_pages or
              hmm_vma_fault should do the same kind of thing. Before the
              command submission can go ahead, it needs to update the
              userptr addresses. If the page tables are still being
              updated, it will block there even without holding the
              amdgpu_mn_read_lock.<o:p></o:p></p>
            <p class="MsoNormal"> <o:p></o:p></p>
            <p class="MsoNormal">Regards,<o:p></o:p></p>
            <p class="MsoNormal">  Felix<o:p></o:p></p>
            <p class="MsoNormal"> <o:p></o:p></p>
            <div>
              <div style="border:none;border-top:solid #E1E1E1
                1.0pt;padding:3.0pt 0in 0in 0in">
                <p class="MsoNormal"><b>From:</b> Koenig, Christian <br>
                  <b>Sent:</b> Thursday, September 27, 2018 3:00 AM<br>
                  <b>To:</b> Kuehling, Felix <a
                    href="mailto:Felix.Kuehling@amd.com"
                    moz-do-not-send="true"><Felix.Kuehling@amd.com></a><br>
                  <b>Cc:</b> Yang, Philip <a
                    href="mailto:Philip.Yang@amd.com"
                    moz-do-not-send="true"><Philip.Yang@amd.com></a>;
                  <a href="mailto:amd-gfx@lists.freedesktop.org"
                    moz-do-not-send="true">amd-gfx@lists.freedesktop.org</a>;
                  Jerome Glisse
                  <a href="mailto:j.glisse@gmail.com"
                    moz-do-not-send="true"><j.glisse@gmail.com></a><br>
                  <b>Subject:</b> RE: [PATCH] drm/amdgpu: use HMM mirror
                  callback to replace mmu notifier v4<o:p></o:p></p>
              </div>
            </div>
            <p class="MsoNormal"> <o:p></o:p></p>
            <div>
              <div>
                <p class="MsoNormal">No, that won't work. We would still
                  run into lock inversion problems.<o:p></o:p></p>
                <div>
                  <p class="MsoNormal"> <o:p></o:p></p>
                </div>
                <div>
                  <p class="MsoNormal">What we could do with the
                    scheduler is to turn submissions into dummies if we
                    find that the page tables are now outdated.<o:p></o:p></p>
                </div>
                <div>
                  <p class="MsoNormal"> <o:p></o:p></p>
                </div>
                <div>
                  <p class="MsoNormal">But that would be really hacky
                    and I'm not sure if that would really work in all
                    cases.<o:p></o:p></p>
                </div>
                <div>
                  <p class="MsoNormal"> <o:p></o:p></p>
                </div>
                <div>
                  <p class="MsoNormal">Christian.<o:p></o:p></p>
                </div>
              </div>
              <div>
                <p class="MsoNormal"> <o:p></o:p></p>
                <div>
                  <p class="MsoNormal">Am 27.09.2018 08:53 schrieb
                    "Kuehling, Felix" <<a
                      href="mailto:Felix.Kuehling@amd.com"
                      moz-do-not-send="true">Felix.Kuehling@amd.com</a>>:<o:p></o:p></p>
                </div>
              </div>
            </div>
            <div>
              <p class="MsoNormal">I had a chat with Jerome yesterday.
                He pointed out that the new blockable parameter can be
                used to infer whether the MMU notifier is being called 
                in a reclaim operation. So if blockable==true, it should
                even be safe to take the BO reservation lock without
                problems. I think with that we should be able to remove
                the read-write locking completely and go back to locking
                (or try-locking for blockable==false) the reservation
                locks in the MMU notifier?<br>
                <br>
                Regards,<br>
                  Felix<br>
                <br>
                -----Original Message-----<br>
                From: amd-gfx <<a
                  href="mailto:amd-gfx-bounces@lists.freedesktop.org"
                  moz-do-not-send="true">amd-gfx-bounces@lists.freedesktop.org</a>>
                On Behalf Of Christian König<br>
                Sent: Saturday, September 15, 2018 3:47 AM<br>
                To: Kuehling, Felix <<a
                  href="mailto:Felix.Kuehling@amd.com"
                  moz-do-not-send="true">Felix.Kuehling@amd.com</a>>;
                Yang, Philip <<a href="mailto:Philip.Yang@amd.com"
                  moz-do-not-send="true">Philip.Yang@amd.com</a>>;
                <a href="mailto:amd-gfx@lists.freedesktop.org"
                  moz-do-not-send="true">amd-gfx@lists.freedesktop.org</a>;
                Jerome Glisse <<a href="mailto:j.glisse@gmail.com"
                  moz-do-not-send="true">j.glisse@gmail.com</a>><br>
                Subject: Re: [PATCH] drm/amdgpu: use HMM mirror callback
                to replace mmu notifier v4<br>
                <br>
                Am 14.09.2018 um 22:21 schrieb Felix Kuehling:<br>
                > On 2018-09-14 01:52 PM, Christian König wrote:<br>
                >> Am 14.09.2018 um 19:47 schrieb Philip Yang:<br>
                >>> On 2018-09-14 03:51 AM, Christian König
                wrote:<br>
                >>>> Am 13.09.2018 um 23:51 schrieb Felix
                Kuehling:<br>
                >>>>> On 2018-09-13 04:52 PM, Philip Yang
                wrote:<br>
                >>>>> [SNIP]<br>
                >>>>>>    +   
                amdgpu_mn_read_unlock(amn);<br>
                >>>>>> +<br>
                >>>>> amdgpu_mn_read_lock/unlock support
                recursive locking for multiple <br>
                >>>>> overlapping or nested invalidation
                ranges. But if you'r locking <br>
                >>>>> and unlocking in the same function.
                Is that still a concern?<br>
                >>> I don't understand the possible recursive
                case, but<br>
                >>> amdgpu_mn_read_lock() still support
                recursive locking.<br>
                >>>> Well the real problem is that unlocking
                them here won't work.<br>
                >>>><br>
                >>>> We need to hold the lock until we are
                sure that the operation which <br>
                >>>> updates the page tables is completed.<br>
                >>>><br>
                >>> The reason for this change is because hmm
                mirror has <br>
                >>> invalidate_start callback, no
                invalidate_end callback<br>
                >>><br>
                >>> Check mmu_notifier.c and hmm.c again, below
                is entire logic to <br>
                >>> update CPU page tables and callback:<br>
                >>><br>
                >>> mn lock amn->lock is used to protect
                interval tree access because <br>
                >>> user may submit/register new userptr
                anytime.<br>
                >>> This is same for old and new way.<br>
                >>><br>
                >>> step 2 guarantee the GPU operation is done
                before updating CPU page <br>
                >>> table.<br>
                >>><br>
                >>> So I think the change is safe. We don't
                need hold mn lock until the <br>
                >>> CPU page tables update is completed.<br>
                >> No, that isn't even remotely correct. The lock
                doesn't protects the <br>
                >> interval tree.<br>
                >><br>
                >>> Old:<br>
                >>>     1.
                down_read_non_owner(&amn->lock)<br>
                >>>     2. loop to handle BOs from node->bos
                through interval tree<br>
                >>> amn->object nodes<br>
                >>>         gfx: wait for pending BOs fence
                operation done, mark user <br>
                >>> pages dirty<br>
                >>>         kfd: evict user queues of the
                process, wait for queue <br>
                >>> unmap/map operation done<br>
                >>>     3. update CPU page tables<br>
                >>>     4. up_read(&amn->lock)<br>
                >>><br>
                >>> New, switch step 3 and 4<br>
                >>>     1.
                down_read_non_owner(&amn->lock)<br>
                >>>     2. loop to handle BOs from node->bos
                through interval tree<br>
                >>> amn->object nodes<br>
                >>>         gfx: wait for pending BOs fence
                operation done, mark user <br>
                >>> pages dirty<br>
                >>>         kfd: evict user queues of the
                process, wait for queue <br>
                >>> unmap/map operation done<br>
                >>>     3. up_read(&amn->lock)<br>
                >>>     4. update CPU page tables<br>
                >> The lock is there to make sure that we
                serialize page table updates <br>
                >> with command submission.<br>
                > As I understand it, the idea is to prevent command
                submission (adding <br>
                > new fences to BOs) while a page table invalidation
                is in progress.<br>
                <br>
                Yes, exactly.<br>
                <br>
                > But do we really need another lock for this?
                Wouldn't the <br>
                > re-validation of userptr BOs (currently calling
                get_user_pages) force <br>
                > synchronization with the ongoing page table
                invalidation through the <br>
                > mmap_sem or other MM locks?<br>
                <br>
                No and yes. We don't hold any other locks while doing
                command submission, but I expect that HMM has its own
                mechanism to prevent that.<br>
                <br>
                Since we don't modify
                amdgpu_mn_lock()/amdgpu_mn_unlock() we are certainly not
                using this mechanism correctly.<br>
                <br>
                Regards,<br>
                Christian.<br>
                _______________________________________________<br>
                amd-gfx mailing list<br>
                <a href="mailto:amd-gfx@lists.freedesktop.org"
                  moz-do-not-send="true">amd-gfx@lists.freedesktop.org</a><br>
                <a
                  href="https://lists.freedesktop.org/mailman/listinfo/amd-gfx"
                  moz-do-not-send="true">https://lists.freedesktop.org/mailman/listinfo/amd-gfx</a><o:p></o:p></p>
            </div>
          </blockquote>
          <p class="MsoNormal"> <o:p></o:p></p>
        </blockquote>
        <p class="MsoNormal"><o:p> </o:p></p>
      </div>
    </blockquote>
    <br>
  </body>
</html>