<html><head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
  </head>
  <body>
    <p><br>
    </p>
    <div class="moz-cite-prefix">On 2021-04-07 6:28 a.m., Christian
      König wrote:<br>
    </div>
    <blockquote type="cite" cite="mid:e37bdceb-cdb2-a826-21cf-8cb88748be08@gmail.com">
      
      Hi Andrey,<br>
      <br>
      <div class="moz-cite-prefix">Am 06.04.21 um 23:22 schrieb Andrey
        Grodzovsky:<br>
      </div>
      <blockquote type="cite" cite="mid:29ffe63b-9049-824f-84fc-a92fdb451e0d@amd.com">
        <p>Hey Christian, Dennis, see below - <br>
        </p>
        <div class="moz-cite-prefix">On 2021-04-06 6:34 a.m., Christian
          König wrote:<br>
        </div>
        <blockquote type="cite" cite="mid:51d7873d-cf35-6be5-79c2-024937c67f6a@amd.com"> Hi
          Andrey,<br>
          <br>
          Well, good question. My job is to watch over the implementation
          and design, and while I always help, I can adjust anybody's
          schedule.<br>
          <br>
          Is the patch to print a warning when the hardware is accessed
          without holding the locks merged yet? If not then that would
          probably be a good starting point.<br>
        </blockquote>
        <p><br>
        </p>
        <p>It's merged into amd-staging-drm-next, and since I work on
          drm-misc-next I will cherry-pick it over there.<br>
        </p>
      </blockquote>
      <br>
      Ok good to know, I haven't tracked that one further.<br>
      <br>
      <blockquote type="cite" cite="mid:29ffe63b-9049-824f-84fc-a92fdb451e0d@amd.com">
        <p> </p>
        <p><br>
        </p>
        <blockquote type="cite" cite="mid:51d7873d-cf35-6be5-79c2-024937c67f6a@amd.com"> <br>
          Then we would need to unify this with the SRCU to make sure
          that we both hold the reset lock and block the hotplug code
          from reusing the MMIO space.<br>
        </blockquote>
        <br>
        <p>In my understanding there is a significant difference between
          handling GPU reset and unplug: the GPU reset case requires any
          HW-accessing code to block, wait for the reset to finish and
          then proceed, while hot-unplug is permanent, so there is
          nothing to wait for and the access should simply be aborted at
          once.</p>
      </blockquote>
      <br>
      Yes, absolutely correct.<br>
      <br>
      <blockquote type="cite" cite="mid:29ffe63b-9049-824f-84fc-a92fdb451e0d@amd.com">
        <p> This is why I think that in any place where we already check
          for device reset we should also add a check for hot-unplug,
          but the handling would be different: for hot-unplug we would
          abort instead of keeping on waiting.</p>
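        <p>To illustrate (just a sketch; amdgpu_hw_access_begin()/_end()
          are made-up names, not existing helpers - the real pieces here
          are drm_dev_enter()/drm_dev_exit() and adev->reset_sem):</p>
        <pre>/* Sketch only: combine the hot-unplug check with the reset lock. */
static int amdgpu_hw_access_begin(struct amdgpu_device *adev, int *idx)
{
	/* Hot-unplug is permanent: don't wait, abort at once. */
	if (!drm_dev_enter(adev_to_drm(adev), idx))
		return -ENODEV;

	/* GPU reset is temporary: block until the reset is finished. */
	down_read(&amp;adev->reset_sem);
	return 0;
}

static void amdgpu_hw_access_end(struct amdgpu_device *adev, int idx)
{
	up_read(&amp;adev->reset_sem);
	drm_dev_exit(idx);
}</pre>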
      </blockquote>
      <br>
      Yes, that's the rough picture in my head as well.<br>
      <br>
      Essentially Daniel's patch of having an
      amdgpu_device_hwaccess_begin()/_end() was the right approach. You
      just can't do it in the top-level IOCTL handler, but rather need
      it somewhere between frontend and backend.<br>
    </blockquote>
    <p><br>
    </p>
    <p>Can you point me to that patch? I can't find it.</p>
    <p><br>
    </p>
    <blockquote type="cite" cite="mid:e37bdceb-cdb2-a826-21cf-8cb88748be08@gmail.com"> <br>
      <blockquote type="cite" cite="mid:29ffe63b-9049-824f-84fc-a92fdb451e0d@amd.com">
        <p>Similar to handling device reset, for unplug we obviously also
          need to stop and block any MMIO accesses once the device is
          unplugged and, as Daniel Vetter mentioned, we have to do it
          before finishing pci_remove (early device fini)
          and not later (when the last device reference is dropped from
          user space), in order to prevent MMIO space we still access
          from being reused by other hot-plugged devices. As in the
          device reset case we need to cancel all delayed works, stop the
          drm scheduler and complete all unfinished fences (both HW and
          scheduler fences). You stated a strong objection to
          force-signalling scheduler fences from GPU reset, quote: <br>
        </p>
        <p>"you can't signal the dma_fence waiting. Waiting for a
          dma_fence also means you wait for the GPU reset to finish.
          When we would signal the dma_fence during the GPU reset then
          we would run into memory corruption because the hardware jobs
          running after the GPU reset would access memory which is
          already freed."<br>
          To my understanding this is the key difference with hot-unplug:
          the device is gone, all those concerns are irrelevant and
          hence we can actually force-signal scheduler fences (setting
          an error on them first) to force completion of any
          waiting clients, such as IOCTLs or async page flips etc.<br>
        </p>
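        <p>Just to illustrate the force-completion part (a sketch of the
          dma_fence side only; the scheduler bookkeeping around it is
          omitted):</p>
        <pre>/* Sketch: force-complete a pending fence once the device is gone.
 * Only valid for hot-unplug, where no hardware can touch the job's
 * memory any more.
 */
static void force_complete_fence(struct dma_fence *fence)
{
	dma_fence_set_error(fence, -ENODEV);	/* record the error first */
	dma_fence_signal(fence);		/* then wake up all waiters */
}</pre>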
      </blockquote>
      <br>
      Yes, absolutely correct. That's what I also mentioned to Daniel.
      When we are able to nuke the device and any memory access it might
      do we can also signal the fences.<br>
      <br>
      <blockquote type="cite" cite="mid:29ffe63b-9049-824f-84fc-a92fdb451e0d@amd.com">
        <p> </p>
        <p>Beyond blocking all delayed works and scheduler threads we
          also need to guarantee that no IOCTL can access MMIO after
          device unplug, or that in-flight IOCTLs are done before we
          finish pci_remove (amdgpu_pci_remove for us).<br>
          For this I suggest we do something like what we worked on with
          Takashi Iwai, the ALSA maintainer, recently, when he helped
          implement PCI BAR move support for snd_hda_intel. Take a
          look at<br>
          <a class="moz-txt-link-freetext" href="https://cgit.freedesktop.org/~agrodzov/linux/commit/?h=yadro/pcie_hotplug/movable_bars_v9.1&amp;id=cbaa324799718e2b828a8c7b5b001dd896748497">https://cgit.freedesktop.org/~agrodzov/linux/commit/?h=yadro/pcie_hotplug/movable_bars_v9.1&amp;id=cbaa324799718e2b828a8c7b5b001dd896748497</a>
          and<br>
          <a class="moz-txt-link-freetext" href="https://cgit.freedesktop.org/~agrodzov/linux/commit/?h=yadro/pcie_hotplug/movable_bars_v9.1&amp;id=e36365d9ab5bbc30bdc221ab4b3437de34492440">https://cgit.freedesktop.org/~agrodzov/linux/commit/?h=yadro/pcie_hotplug/movable_bars_v9.1&amp;id=e36365d9ab5bbc30bdc221ab4b3437de34492440</a><br>
          We had the same issue there: how to prevent MMIO accesses
          while the BARs are migrating. What was done there is that a
          refcount was added to count all IOCTLs in flight; the BAR
          migration handler would block until the refcount dropped to 0
          before proceeding, and any later IOCTL would stop and wait
          while the device is in the migration state. We don't even need
          the wait part here, there is nothing to wait for, we can just
          return -ENODEV in that case.</p>
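        <p>Roughly what that scheme looks like (a sketch with made-up
          names, not the actual snd_hda_intel code):</p>
        <pre>/* Sketch of the in-flight IOCTL counting scheme; all names are made up. */
static atomic_t ioctls_in_flight = ATOMIC_INIT(0);
static bool device_gone;
static DECLARE_WAIT_QUEUE_HEAD(ioctl_drain_wq);

static long guarded_ioctl(struct drm_device *dev, unsigned int cmd, void *data)
{
	long ret;

	atomic_inc(&amp;ioctls_in_flight);
	if (READ_ONCE(device_gone)) {
		ret = -ENODEV;	/* nothing to wait for, just bail out */
		goto out;
	}
	ret = do_the_ioctl(dev, cmd, data);	/* placeholder for the real work */
out:
	if (atomic_dec_and_test(&amp;ioctls_in_flight))
		wake_up(&amp;ioctl_drain_wq);
	return ret;
}

static void drain_ioctls_before_remove(void)
{
	WRITE_ONCE(device_gone, true);
	/* block until every IOCTL that already entered has finished */
	wait_event(ioctl_drain_wq, atomic_read(&amp;ioctls_in_flight) == 0);
}</pre>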
      </blockquote>
      <br>
      This is essentially what the DRM SRCU is doing as well.<br>
    </blockquote>
    <blockquote type="cite" cite="mid:e37bdceb-cdb2-a826-21cf-8cb88748be08@gmail.com"> <br>
      For the hotplug case we could do this at the top level, since we
      can signal the fence and don't need to block memory management.<br>
    </blockquote>
    <p><br>
    </p>
    <p>To make the SRCU 'wait for' all IOCTLs in flight we would need to
      wrap every IOCTL (practically just the drm_ioctl function) with
      drm_dev_enter()/drm_dev_exit() - can we do that? <br>
    </p>
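    <p>Roughly what I have in mind (just a sketch of where the
      enter/exit would sit; drm_ioctl_body() is a made-up name for the
      existing body of the function):</p>
    <pre>/* Sketch: guard the whole generic IOCTL path with drm_dev_enter/exit. */
long drm_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
{
	struct drm_file *file_priv = filp->private_data;
	struct drm_device *dev = file_priv->minor->dev;
	long ret;
	int idx;

	if (!drm_dev_enter(dev, &amp;idx))
		return -ENODEV;	/* device already unplugged */

	ret = drm_ioctl_body(filp, cmd, arg);	/* made-up name for the current body */

	drm_dev_exit(idx);
	return ret;
}</pre>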
    <p><br>
    </p>
    <blockquote type="cite" cite="mid:e37bdceb-cdb2-a826-21cf-8cb88748be08@gmail.com"> <br>
      But I'm not sure, maybe we should handle it the same way as reset
      or maybe we should have it at the top level.<br>
    </blockquote>
    <p><br>
    </p>
    <p>If by top level you mean checking for device unplugged and
      bailing out at the entry of each IOCTL, or right at the start of
      any work_item/timer function we have, then that seems better and
      clearer to me. Once we have flushed all of the ones in flight,
      there is no reason for them to execute any more once the device is
      unplugged.<br>
    </p>
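    <p>For the work_item/timer case the check at the top would look
      something like this (some_work is just a placeholder field name):</p>
    <pre>/* Sketch: bail out of a delayed work handler once the device is unplugged. */
static void amdgpu_some_work_handler(struct work_struct *work)
{
	struct amdgpu_device *adev =
		container_of(work, struct amdgpu_device, some_work.work);
	int idx;

	if (!drm_dev_enter(adev_to_drm(adev), &amp;idx))
		return;	/* device is gone, nothing left to do */

	/* ... the actual work, touching registers etc. ... */

	drm_dev_exit(idx);
}</pre>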
    <p>Andrey<br>
    </p>
    <p> <br>
    </p>
    <blockquote type="cite" cite="mid:e37bdceb-cdb2-a826-21cf-8cb88748be08@gmail.com"> <br>
      Regards,<br>
      Christian.<br>
      <br>
      <blockquote type="cite" cite="mid:29ffe63b-9049-824f-84fc-a92fdb451e0d@amd.com">
        <p>The above approach should allow us to wait for all the IOCTLs
          in flight; together with stopping the scheduler threads and
          cancelling and flushing all in-flight work items and timers, I
          think it should give us a full solution for the hot-unplug
          case of preventing any MMIO accesses after pci_remove, roughly
          along the lines of the sketch below.<br>
          <br>
          Let me know what you think guys.</p>
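        <p>For reference, the teardown ordering I have in mind (purely an
          illustrative sketch, not the actual amdgpu_pci_remove code):</p>
        <pre>/* Sketch of the hot-unplug teardown ordering (illustrative only). */
static void amdgpu_pci_remove(struct pci_dev *pdev)
{
	struct drm_device *dev = pci_get_drvdata(pdev);

	drm_dev_unplug(dev);	/* new IOCTLs fail, SRCU waits for in-flight ones */

	/* stop scheduler threads, cancel/flush delayed works and timers,
	 * force-complete the remaining HW and scheduler fences with -ENODEV
	 */

	drm_dev_put(dev);	/* final teardown when the last reference drops */
}</pre>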
        <p>Andrey</p>
        <p><br>
        </p>
        <blockquote type="cite" cite="mid:51d7873d-cf35-6be5-79c2-024937c67f6a@amd.com"> <br>
          And then testing, testing, testing to see if we have missed
          something.<br>
          <br>
          Christian.<br>
          <br>
          <div class="moz-cite-prefix">Am 05.04.21 um 19:58 schrieb
            Andrey Grodzovsky:<br>
          </div>
          <blockquote type="cite" cite="mid:1e37bb4d-f54d-1b7e-4632-94cfcf749528@amd.com">
            <p>Dennis, Christian, are there any updates on the plan for
              how to move on with this? As you know, I need very similar
              code for upstreaming device hot-unplug. My latest
              solution (<a class="moz-txt-link-freetext" href="https://lists.freedesktop.org/archives/amd-gfx/2021-January/058606.html">https://lists.freedesktop.org/archives/amd-gfx/2021-January/058606.html</a>)
              was not acceptable because of the low-level guards at the
              register accessor level, which hurt performance.
              Basically I need a way to prevent any MMIO write accesses
              from the kernel driver after the device is removed (UMD
              accesses are taken care of by page-faulting a dummy page).
              We are now using the hot-unplug code for the Freemont
              program, so upstreaming has become more of a priority than
              before. This MMIO access issue is currently my main
              blocker for upstreaming. Is there any way I can assist in
              pushing this on?</p>
            <p>Andrey  <br>
            </p>
            <div class="moz-cite-prefix">On 2021-03-18 5:51 a.m.,
              Christian König wrote:<br>
            </div>
            <blockquote type="cite" cite="mid:378fdffb-99b5-2a14-736d-a06f310b040c@amd.com">
              On 18.03.21 at 10:30, Li, Dennis wrote:<br>
              <blockquote type="cite" cite="mid:DM5PR12MB253379E8C89D8A20C8A0245AED699@DM5PR12MB2533.namprd12.prod.outlook.com">
                <meta name="Generator" content="Microsoft Word 15
                  (filtered medium)">
                <!--[if !mso]><style>v\:* {behavior:url(#default#VML);}
o\:* {behavior:url(#default#VML);}
w\:* {behavior:url(#default#VML);}
.shape {behavior:url(#default#VML);}
</style><![endif]-->
                <style>@font-face
        {font-family:"Cambria Math";
        panose-1:2 4 5 3 5 4 6 3 2 4;}@font-face
        {font-family:\7B49 \7EBF ;
        panose-1:2 1 6 0 3 1 1 1 1 1;}@font-face
        {font-family:Calibri;
        panose-1:2 15 5 2 2 2 4 3 2 4;}@font-face
        {font-family:"\@\7B49 \7EBF ";
        panose-1:2 1 6 0 3 1 1 1 1 1;}@font-face
        {font-family:"Segoe UI";
        panose-1:2 11 5 2 4 2 4 2 2 3;}p.MsoNormal, li.MsoNormal, div.MsoNormal
        {margin:0in;
        font-size:11.0pt;
        font-family:"Calibri",sans-serif;}a:link, span.MsoHyperlink
        {mso-style-priority:99;
        color:#0563C1;
        text-decoration:underline;}span.EmailStyle19
        {mso-style-type:personal-reply;
        font-family:"Calibri",sans-serif;
        color:windowtext;}.MsoChpDefault
        {mso-style-type:export-only;
        font-size:10.0pt;}div.WordSection1
        {page:WordSection1;}</style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]-->
                <div class="WordSection1">
                  <p class="MsoNormal">>>> The GPU reset
                    doesn't complete the fences we wait for. It only
                    completes the hardware fences as part of the reset.<o:p></o:p></p>
                  <p class="MsoNormal">>>> So waiting for a
                    fence while holding the reset lock is illegal and
                    needs to be avoided.<o:p></o:p></p>
                  <p class="MsoNormal">I understood your concern. It is
                    more complex for DRM GFX, therefore I abandon adding
                    lock protection for DRM ioctls now. Maybe we can try
                    to add all kernel  dma_fence waiting in a list, and
                    signal all in recovery threads. Do you have same
                    concern for compute cases? </p>
                </div>
              </blockquote>
              <br>
              Yes, compute (KFD) is even harder to handle.<br>
              <br>
              See you can't signal the dma_fence waiting. Waiting for a
              dma_fence also means you wait for the GPU reset to finish.<br>
              <br>
              When we would signal the dma_fence during the GPU reset
              then we would run into memory corruption because the
              hardware jobs running after the GPU reset would access
              memory which is already freed.<br>
              <br>
              <blockquote type="cite" cite="mid:DM5PR12MB253379E8C89D8A20C8A0245AED699@DM5PR12MB2533.namprd12.prod.outlook.com">
                <div class="WordSection1">
                  <p class="MsoNormal"><o:p></o:p></p>
                  <p class="MsoNormal"><o:p> </o:p></p>
                  <p class="MsoNormal">>>> Lockdep also
                    complains about this when it is used correctly. The
                    only reason it doesn't complain here is because you
                    use an atomic+wait_event instead of a locking
                    primitive.<o:p></o:p></p>
                  <p class="MsoNormal">Agree. This approach will escape
                    the monitor of lockdep.  Its goal is to block other
                    threads when GPU recovery thread start. But I
                    couldn’t find a better method to solve this problem.
                    Do you have some suggestion? </p>
                </div>
              </blockquote>
              <br>
              Well, completely abandon those changes here.<br>
              <br>
              What we need to do is to identify where hardware access
              happens and then insert taking the read side of the GPU
              reset lock so that we don't wait for a dma_fence or
              allocate memory, but still protect the hardware from
              concurrent access and reset.<br>
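              <br>
              Roughly (just a sketch):<br>
              <pre>/* Take the read side of the reset lock only around the actual
 * hardware access, never around fence waits or memory allocation.
 */
down_read(&amp;adev->reset_sem);
WREG32(reg, val);	/* whatever MMIO sequence the function does */
up_read(&amp;adev->reset_sem);</pre>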
              <br>
              Regards,<br>
              Christian.<br>
              <br>
              <blockquote type="cite" cite="mid:DM5PR12MB253379E8C89D8A20C8A0245AED699@DM5PR12MB2533.namprd12.prod.outlook.com">
                <div class="WordSection1">
                  <p class="MsoNormal"><o:p></o:p></p>
                  <p class="MsoNormal"><o:p> </o:p></p>
                  <p class="MsoNormal">Best Regards<o:p></o:p></p>
                  <p class="MsoNormal">Dennis Li<o:p></o:p></p>
                  <p class="MsoNormal"><o:p> </o:p></p>
                  <div>
                    <div style="border:none;border-top:solid #E1E1E1
                      1.0pt;padding:3.0pt 0in 0in 0in">
                      <p class="MsoNormal"><b>From:</b> Koenig,
                        Christian <a class="moz-txt-link-rfc2396E" href="mailto:Christian.Koenig@amd.com" moz-do-not-send="true"><Christian.Koenig@amd.com></a>
                        <br>
                        <b>Sent:</b> Thursday, March 18, 2021 4:59 PM<br>
                        <b>To:</b> Li, Dennis <a class="moz-txt-link-rfc2396E" href="mailto:Dennis.Li@amd.com" moz-do-not-send="true"><Dennis.Li@amd.com></a>;
                        <a class="moz-txt-link-abbreviated" href="mailto:amd-gfx@lists.freedesktop.org" moz-do-not-send="true">amd-gfx@lists.freedesktop.org</a>;
                        Deucher, Alexander <a class="moz-txt-link-rfc2396E" href="mailto:Alexander.Deucher@amd.com" moz-do-not-send="true"><Alexander.Deucher@amd.com></a>;
                        Kuehling, Felix <a class="moz-txt-link-rfc2396E" href="mailto:Felix.Kuehling@amd.com" moz-do-not-send="true"><Felix.Kuehling@amd.com></a>;
                        Zhang, Hawking <a class="moz-txt-link-rfc2396E" href="mailto:Hawking.Zhang@amd.com" moz-do-not-send="true"><Hawking.Zhang@amd.com></a><br>
                        <b>Subject:</b> RE: [PATCH 0/4] Refine GPU
                        recovery sequence to enhance its stability<o:p></o:p></p>
                    </div>
                  </div>
                  <p class="MsoNormal"><o:p> </o:p></p>
                  <div>
                    <p class="MsoNormal"><span style="font-size:12.0pt;font-family:"Segoe
                        UI",sans-serif;color:black">Exactly that's
                        what you don't seem to understand.<o:p></o:p></span></p>
                  </div>
                  <div>
                    <p class="MsoNormal"><span style="font-size:12.0pt;font-family:"Segoe
                        UI",sans-serif;color:black"><o:p> </o:p></span></p>
                  </div>
                  <div>
                    <p class="MsoNormal"><span style="font-size:12.0pt;font-family:"Segoe
                        UI",sans-serif;color:black">The GPU reset
                        doesn't complete the fences we wait for. It only
                        completes the hardware fences as part of the
                        reset.<o:p></o:p></span></p>
                  </div>
                  <div>
                    <p class="MsoNormal"><span style="font-size:12.0pt;font-family:"Segoe
                        UI",sans-serif;color:black"><o:p> </o:p></span></p>
                  </div>
                  <div>
                    <p class="MsoNormal"><span style="font-size:12.0pt;font-family:"Segoe
                        UI",sans-serif;color:black">So waiting for
                        a fence while holding the reset lock is illegal
                        and needs to be avoided.<o:p></o:p></span></p>
                  </div>
                  <div>
                    <p class="MsoNormal"><span style="font-size:12.0pt;font-family:"Segoe
                        UI",sans-serif;color:black"><o:p> </o:p></span></p>
                  </div>
                  <div>
                    <p class="MsoNormal"><span style="font-size:12.0pt;font-family:"Segoe
                        UI",sans-serif;color:black">Lockdep also
                        complains about this when it is used correctly.
                        The only reason it doesn't complain here is
                        because you use an atomic+wait_event instead of
                        a locking primitive.<o:p></o:p></span></p>
                  </div>
                  <div>
                    <p class="MsoNormal"><span style="font-size:12.0pt;font-family:"Segoe
                        UI",sans-serif;color:black"><o:p> </o:p></span></p>
                  </div>
                  <div>
                    <p class="MsoNormal"><span style="font-size:12.0pt;font-family:"Segoe
                        UI",sans-serif;color:black">Regards,<o:p></o:p></span></p>
                  </div>
                  <div>
                    <p class="MsoNormal"><span style="font-size:12.0pt;font-family:"Segoe
                        UI",sans-serif;color:black">Christian.<o:p></o:p></span></p>
                  </div>
                  <div>
                    <p class="MsoNormal"><o:p> </o:p></p>
                  </div>
                  <div class="MsoNormal" style="text-align:center" align="center">
                    <hr width="98%" size="2" align="center"> </div>
                  <div id="divRplyFwdMsg">
                    <p class="MsoNormal"><b><span style="color:black">Von:</span></b><span style="color:black"> Li, Dennis <<a href="mailto:Dennis.Li@amd.com" moz-do-not-send="true">Dennis.Li@amd.com</a>><br>
                        <b>Gesendet:</b> Donnerstag, 18. März 2021 09:28<br>
                        <b>To:</b> Koenig, Christian <<a href="mailto:Christian.Koenig@amd.com" moz-do-not-send="true">Christian.Koenig@amd.com</a>>;
                        <a href="mailto:amd-gfx@lists.freedesktop.org" moz-do-not-send="true">amd-gfx@lists.freedesktop.org</a>
                        <<a href="mailto:amd-gfx@lists.freedesktop.org" moz-do-not-send="true">amd-gfx@lists.freedesktop.org</a>>;
                        Deucher, Alexander <<a href="mailto:Alexander.Deucher@amd.com" moz-do-not-send="true">Alexander.Deucher@amd.com</a>>;
                        Kuehling, Felix <<a href="mailto:Felix.Kuehling@amd.com" moz-do-not-send="true">Felix.Kuehling@amd.com</a>>;
                        Zhang, Hawking <<a href="mailto:Hawking.Zhang@amd.com" moz-do-not-send="true">Hawking.Zhang@amd.com</a>><br>
                        <b>Subject:</b> RE: [PATCH 0/4] Refine GPU
                        recovery sequence to enhance its stability</span>
                      <o:p></o:p></p>
                    <div>
                      <p class="MsoNormal"> <o:p></o:p></p>
                    </div>
                  </div>
                  <div>
                    <div>
                      <p class="MsoNormal">>>> Those two steps
                        need to be exchanged or otherwise it is possible
                        that new delayed work items etc are started
                        before the lock is taken.<br>
                        What about adding a check for
                        adev->in_gpu_reset in the work item? If we exchange
                        the two steps, it may introduce a deadlock.
                        For example, if a user thread holds the read lock
                        and is waiting for a fence while the recovery thread
                        tries to take the write lock and then complete the
                        fences, the recovery thread will always be
                        blocked.<o:p></o:p></p>
                    </div>
                    <div>
                      <p class="MsoNormal" style="margin-bottom:12.0pt"><br>
                        Best Regards<br>
                        Dennis Li<br>
                        -----Original Message-----<br>
                        From: Koenig, Christian <<a href="mailto:Christian.Koenig@amd.com" moz-do-not-send="true">Christian.Koenig@amd.com</a>>
                        <br>
                        Sent: Thursday, March 18, 2021 3:54 PM<br>
                        To: Li, Dennis <<a href="mailto:Dennis.Li@amd.com" moz-do-not-send="true">Dennis.Li@amd.com</a>>;
                        <a href="mailto:amd-gfx@lists.freedesktop.org" moz-do-not-send="true">
                          amd-gfx@lists.freedesktop.org</a>; Deucher,
                        Alexander <<a href="mailto:Alexander.Deucher@amd.com" moz-do-not-send="true">Alexander.Deucher@amd.com</a>>;
                        Kuehling, Felix <<a href="mailto:Felix.Kuehling@amd.com" moz-do-not-send="true">Felix.Kuehling@amd.com</a>>;
                        Zhang, Hawking <<a href="mailto:Hawking.Zhang@amd.com" moz-do-not-send="true">Hawking.Zhang@amd.com</a>><br>
                        Subject: Re: [PATCH 0/4] Refine GPU recovery
                        sequence to enhance its stability<br>
                        <br>
                        On 18.03.21 at 08:23, Dennis Li wrote:<br>
                        > We have defined two variables, in_gpu_reset
                        and reset_sem, in the adev object. The atomic
                        variable in_gpu_reset is used to avoid recovery
                        thread reentry and to make lower functions return
                        earlier when recovery starts, but it cannot block
                        other threads while the recovery thread accesses
                        hardware. The r/w semaphore reset_sem is used to
                        solve these synchronization issues between the
                        recovery thread and other threads.<br>
                        ><br>
                        > The original solution locked register
                        access in the lower functions, which introduces
                        the following issues:<br>
                        ><br>
                        > 1) Many lower functions are used both in the
                        recovery thread and elsewhere. Firstly we must
                        harvest all of these functions, and it is easy to
                        miss some. Secondly these functions need to select
                        which lock (read lock or write lock) to take,
                        according to the thread they are running in.
                        If the thread context isn't considered, the
                        added lock will easily introduce a deadlock.
                        Besides that, most of the time developers easily
                        forget to add locks for new functions.<br>
                        ><br>
                        > 2) Performance drops, because the lower
                        functions are called much more frequently.<br>
                        ><br>
                        > 3) False positive lockdep complaints are
                        easily introduced, because the write lock covers a
                        big range in the recovery thread, while low level
                        functions holding the read lock may be protected
                        by other locks in other threads.<br>
                        ><br>
                        > Therefore the new solution tries to add
                        lock protection for the ioctls of kfd. Its goal is
                        that no threads other than the recovery
                        thread or its children (for xgmi) access
                        hardware while doing GPU reset and resume. So the
                        recovery thread is refined as follows:<br>
                        ><br>
                        > Step 0:
                        atomic_cmpxchg(&adev->in_gpu_reset, 0, 1)<br>
                        >     1) if it fails, the system already has a
                        recovery thread running and the current thread
                        exits directly;<br>
                        >     2) if it succeeds, enter the recovery thread;<br>
                        ><br>
                        > Step 1: cancel all delayed works, stop the drm
                        scheduler, complete all unreceived fences and so
                        on. This tries to stop or pause other threads.<br>
                        ><br>
                        > Step 2: call
                        down_write(&adev->reset_sem) to take the
                        write lock, which will block the recovery thread
                        until other threads release their read locks.<br>
                        <br>
                        Those two steps need to be exchanged or
                        otherwise it is possible that new delayed work
                        items etc are started before the lock is taken.<br>
                        <br>
                        Just to make it clear until this is fixed the
                        whole patch set is a NAK.<br>
                        <br>
                        Regards,<br>
                        Christian.<br>
                        <br>
                        ><br>
                        > Step 3: normally only the recovery thread is
                        running and accessing hardware, so it is safe
                        to do the GPU reset now.<br>
                        ><br>
                        > Step 4: do the post GPU reset work, such as
                        calling all IPs' resume functions;<br>
                        ><br>
                        > Step 5: atomically set adev->in_gpu_reset
                        back to 0, wake up other threads and release the
                        write lock. The recovery thread exits normally.<br>
                        ><br>
                        > Other threads call amdgpu_read_lock to
                        synchronize with the recovery thread. If a thread
                        finds that in_gpu_reset is 1, it should release
                        the read lock if it holds one, and then block
                        itself waiting for the recovery-finished event. If
                        a thread successfully holds the read lock and
                        in_gpu_reset is 0, it continues. It will exit
                        normally or be stopped by the recovery thread in
                        step 1.<br>
                        ><br>
                        > Dennis Li (4):<br>
                        >   drm/amdgpu: remove reset lock from low level functions<br>
                        >   drm/amdgpu: refine the GPU recovery sequence<br>
                        >   drm/amdgpu: instead of using down/up_read directly<br>
                        >   drm/amdkfd: add reset lock protection for kfd entry functions<br>
                        ><br>
                        >  drivers/gpu/drm/amd/amdgpu/amdgpu.h | 6 +<br>
                        >  drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c | 14 +-<br>
                        >  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 173 +++++++++++++-----<br>
                        >  .../gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c | 8 -<br>
                        >  drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c | 4 +-<br>
                        >  drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 9 +-<br>
                        >  drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c | 5 +-<br>
                        >  drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c | 5 +-<br>
                        >  drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 172 ++++++++++++++++-<br>
                        >  drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 3 +-<br>
                        >  drivers/gpu/drm/amd/amdkfd/kfd_process.c | 4 +<br>
                        >  .../amd/amdkfd/kfd_process_queue_manager.c | 17 ++<br>
                        >  12 files changed, 345 insertions(+), 75 deletions(-)<br>
                        ></p>
                    </div>
                  </div>
                </div>
              </blockquote>
              <br>
              <br>
              <fieldset class="mimeAttachmentHeader"></fieldset>
              <pre class="moz-quote-pre" wrap="">_______________________________________________
amd-gfx mailing list
<a class="moz-txt-link-abbreviated" href="mailto:amd-gfx@lists.freedesktop.org" moz-do-not-send="true">amd-gfx@lists.freedesktop.org</a>
<a class="moz-txt-link-freetext" href="https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.freedesktop.org%2Fmailman%2Flistinfo%2Famd-gfx&data=04%7C01%7Candrey.grodzovsky%40amd.com%7Cee01a18abd3b4c85742308d8f9aff67b%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637533881445129722%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=gGk00Js7gf1eD5flMUiJuP8Op4O1KjnqplS5X%2B2BpXw%3D&reserved=0" originalsrc="https://lists.freedesktop.org/mailman/listinfo/amd-gfx" shash="e19VDPSJDjl0CDWElWDBmXxl+V+E2vWuyBZTe1sSoHQwveKkSV+1dhpI+CKarwJBVrsKqPUyaAw1S2AapBmfOwNI4lf4zVpldsQJgi0olKBvBNo9tXin2NxO01ANSC5HRnOpOazmGJKcRI2Jm9HZkSU1aWqd2qKTp3LoTzqUqUw=" moz-do-not-send="true">https://lists.freedesktop.org/mailman/listinfo/amd-gfx</a>
</pre>
            </blockquote>
          </blockquote>
          <br>
        </blockquote>
        <br>
        <fieldset class="mimeAttachmentHeader"></fieldset>
        <pre class="moz-quote-pre" wrap="">_______________________________________________
amd-gfx mailing list
<a class="moz-txt-link-abbreviated" href="mailto:amd-gfx@lists.freedesktop.org" moz-do-not-send="true">amd-gfx@lists.freedesktop.org</a>
<a class="moz-txt-link-freetext" href="https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.freedesktop.org%2Fmailman%2Flistinfo%2Famd-gfx&data=04%7C01%7Candrey.grodzovsky%40amd.com%7Cee01a18abd3b4c85742308d8f9aff67b%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637533881445139715%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=nqM8PNaMcfzPwfpoO%2FBf7rwIotKB5yMDqOxSaUOqF0I%3D&reserved=0" originalsrc="https://lists.freedesktop.org/mailman/listinfo/amd-gfx" shash="hI23A+rK2YrPnY4kwR78CmtZO+0+01791iF+WCcrK9n20wilupxopMrSWZ0LQogmWFmpq1HoPjT4iPmWRM/LMgXNNdXLIL2NO74Rd/OmsX0z5tg+C6WYl4T7FkfXi+TK37th72gGbSLCQ4Y+2fIjlenHyOju4x32EkpdvYh0Nm8=" moz-do-not-send="true">https://lists.freedesktop.org/mailman/listinfo/amd-gfx</a>
</pre>
      </blockquote>
      <br>
    </blockquote>
  </body>
</html>