<html><head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
  </head>
  <body>
    Hi Andrey,<br>
    <br>
    <div class="moz-cite-prefix">Am 07.04.21 um 21:44 schrieb Andrey
      Grodzovsky:<br>
    </div>
    <blockquote type="cite" cite="mid:1e4b766d-b5c2-e6b5-8be6-9b2fae3abc94@amd.com">
      
      <p><br>
      </p>
      <div class="moz-cite-prefix">On 2021-04-07 6:28 a.m., Christian
        König wrote:<br>
      </div>
      <blockquote type="cite" cite="mid:e37bdceb-cdb2-a826-21cf-8cb88748be08@gmail.com"> Hi
        Andrey,<br>
        <br>
        <div class="moz-cite-prefix">Am 06.04.21 um 23:22 schrieb Andrey
          Grodzovsky:<br>
        </div>
        <blockquote type="cite" cite="mid:29ffe63b-9049-824f-84fc-a92fdb451e0d@amd.com">
<p>Hey Christian, Denis, see below - <br>
          </p>
          <div class="moz-cite-prefix">On 2021-04-06 6:34 a.m.,
            Christian König wrote:<br>
          </div>
          <blockquote type="cite" cite="mid:51d7873d-cf35-6be5-79c2-024937c67f6a@amd.com"> Hi
            Andrey,<br>
            <br>
well, good question. My job is to watch over the
            implementation and design, and while I always help, I can
            adjust anybody's schedule.<br>
            <br>
            Is the patch to print a warning when the hardware is
            accessed without holding the locks merged yet? If not then
            that would probably be a good starting point.<br>
          </blockquote>
          <p><br>
          </p>
<p>It's merged into amd-staging-drm-next, and since I work on
            drm-misc-next I will cherry-pick it over there.<br>
          </p>
        </blockquote>
        <br>
        Ok good to know, I haven't tracked that one further.<br>
        <br>
        <blockquote type="cite" cite="mid:29ffe63b-9049-824f-84fc-a92fdb451e0d@amd.com">
          <p> </p>
          <p><br>
          </p>
          <blockquote type="cite" cite="mid:51d7873d-cf35-6be5-79c2-024937c67f6a@amd.com"> <br>
Then we would need to unify this with the SRCU to make sure
            that we both hold the reset lock and block the
            hotplug code from reusing the MMIO space.<br>
          </blockquote>
          <br>
<p>In my understanding there is a significant difference
            between handling GPU reset and unplug - while the GPU reset
            use case requires any HW-accessing code to block, wait
            for the reset to finish and then proceed, hot-unplug<br>
            is permanent and hence there is no need to wait and proceed;
            we should rather abort at once.</p>
        </blockquote>
        <br>
        Yes, absolutely correct.<br>
        <br>
        <blockquote type="cite" cite="mid:29ffe63b-9049-824f-84fc-a92fdb451e0d@amd.com">
<p>This is why I think that in any place where we already check for
            device reset we should also add a check for hot-unplug, but
            the handling would be different<br>
            in that for hot-unplug we would abort instead of continuing
            to wait.</p>
        </blockquote>
        <br>
        Yes, that's the rough picture in my head as well.<br>
        <br>
Essentially Daniel's patch of having
        amdgpu_device_hwaccess_begin()/_end() was the right approach.
        You just can't do it in the top-level IOCTL handler, but rather
        need it somewhere between frontend and backend.<br>
      </blockquote>
      <p><br>
      </p>
<p>Can you point me to which patch that was? I can't find it.</p>
    </blockquote>
    <br>
What I meant was the approach in patch #3 of this series where he
    replaced the down_read()/up_read() calls with
    amdgpu_read_lock()/amdgpu_read_unlock().<br>
    <br>
    I would just not call it amdgpu_read_lock()/amdgpu_read_unlock(),
    but something more descriptive.<br>
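<br>
Just to illustrate what I mean by more descriptive, a rough sketch
(function names are only placeholders here, not an existing
interface):<br>
<pre>
/* Sketch only: a more descriptive wrapper around the read side of the
 * reset protection, names are placeholders. */
static inline int amdgpu_device_hwaccess_begin(struct amdgpu_device *adev)
{
	down_read(&adev->reset_sem);

	/* Unlike the reset case there is nothing to wait for after an
	 * unplug, so bail out immediately. */
	if (drm_dev_is_unplugged(adev_to_drm(adev))) {
		up_read(&adev->reset_sem);
		return -ENODEV;
	}

	return 0;
}

static inline void amdgpu_device_hwaccess_end(struct amdgpu_device *adev)
{
	up_read(&adev->reset_sem);
}
</pre>
Callers would then just bail out with -ENODEV in the unplug case
instead of waiting like they do for a reset.<br>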
    <br>
    Regards,<br>
    Christian.<br>
    <br>
    <blockquote type="cite" cite="mid:1e4b766d-b5c2-e6b5-8be6-9b2fae3abc94@amd.com">
      <p><br>
      </p>
      <blockquote type="cite" cite="mid:e37bdceb-cdb2-a826-21cf-8cb88748be08@gmail.com"> <br>
        <blockquote type="cite" cite="mid:29ffe63b-9049-824f-84fc-a92fdb451e0d@amd.com">
<p>Similar to handling device reset, for unplug we obviously
            also need to stop and block any MMIO accesses once the device is
            unplugged and, as Daniel Vetter mentioned, we have to do it
            before finishing pci_remove (early device fini)<br>
            and not later (when the last device reference is dropped from
            user space), in order to prevent reuse of the MMIO space we still
            access by other hot-plugged devices. As in the device reset
            case we need to cancel all delayed works, stop the DRM scheduler
            and complete all unfinished fences (both HW and scheduler
            fences). While you stated a strong objection to force-signalling
            scheduler fences from GPU reset, quote:<br>
          </p>
          <p>"you can't signal the dma_fence waiting. Waiting for a
            dma_fence also means you wait for the GPU reset to finish.
            When we would signal the dma_fence during the GPU reset then
            we would run into memory corruption because the hardware
            jobs running after the GPU reset would access memory which
            is already freed."<br>
To my understanding this is a key difference with hot-unplug:
            the device is gone, all those concerns are irrelevant and
            hence we can actually force-signal the scheduler fences (setting
            an error on them first) to force completion of any
            waiting clients such as possibly IOCTLs or async page flips
            etc.<br>
          </p>
        </blockquote>
        <br>
        Yes, absolutely correct. That's what I also mentioned to Daniel.
        When we are able to nuke the device and any memory access it
        might do we can also signal the fences.<br>
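<br>
For the actual signalling that should then not be much more than
something like this per fence (just a sketch using the existing
dma_fence helpers):<br>
<pre>
/* Sketch: force-complete a fence on unplug. Set the error first so
 * that waiters can see the work was never executed, then signal. */
static void force_complete_fence(struct dma_fence *fence)
{
	dma_fence_set_error(fence, -ENODEV);
	dma_fence_signal(fence);
}
</pre>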
        <br>
        <blockquote type="cite" cite="mid:29ffe63b-9049-824f-84fc-a92fdb451e0d@amd.com">
          <p> </p>
<p>Beyond blocking all delayed works and scheduler threads we
            also need to guarantee that no IOCTL can access MMIO post device
            unplug, OR that all in-flight IOCTLs are done before we finish
            pci_remove (amdgpu_pci_remove for us).<br>
            For this I suggest we do something like what we worked on
            with Takashi Iwai, the ALSA maintainer, recently when he
            helped implement PCI BAR move support for snd_hda_intel.
            Take a look at<br>
            <a class="moz-txt-link-freetext" href="https://cgit.freedesktop.org/~agrodzov/linux/commit/?h=yadro/pcie_hotplug/movable_bars_v9.1&id=cbaa324799718e2b828a8c7b5b001dd896748497" moz-do-not-send="true">https://cgit.freedesktop.org/~agrodzov/linux/commit/?h=yadro/pcie_hotplug/movable_bars_v9.1&id=cbaa324799718e2b828a8c7b5b001dd896748497</a>
            and<br>
            <a class="moz-txt-link-freetext" href="https://cgit.freedesktop.org/~agrodzov/linux/commit/?h=yadro/pcie_hotplug/movable_bars_v9.1&id=e36365d9ab5bbc30bdc221ab4b3437de34492440" moz-do-not-send="true">https://cgit.freedesktop.org/~agrodzov/linux/commit/?h=yadro/pcie_hotplug/movable_bars_v9.1&id=e36365d9ab5bbc30bdc221ab4b3437de34492440</a><br>
            We had the same issue there: how to prevent MMIO accesses
            while the BARs are migrating. What was done there is that a
            refcount was added to count all IOCTLs in flight; for any
            in-flight IOCTL the BAR migration handler would<br>
            block until the refcount dropped to 0 before it proceeded,
            and any later IOCTL stops and waits if the device is in
            migration state. We don't even need the wait part here, there is
            nothing to wait for, we just return -ENODEV in this case.</p>
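<br>
A rough sketch of what I mean for our case (the refcount and wait
queue fields are made up here just for illustration):<br>
<pre>
/* Sketch: count IOCTLs in flight so amdgpu_pci_remove() can drain them
 * before the MMIO space goes away. Field names are placeholders. */
static int amdgpu_ioctl_enter(struct amdgpu_device *adev)
{
	atomic_inc(&adev->ioctls_in_flight);

	/* Device is already gone - no waiting like in the BAR move case,
	 * just bail out with -ENODEV. */
	if (drm_dev_is_unplugged(adev_to_drm(adev))) {
		atomic_dec(&adev->ioctls_in_flight);
		wake_up(&adev->ioctl_drain_wq);
		return -ENODEV;
	}
	return 0;
}

static void amdgpu_ioctl_exit(struct amdgpu_device *adev)
{
	if (atomic_dec_and_test(&adev->ioctls_in_flight))
		wake_up(&adev->ioctl_drain_wq);
}

/* Called from amdgpu_pci_remove() before tearing down MMIO. */
static void amdgpu_drain_ioctls(struct amdgpu_device *adev)
{
	wait_event(adev->ioctl_drain_wq,
		   atomic_read(&adev->ioctls_in_flight) == 0);
}
</pre>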
        </blockquote>
        <br>
        This is essentially what the DRM SRCU is doing as well.<br>
      </blockquote>
      <blockquote type="cite" cite="mid:e37bdceb-cdb2-a826-21cf-8cb88748be08@gmail.com"> <br>
For the hotplug case we could do this at the top level since we
        can signal the fence and don't need to block memory management.<br>
      </blockquote>
      <p><br>
      </p>
<p>To make the SRCU 'wait for' all IOCTLs in flight we would need to
        wrap every IOCTL (practically just the drm_ioctl function) with
        drm_dev_enter()/drm_dev_exit() - can we do that?<br>
      </p>
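<p>Something along these lines on top of the existing
        drm_dev_enter()/drm_dev_exit() helpers (just a sketch, not a
        tested patch):<br>
</p>
<pre>
/* Sketch: guard the whole generic IOCTL path with the DRM SRCU so that
 * drm_dev_unplug() synchronizes against every IOCTL in flight. */
static long amdgpu_unplug_safe_ioctl(struct file *filp, unsigned int cmd,
				     unsigned long arg)
{
	struct drm_file *file_priv = filp->private_data;
	struct drm_device *dev = file_priv->minor->dev;
	long ret;
	int idx;

	if (!drm_dev_enter(dev, &idx))
		return -ENODEV; /* device already unplugged */

	ret = drm_ioctl(filp, cmd, arg);

	drm_dev_exit(idx);
	return ret;
}
</pre>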
      <p><br>
      </p>
      <blockquote type="cite" cite="mid:e37bdceb-cdb2-a826-21cf-8cb88748be08@gmail.com"> <br>
But I'm not sure; maybe we should handle it the same way as
        reset, or maybe we should have it at the top level.<br>
      </blockquote>
      <p><br>
      </p>
<p>If by top level you mean checking whether the device is unplugged
        and bailing out right at IOCTL entry or at the start of any
        work_item/timer function we have, then that seems better
        and clearer to me. Once we have flushed all of the ones in flight
        there is no reason for them to execute any more when the device is
        unplugged.<br>
      </p>
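<p>For the work items and timers I am thinking of something as simple
        as this at the top of each handler (sketch only, "some_work" is
        just a placeholder name):<br>
</p>
<pre>
/* Sketch: bail out of a delayed work handler once the device is gone
 * instead of touching MMIO. */
static void amdgpu_some_work_handler(struct work_struct *work)
{
	struct amdgpu_device *adev =
		container_of(work, struct amdgpu_device, some_work.work);
	int idx;

	if (!drm_dev_enter(adev_to_drm(adev), &idx))
		return; /* device unplugged, nothing to do any more */

	/* ... the actual hardware access goes here ... */

	drm_dev_exit(idx);
}
</pre>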
      <p>Andrey<br>
      </p>
      <p> <br>
      </p>
      <blockquote type="cite" cite="mid:e37bdceb-cdb2-a826-21cf-8cb88748be08@gmail.com"> <br>
        Regards,<br>
        Christian.<br>
        <br>
        <blockquote type="cite" cite="mid:29ffe63b-9049-824f-84fc-a92fdb451e0d@amd.com">
<p>The above approach should allow us to wait for all the
            IOCTLs in flight; together with stopping the scheduler threads
            and cancelling and flushing all in-flight work items and
            timers, I think it should give us a full solution for the
            hot-unplug case<br>
            of preventing any MMIO accesses past pci_remove.<br>
            <br>
            Let me know what you think, guys.</p>
          <p>Andrey</p>
          <p><br>
          </p>
          <blockquote type="cite" cite="mid:51d7873d-cf35-6be5-79c2-024937c67f6a@amd.com"> <br>
            And then testing, testing, testing to see if we have missed
            something.<br>
            <br>
            Christian.<br>
            <br>
            <div class="moz-cite-prefix">Am 05.04.21 um 19:58 schrieb
              Andrey Grodzovsky:<br>
            </div>
            <blockquote type="cite" cite="mid:1e37bb4d-f54d-1b7e-4632-94cfcf749528@amd.com">
<p>Denis, Christian, are there any updates on the plan for
                how to move on with this? As you know I need very
                similar code for upstreaming device hot-unplug.
                My latest solution (<a class="moz-txt-link-freetext" href="https://lists.freedesktop.org/archives/amd-gfx/2021-January/058606.html" moz-do-not-send="true">https://lists.freedesktop.org/archives/amd-gfx/2021-January/058606.html</a>)
                was not acceptable because of the low-level guards at the
                register-accessor level, which were hurting performance.
                Basically I need a way to prevent any MMIO write
                accesses from the kernel driver after the device is removed
                (UMD accesses are taken care of by page-faulting to a dummy
                page). We are now using the hot-unplug code for the Freemont
                program, so upstreaming became more of a priority than
                before. This MMIO access issue is currently my main
                blocker for upstreaming. Is there any way I can assist
                in pushing this on?</p>
              <p>Andrey  <br>
              </p>
              <div class="moz-cite-prefix">On 2021-03-18 5:51 a.m.,
                Christian König wrote:<br>
              </div>
              <blockquote type="cite" cite="mid:378fdffb-99b5-2a14-736d-a06f310b040c@amd.com">
On 18.03.21 at 10:30, Li, Dennis wrote:<br>
                <blockquote type="cite" cite="mid:DM5PR12MB253379E8C89D8A20C8A0245AED699@DM5PR12MB2533.namprd12.prod.outlook.com">
                  <meta name="Generator" content="Microsoft Word 15
                    (filtered medium)">
                  <!--[if !mso]><style>v\:* {behavior:url(#default#VML);}
o\:* {behavior:url(#default#VML);}
w\:* {behavior:url(#default#VML);}
.shape {behavior:url(#default#VML);}
</style><![endif]-->
                  <style>@font-face
        {font-family:"Cambria Math";
        panose-1:2 4 5 3 5 4 6 3 2 4;}@font-face
        {font-family:\7B49 \7EBF ;
        panose-1:2 1 6 0 3 1 1 1 1 1;}@font-face
        {font-family:Calibri;
        panose-1:2 15 5 2 2 2 4 3 2 4;}@font-face
        {font-family:"\@\7B49 \7EBF ";
        panose-1:2 1 6 0 3 1 1 1 1 1;}@font-face
        {font-family:"Segoe UI";
        panose-1:2 11 5 2 4 2 4 2 2 3;}p.MsoNormal, li.MsoNormal, div.MsoNormal
        {margin:0in;
        font-size:11.0pt;
        font-family:"Calibri",sans-serif;}a:link, span.MsoHyperlink
        {mso-style-priority:99;
        color:#0563C1;
        text-decoration:underline;}span.EmailStyle19
        {mso-style-type:personal-reply;
        font-family:"Calibri",sans-serif;
        color:windowtext;}.MsoChpDefault
        {mso-style-type:export-only;
        font-size:10.0pt;}div.WordSection1
        {page:WordSection1;}</style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]-->
                  <div class="WordSection1">
                    <p class="MsoNormal">>>> The GPU reset
                      doesn't complete the fences we wait for. It only
                      completes the hardware fences as part of the
                      reset.<o:p></o:p></p>
                    <p class="MsoNormal">>>> So waiting for a
                      fence while holding the reset lock is illegal and
                      needs to be avoided.<o:p></o:p></p>
                    <p class="MsoNormal">I understood your concern. It
                      is more complex for DRM GFX, therefore I abandon
                      adding lock protection for DRM ioctls now. Maybe
                      we can try to add all kernel  dma_fence waiting in
                      a list, and signal all in recovery threads. Do you
                      have same concern for compute cases? </p>
                  </div>
                </blockquote>
                <br>
                Yes, compute (KFD) is even harder to handle.<br>
                <br>
                See you can't signal the dma_fence waiting. Waiting for
                a dma_fence also means you wait for the GPU reset to
                finish.<br>
                <br>
                When we would signal the dma_fence during the GPU reset
                then we would run into memory corruption because the
                hardware jobs running after the GPU reset would access
                memory which is already freed.<br>
                <br>
                <blockquote type="cite" cite="mid:DM5PR12MB253379E8C89D8A20C8A0245AED699@DM5PR12MB2533.namprd12.prod.outlook.com">
                  <div class="WordSection1">
                    <p class="MsoNormal"><o:p></o:p></p>
                    <p class="MsoNormal"><o:p> </o:p></p>
                    <p class="MsoNormal">>>> Lockdep also
                      complains about this when it is used correctly.
                      The only reason it doesn't complain here is
                      because you use an atomic+wait_event instead of a
                      locking primitive.<o:p></o:p></p>
                    <p class="MsoNormal">Agree. This approach will
                      escape the monitor of lockdep.  Its goal is to
                      block other threads when GPU recovery thread
                      start. But I couldn’t find a better method to
                      solve this problem. Do you have some suggestion? </p>
                  </div>
                </blockquote>
                <br>
Well, completely abandon those changes here.<br>
                <br>
What we need to do is identify where hardware access
                happens and then take the read side of the GPU
                reset lock around it, so that we don't wait for a dma_fence or
                allocate memory there, but still protect the hardware from
                concurrent access and reset.<br>
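<br>
So in the end it should look more or less like this around the actual
hardware access, and nothing else inside the critical section (sketch
only, amdgpu_program_registers() is just a placeholder):<br>
<pre>
/* Sketch: only the raw hardware access is under the read side of the
 * reset lock - no memory allocation, no dma_fence wait in here. */
int amdgpu_do_hw_access(struct amdgpu_device *adev)
{
	int r;

	down_read(&adev->reset_sem);
	r = amdgpu_program_registers(adev); /* placeholder for the HW touch */
	up_read(&adev->reset_sem);

	return r;
}
</pre>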
                <br>
                Regards,<br>
                Christian.<br>
                <br>
                <blockquote type="cite" cite="mid:DM5PR12MB253379E8C89D8A20C8A0245AED699@DM5PR12MB2533.namprd12.prod.outlook.com">
                  <div class="WordSection1">
                    <p class="MsoNormal"><o:p></o:p></p>
                    <p class="MsoNormal"><o:p> </o:p></p>
                    <p class="MsoNormal">Best Regards<o:p></o:p></p>
                    <p class="MsoNormal">Dennis Li<o:p></o:p></p>
                    <p class="MsoNormal"><o:p> </o:p></p>
                    <div>
                      <div style="border:none;border-top:solid #E1E1E1
                        1.0pt;padding:3.0pt 0in 0in 0in">
                        <p class="MsoNormal"><b>From:</b> Koenig,
                          Christian <a class="moz-txt-link-rfc2396E" href="mailto:Christian.Koenig@amd.com" moz-do-not-send="true"><Christian.Koenig@amd.com></a>
                          <br>
                          <b>Sent:</b> Thursday, March 18, 2021 4:59 PM<br>
                          <b>To:</b> Li, Dennis <a class="moz-txt-link-rfc2396E" href="mailto:Dennis.Li@amd.com" moz-do-not-send="true"><Dennis.Li@amd.com></a>;
                          <a class="moz-txt-link-abbreviated" href="mailto:amd-gfx@lists.freedesktop.org" moz-do-not-send="true">amd-gfx@lists.freedesktop.org</a>;
                          Deucher, Alexander <a class="moz-txt-link-rfc2396E" href="mailto:Alexander.Deucher@amd.com" moz-do-not-send="true"><Alexander.Deucher@amd.com></a>;
                          Kuehling, Felix <a class="moz-txt-link-rfc2396E" href="mailto:Felix.Kuehling@amd.com" moz-do-not-send="true"><Felix.Kuehling@amd.com></a>;
                          Zhang, Hawking <a class="moz-txt-link-rfc2396E" href="mailto:Hawking.Zhang@amd.com" moz-do-not-send="true"><Hawking.Zhang@amd.com></a><br>
<b>Subject:</b> RE: [PATCH 0/4] Refine GPU
                          recovery sequence to enhance its stability<o:p></o:p></p>
                      </div>
                    </div>
                    <p class="MsoNormal"><o:p> </o:p></p>
                    <div>
                      <p class="MsoNormal"><span style="font-size:12.0pt;font-family:"Segoe
                          UI",sans-serif;color:black">Exactly
                          that's what you don't seem to understand.<o:p></o:p></span></p>
                    </div>
                    <div>
                      <p class="MsoNormal"><span style="font-size:12.0pt;font-family:"Segoe
                          UI",sans-serif;color:black"><o:p> </o:p></span></p>
                    </div>
                    <div>
                      <p class="MsoNormal"><span style="font-size:12.0pt;font-family:"Segoe
                          UI",sans-serif;color:black">The GPU reset
                          doesn't complete the fences we wait for. It
                          only completes the hardware fences as part of
                          the reset.<o:p></o:p></span></p>
                    </div>
                    <div>
                      <p class="MsoNormal"><span style="font-size:12.0pt;font-family:"Segoe
                          UI",sans-serif;color:black"><o:p> </o:p></span></p>
                    </div>
                    <div>
                      <p class="MsoNormal"><span style="font-size:12.0pt;font-family:"Segoe
                          UI",sans-serif;color:black">So waiting
                          for a fence while holding the reset lock is
                          illegal and needs to be avoided.<o:p></o:p></span></p>
                    </div>
                    <div>
                      <p class="MsoNormal"><span style="font-size:12.0pt;font-family:"Segoe
                          UI",sans-serif;color:black"><o:p> </o:p></span></p>
                    </div>
                    <div>
                      <p class="MsoNormal"><span style="font-size:12.0pt;font-family:"Segoe
                          UI",sans-serif;color:black">Lockdep also
                          complains about this when it is used
                          correctly. The only reason it doesn't complain
                          here is because you use an atomic+wait_event
                          instead of a locking primitive.<o:p></o:p></span></p>
                    </div>
                    <div>
                      <p class="MsoNormal"><span style="font-size:12.0pt;font-family:"Segoe
                          UI",sans-serif;color:black"><o:p> </o:p></span></p>
                    </div>
                    <div>
                      <p class="MsoNormal"><span style="font-size:12.0pt;font-family:"Segoe
                          UI",sans-serif;color:black">Regards,<o:p></o:p></span></p>
                    </div>
                    <div>
                      <p class="MsoNormal"><span style="font-size:12.0pt;font-family:"Segoe
                          UI",sans-serif;color:black">Christian.<o:p></o:p></span></p>
                    </div>
                    <div>
                      <p class="MsoNormal"><o:p> </o:p></p>
                    </div>
                    <div class="MsoNormal" style="text-align:center" align="center">
                      <hr width="98%" size="2" align="center"> </div>
                    <div id="divRplyFwdMsg">
                      <p class="MsoNormal"><b><span style="color:black">Von:</span></b><span style="color:black"> Li, Dennis <<a href="mailto:Dennis.Li@amd.com" moz-do-not-send="true">Dennis.Li@amd.com</a>><br>
                          <b>Gesendet:</b> Donnerstag, 18. März 2021
                          09:28<br>
<b>To:</b> Koenig, Christian <<a href="mailto:Christian.Koenig@amd.com" moz-do-not-send="true">Christian.Koenig@amd.com</a>>;
                          <a href="mailto:amd-gfx@lists.freedesktop.org" moz-do-not-send="true">amd-gfx@lists.freedesktop.org</a>
                          <<a href="mailto:amd-gfx@lists.freedesktop.org" moz-do-not-send="true">amd-gfx@lists.freedesktop.org</a>>;
                          Deucher, Alexander <<a href="mailto:Alexander.Deucher@amd.com" moz-do-not-send="true">Alexander.Deucher@amd.com</a>>;
                          Kuehling, Felix <<a href="mailto:Felix.Kuehling@amd.com" moz-do-not-send="true">Felix.Kuehling@amd.com</a>>;
                          Zhang, Hawking <<a href="mailto:Hawking.Zhang@amd.com" moz-do-not-send="true">Hawking.Zhang@amd.com</a>><br>
<b>Subject:</b> RE: [PATCH 0/4] Refine GPU
                          recovery sequence to enhance its stability</span>
                        <o:p></o:p></p>
                      <div>
                        <p class="MsoNormal"> <o:p></o:p></p>
                      </div>
                    </div>
                    <div>
                      <div>
                        <p class="MsoNormal">>>> Those two
                          steps need to be exchanged or otherwise it is
                          possible that new delayed work items etc are
                          started before the lock is taken.<br>
What about adding a check for
                          adev->in_gpu_reset in the work item? If we
                          exchange the two steps, it may introduce a
                          deadlock. For example, the user thread holds
                          the read lock and waits for the fence; if the
                          recovery thread then tries to take the write lock
                          and complete the fences, the recovery
                          thread will be blocked forever.<o:p></o:p></p>
                      </div>
                      <div>
                        <p class="MsoNormal" style="margin-bottom:12.0pt"><br>
                          Best Regards<br>
                          Dennis Li<br>
                          -----Original Message-----<br>
                          From: Koenig, Christian <<a href="mailto:Christian.Koenig@amd.com" moz-do-not-send="true">Christian.Koenig@amd.com</a>>
                          <br>
                          Sent: Thursday, March 18, 2021 3:54 PM<br>
                          To: Li, Dennis <<a href="mailto:Dennis.Li@amd.com" moz-do-not-send="true">Dennis.Li@amd.com</a>>;
                          <a href="mailto:amd-gfx@lists.freedesktop.org" moz-do-not-send="true">
                            amd-gfx@lists.freedesktop.org</a>; Deucher,
                          Alexander <<a href="mailto:Alexander.Deucher@amd.com" moz-do-not-send="true">Alexander.Deucher@amd.com</a>>;
                          Kuehling, Felix <<a href="mailto:Felix.Kuehling@amd.com" moz-do-not-send="true">Felix.Kuehling@amd.com</a>>;
                          Zhang, Hawking <<a href="mailto:Hawking.Zhang@amd.com" moz-do-not-send="true">Hawking.Zhang@amd.com</a>><br>
                          Subject: Re: [PATCH 0/4] Refine GPU recovery
                          sequence to enhance its stability<br>
                          <br>
On 18.03.21 at 08:23, Dennis Li wrote:<br>
> We have defined two variables,
                          in_gpu_reset and reset_sem, in the adev object. The
                          atomic variable in_gpu_reset is used to
                          avoid recovery-thread reentry and to make lower-level
                          functions return earlier when recovery
                          starts, but it cannot block the recovery thread when
                          it accesses hardware. The r/w semaphore
                          reset_sem is used to solve the
                          synchronization issues between the recovery thread
                          and other threads.<br>
                          ><br>
> The original solution locked register
                          access in the lower-level functions, which
                          introduces the following issues:<br>
                          ><br>
> 1) Many lower-level functions are used in both the
                          recovery thread and other threads. Firstly, we must
                          find all of these functions, and it is easy to miss
                          some. Secondly, these functions need to select
                          which lock (read lock or write lock) to take,
                          according to the thread they are running
                          in; if the thread context isn't considered,
                          the added lock will easily introduce a deadlock.
                          Besides that, developers can easily
                          forget to add locks for new functions.<br>
                          ><br>
> 2) Performance drop. The lower-level functions
                          are called much more frequently.<br>
                          ><br>
> 3) It easily introduces false-positive
                          lockdep complaints, because the write lock covers a
                          big range in the recovery thread, while the low-level
                          functions that hold the read lock may also be
                          protected by other locks in other threads.<br>
                          ><br>
> Therefore the new solution tries to
                          add lock protection for the kfd ioctls. Its
                          goal is that no threads except for the
                          recovery thread or its children (for xgmi)
                          access hardware while doing GPU reset and
                          resume. So the recovery thread is refined as
                          follows:<br>
                          ><br>
> Step 0:
                          atomic_cmpxchg(&adev->in_gpu_reset, 0, 1)<br>
                          >     1) If it fails, it means the system already
                          has a recovery thread running, and the current thread
                          exits directly;<br>
                          >     2) if it succeeds, enter the recovery
                          thread;<br>
                          ><br>
> Step 1: cancel all delayed works, stop the drm
                          scheduler, complete all unreceived fences and
                          so on. This tries to stop or pause other threads.<br>
                          ><br>
> Step 2: call
                          down_write(&adev->reset_sem) to take the
                          write lock, which will block the recovery thread
                          until other threads release their read locks.<br>
                          <br>
                          Those two steps need to be exchanged or
                          otherwise it is possible that new delayed work
                          items etc are started before the lock is
                          taken.<br>
                          <br>
Just to make it clear, until this is fixed the
                          whole patch set is a NAK.<br>
                          <br>
                          Regards,<br>
                          Christian.<br>
                          <br>
                          ><br>
> Step 3: normally, only the recovery
                          thread is running and accessing hardware, so it is
                          safe to do the GPU reset now.<br>
                          ><br>
> Step 4: do the post-GPU-reset work, such as
                          calling all IPs' resume functions;<br>
                          ><br>
> Step 5: atomically set adev->in_gpu_reset
                          to 0, wake up other threads and release the write
                          lock. The recovery thread exits normally.<br>
                          ><br>
> Other threads call amdgpu_read_lock
                          to synchronize with the recovery thread. If a thread
                          finds that in_gpu_reset is 1, it should
                          release the read lock if it holds one, and
                          then block itself to wait for the
                          recovery-finished event. If the thread successfully
                          holds the read lock and in_gpu_reset is 0, it
                          continues. It will exit normally or be stopped by the
                          recovery thread in step 1.<br>
                          ><br>
                          > Dennis Li (4):<br>
                          >    drm/amdgpu: remove reset lock from low
                          level functions<br>
                          >    drm/amdgpu: refine the GPU recovery
                          sequence<br>
                          >    drm/amdgpu: instead of using
                          down/up_read directly<br>
                          >    drm/amdkfd: add reset lock protection
                          for kfd entry functions<br>
                          ><br>
                          >  
                          drivers/gpu/drm/amd/amdgpu/amdgpu.h          
                          |   6 +<br>
                          >  
                          drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c  
                          |  14 +-<br>
                          >  
                          drivers/gpu/drm/amd/amdgpu/amdgpu_device.c   
                          | 173 +++++++++++++-----<br>
                          >  
                          .../gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c   
                          |   8 -<br>
                          >  
                          drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c       
                          |   4 +-<br>
                          >  
                          drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c        
                          |   9 +-<br>
                          >  
                          drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c        
                          |   5 +-<br>
                          >  
                          drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c        
                          |   5 +-<br>
                          >  
                          drivers/gpu/drm/amd/amdkfd/kfd_chardev.c     
                          | 172 ++++++++++++++++-<br>
                          >  
                          drivers/gpu/drm/amd/amdkfd/kfd_priv.h        
                          |   3 +-<br>
                          >  
                          drivers/gpu/drm/amd/amdkfd/kfd_process.c     
                          |   4 +<br>
                          >  
                          .../amd/amdkfd/kfd_process_queue_manager.c   
                          |  17 ++<br>
                          >   12 files changed, 345 insertions(+), 75
                          deletions(-)<br>
                          ><o:p></o:p></p>
                      </div>
                    </div>
                  </div>
                </blockquote>
                <br>
                <br>
                <fieldset class="mimeAttachmentHeader"></fieldset>
                <pre class="moz-quote-pre" wrap="">_______________________________________________
amd-gfx mailing list
<a class="moz-txt-link-abbreviated" href="mailto:amd-gfx@lists.freedesktop.org" moz-do-not-send="true">amd-gfx@lists.freedesktop.org</a>
<a class="moz-txt-link-freetext" href="https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.freedesktop.org%2Fmailman%2Flistinfo%2Famd-gfx&data=04%7C01%7Candrey.grodzovsky%40amd.com%7Cee01a18abd3b4c85742308d8f9aff67b%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637533881445129722%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=gGk00Js7gf1eD5flMUiJuP8Op4O1KjnqplS5X%2B2BpXw%3D&reserved=0" originalsrc="https://lists.freedesktop.org/mailman/listinfo/amd-gfx" shash="e19VDPSJDjl0CDWElWDBmXxl+V+E2vWuyBZTe1sSoHQwveKkSV+1dhpI+CKarwJBVrsKqPUyaAw1S2AapBmfOwNI4lf4zVpldsQJgi0olKBvBNo9tXin2NxO01ANSC5HRnOpOazmGJKcRI2Jm9HZkSU1aWqd2qKTp3LoTzqUqUw=" moz-do-not-send="true">https://lists.freedesktop.org/mailman/listinfo/amd-gfx</a>
</pre>
              </blockquote>
            </blockquote>
            <br>
          </blockquote>
          <br>
          <fieldset class="mimeAttachmentHeader"></fieldset>
          <pre class="moz-quote-pre" wrap="">_______________________________________________
amd-gfx mailing list
<a class="moz-txt-link-abbreviated" href="mailto:amd-gfx@lists.freedesktop.org" moz-do-not-send="true">amd-gfx@lists.freedesktop.org</a>
<a class="moz-txt-link-freetext" href="https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.freedesktop.org%2Fmailman%2Flistinfo%2Famd-gfx&data=04%7C01%7Candrey.grodzovsky%40amd.com%7Cee01a18abd3b4c85742308d8f9aff67b%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637533881445139715%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=nqM8PNaMcfzPwfpoO%2FBf7rwIotKB5yMDqOxSaUOqF0I%3D&reserved=0" originalsrc="https://lists.freedesktop.org/mailman/listinfo/amd-gfx" shash="hI23A+rK2YrPnY4kwR78CmtZO+0+01791iF+WCcrK9n20wilupxopMrSWZ0LQogmWFmpq1HoPjT4iPmWRM/LMgXNNdXLIL2NO74Rd/OmsX0z5tg+C6WYl4T7FkfXi+TK37th72gGbSLCQ4Y+2fIjlenHyOju4x32EkpdvYh0Nm8=" moz-do-not-send="true">https://lists.freedesktop.org/mailman/listinfo/amd-gfx</a>
</pre>
        </blockquote>
        <br>
      </blockquote>
    </blockquote>
    <br>
  </body>
</html>