<html data-lt-installed="true">
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
  </head>
  <body spellcheck="false" data-gramm="false">
    <p>Hi Andrey,</p>
    <p>I don't have any XGMI machines here; maybe you can reach out to
      Shaoyun for help.<br>
    </p>
    <div class="moz-cite-prefix">On 2022/1/29 12:57 AM, Grodzovsky,
      Andrey wrote:<br>
    </div>
    <blockquote type="cite"
cite="mid:DM5PR12MB19474AEFB824C4C97DCD7AABEA229@DM5PR12MB1947.namprd12.prod.outlook.com">
      <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
      <div>Just a gentle ping.</div>
      <div><br>
      </div>
      <div>Andrey</div>
      <hr style="display:inline-block;width:98%" tabindex="-1">
      <div id="divRplyFwdMsg" dir="ltr"><font style="font-size:11pt"
          face="Calibri, sans-serif" color="#000000"><b>From:</b>
          Grodzovsky, Andrey<br>
          <b>Sent:</b> 26 January 2022 10:52<br>
          <b>To:</b> Christian König
          <a class="moz-txt-link-rfc2396E" href="mailto:ckoenig.leichtzumerken@gmail.com">&lt;ckoenig.leichtzumerken@gmail.com&gt;</a>; Koenig, Christian
          <a class="moz-txt-link-rfc2396E" href="mailto:Christian.Koenig@amd.com">&lt;Christian.Koenig@amd.com&gt;</a>; Lazar, Lijo
          <a class="moz-txt-link-rfc2396E" href="mailto:Lijo.Lazar@amd.com">&lt;Lijo.Lazar@amd.com&gt;</a>; <a class="moz-txt-link-abbreviated" href="mailto:dri-devel@lists.freedesktop.org">dri-devel@lists.freedesktop.org</a>
          <a class="moz-txt-link-rfc2396E" href="mailto:dri-devel@lists.freedesktop.org">&lt;dri-devel@lists.freedesktop.org&gt;</a>;
          <a class="moz-txt-link-abbreviated" href="mailto:amd-gfx@lists.freedesktop.org">amd-gfx@lists.freedesktop.org</a>
          <a class="moz-txt-link-rfc2396E" href="mailto:amd-gfx@lists.freedesktop.org">&lt;amd-gfx@lists.freedesktop.org&gt;</a>; Chen, JingWen
          <a class="moz-txt-link-rfc2396E" href="mailto:JingWen.Chen2@amd.com">&lt;JingWen.Chen2@amd.com&gt;</a><br>
          <b>Cc:</b> Chen, Horace <a class="moz-txt-link-rfc2396E" href="mailto:Horace.Chen@amd.com">&lt;Horace.Chen@amd.com&gt;</a>; Liu, Monk
          <a class="moz-txt-link-rfc2396E" href="mailto:Monk.Liu@amd.com">&lt;Monk.Liu@amd.com&gt;</a><br>
          <b>Subject:</b> Re: [RFC v2 4/8] drm/amdgpu: Serialize non TDR
          gpu recovery with TDRs</font>
        <div> </div>
      </div>
      <div>
        <p>JingWen - could you maybe give those patches a try on an SRIOV
          XGMI system? If you see issues, maybe you could let me connect
          and debug. The SRIOV XGMI system that Shaoyun kindly arranged
          for me does not load the driver with my drm-misc-next branch,
          even without my patches.<br>
        </p>
        <p>Andrey<br>
        </p>
        <div class="x_moz-cite-prefix">On 2022-01-17 14:21, Andrey
          Grodzovsky wrote:<br>
        </div>
        <blockquote type="cite">
          <p><br>
          </p>
          <div class="x_moz-cite-prefix">On 2022-01-17 2:17 p.m.,
            Christian König wrote:<br>
          </div>
          <blockquote type="cite">On 17.01.22 at 20:14, Andrey
            Grodzovsky wrote:<br>
            <blockquote type="cite">
              <p>Ping on the question</p>
            </blockquote>
            <br>
            Oh, my! That was already more than a week ago and is
            completely swapped out of my head again.<br>
            <br>
            <blockquote type="cite">
              <p>Andrey<br>
              </p>
              <div class="x_moz-cite-prefix">On 2022-01-05 1:11 p.m.,
                Andrey Grodzovsky wrote:<br>
              </div>
              <blockquote type="cite">
                <blockquote type="cite" style="color:#007cff">
                  <blockquote type="cite" style="color:#007cff">Also,
                    what about having the reset_active or in_reset flag
                    in the reset_domain itself?
                    <br>
                  </blockquote>
                  <br>
                  Offhand, that sounds like a good idea. <br>
                </blockquote>
                <br>
                <br>
                What then about the adev->reset_sem semaphore?
                Should we also move this to reset_domain? Both of the
                moves have functional
                <br>
                implications only for the XGMI case, because there will be
                contention over accessing those single-instance
                variables from multiple devices,
                <br>
                while now each device has its own copy. <br>
              </blockquote>
            </blockquote>
            <br>
            Since this is an rw semaphore, that should be unproblematic, I
            think. The only downside could be that the cache line of the
            lock then plays ping-pong between the CPU cores.<br>
            <br>
            <blockquote type="cite">
              <blockquote type="cite"><br>
                What benefit does the centralization into reset_domain
                give? Is it, for example, to prevent one device in a hive
                from accessing another one's
                <br>
                VRAM (shared FB memory) through MMIO while the other one
                goes through reset? <br>
              </blockquote>
            </blockquote>
            <br>
            I think that this is the killer argument for a centralized
            lock, yes.<br>
          </blockquote>
          <p><br>
          </p>
          <p>np, I will add a patch centralizing both the flag and the
            semaphore in the reset domain and resend.</p>
          <p>Andrey</p>
          <p><br>
          </p>
          <blockquote type="cite"><br>
            Christian.<br>
            <br>
            <blockquote type="cite">
              <blockquote type="cite"><br>
                Andrey </blockquote>
            </blockquote>
            <br>
          </blockquote>
        </blockquote>
      </div>
    </blockquote>
  </body>
</html>