<html><head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
  </head>
  <body text="#000000" bgcolor="#FFFFFF">
    <p><br>
    </p>
    <div class="moz-cite-prefix">On 12/4/19 2:09 AM, Ma, Le wrote:<br>
    </div>
    <blockquote type="cite" cite="mid:CH2PR12MB4278F9759EF24F29A85D7D23F65D0@CH2PR12MB4278.namprd12.prod.outlook.com">
      
      <meta name="Generator" content="Microsoft Word 15 (filtered
        medium)">
      <style><!--
/* Font Definitions */
@font-face
        {font-family:"Cambria Math";
        panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
        {font-family:DengXian;
        panose-1:2 1 6 0 3 1 1 1 1 1;}
@font-face
        {font-family:Calibri;
        panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
        {font-family:"Segoe UI Emoji";
        panose-1:2 11 5 2 4 2 4 2 2 3;}
@font-face
        {font-family:"\@DengXian";
        panose-1:2 1 6 0 3 1 1 1 1 1;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
        {margin:0in;
        margin-bottom:.0001pt;
        font-size:11.0pt;
        font-family:"Calibri",sans-serif;
        color:black;}
a:link, span.MsoHyperlink
        {mso-style-priority:99;
        color:#0563C1;
        text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
        {mso-style-priority:99;
        color:#954F72;
        text-decoration:underline;}
p.MsoPlainText, li.MsoPlainText, div.MsoPlainText
        {mso-style-priority:99;
        mso-style-link:"Plain Text Char";
        margin:0in;
        margin-bottom:.0001pt;
        font-size:14.0pt;
        font-family:"Calibri",sans-serif;
        color:black;}
p.msonormal0, li.msonormal0, div.msonormal0
        {mso-style-name:msonormal;
        mso-margin-top-alt:auto;
        margin-right:0in;
        mso-margin-bottom-alt:auto;
        margin-left:0in;
        font-size:11.0pt;
        font-family:"Calibri",sans-serif;
        color:black;}
span.PlainTextChar
        {mso-style-name:"Plain Text Char";
        mso-style-priority:99;
        mso-style-link:"Plain Text";
        font-family:"Calibri",sans-serif;}
p.msipheadera92e061b, li.msipheadera92e061b, div.msipheadera92e061b
        {mso-style-name:msipheadera92e061b;
        mso-margin-top-alt:auto;
        margin-right:0in;
        mso-margin-bottom-alt:auto;
        margin-left:0in;
        font-size:11.0pt;
        font-family:"Calibri",sans-serif;
        color:black;}
span.EmailStyle21
        {mso-style-type:personal;
        font-family:"Calibri",sans-serif;
        color:windowtext;}
span.EmailStyle22
        {mso-style-type:personal;
        font-family:"Calibri",sans-serif;
        color:windowtext;}
span.EmailStyle23
        {mso-style-type:personal;
        font-family:"Calibri",sans-serif;
        color:windowtext;}
span.EmailStyle24
        {mso-style-type:personal-reply;
        font-family:"Calibri",sans-serif;
        color:windowtext;}
.MsoChpDefault
        {mso-style-type:export-only;
        font-size:10.0pt;}
@page WordSection1
        {size:8.5in 11.0in;
        margin:1.0in 1.0in 1.0in 1.0in;}
div.WordSection1
        {page:WordSection1;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]-->
      <div class="WordSection1">
        <p class="msipheadera92e061b" style="margin:0in;margin-bottom:.0001pt"><span style="font-size:10.0pt;font-family:"Arial",sans-serif;color:#0078D7">[AMD
            Official Use Only - Internal Distribution Only]</span><o:p></o:p></p>
        <p class="MsoNormal"><span style="color:windowtext"><o:p> </o:p></span></p>
        <p class="MsoNormal"><span style="font-size:12.0pt;color:windowtext"><o:p> </o:p></span></p>
        <div>
          <div style="border:none;border-top:solid #E1E1E1
            1.0pt;padding:3.0pt 0in 0in 0in">
            <p class="MsoNormal"><b><span style="color:windowtext">From:</span></b><span style="color:windowtext"> Grodzovsky, Andrey
                <a class="moz-txt-link-rfc2396E" href="mailto:Andrey.Grodzovsky@amd.com"><Andrey.Grodzovsky@amd.com></a>
                <br>
                <b>Sent:</b> Wednesday, December 4, 2019 2:44 AM<br>
                <b>To:</b> Ma, Le <a class="moz-txt-link-rfc2396E" href="mailto:Le.Ma@amd.com"><Le.Ma@amd.com></a>;
                <a class="moz-txt-link-abbreviated" href="mailto:amd-gfx@lists.freedesktop.org">amd-gfx@lists.freedesktop.org</a>; Zhou1, Tao
                <a class="moz-txt-link-rfc2396E" href="mailto:Tao.Zhou1@amd.com"><Tao.Zhou1@amd.com></a>; Deucher, Alexander
                <a class="moz-txt-link-rfc2396E" href="mailto:Alexander.Deucher@amd.com"><Alexander.Deucher@amd.com></a>; Li, Dennis
                <a class="moz-txt-link-rfc2396E" href="mailto:Dennis.Li@amd.com"><Dennis.Li@amd.com></a>; Zhang, Hawking
                <a class="moz-txt-link-rfc2396E" href="mailto:Hawking.Zhang@amd.com"><Hawking.Zhang@amd.com></a><br>
                <b>Cc:</b> Chen, Guchun <a class="moz-txt-link-rfc2396E" href="mailto:Guchun.Chen@amd.com"><Guchun.Chen@amd.com></a><br>
                <b>Subject:</b> Re: [PATCH 07/10] drm/amdgpu: add
                concurrent baco reset support for XGMI<o:p></o:p></span></p>
          </div>
        </div>
        <p class="MsoNormal"><o:p> </o:p></p>
        <p style="margin-left:.5in">Thanks Ma, this was very helpful as
          I am still not able to set up an XGMI hive with the latest FW
          and VBIOS.<o:p></o:p></p>
        <p style="margin-left:.5in">I traced the workqueue subsystem
          (full log attached). Specifically, the life cycle of our 2
          work items executing amdgpu_device_xgmi_reset_func is shown
          below.<o:p></o:p></p>
        <p><span style="font-size:12.0pt;color:#203864">[Le]: Thanks
            Andrey for the deep debugging. Your feedback gave me a much
            better understanding of this case. My comments are inline
            below.<o:p></o:p></span></p>
        <p style="margin-left:.5in">You were right to note that they
          both run on the same CPU (32), but they are executed by
          different threads. Also, as you can see from the
          workqueue_execute_start/end timestamps, they actually ran in
          parallel and not one after another, even while being assigned
          to the same CPU. That is because of thread preemption (there
          is at least psp_v11_0_mode1_reset->msleep(500)), which yields
          the CPU and hence allows the second work item to run. I am
          also sure that on a preemptive kernel one reset work item
          would be preempted at some point anyway, letting the other
          run. <o:p></o:p></p>
        <p><span style="font-size:12.0pt;color:#203864">[Le]: Yes, from
            the trace log, the xgmi_reset_func items are assigned to
            different worker threads bound to the same CPU. And you are
            right that CPU preemption happens when msleep is called,
            which yields the CPU and allows the second work item to run.
            That’s a
            great finding</span><span style="font-size:12.0pt;font-family:"Segoe UI
            Emoji",sans-serif;color:#203864">😊</span><span style="font-size:12.0pt;color:#203864">. But it’s not a
            <b>real</b> parallel run to me, because the second work item
            can only run when the first one goes to sleep. I ran an
            experiment here, changing this single msleep to udelay, and
            the second work item then ran only after the first had
            finished, in serial execution.</span></p>
      </div>
    </blockquote>
    <p><br>
    </p>
    <p>I would expect that, in a kernel compiled with preemption
      support, a running thread would be interrupted to let others run
      even when it does not voluntarily yield the CPU, so this is
      strange.</p>
    <p> <br>
    </p>
    <blockquote type="cite" cite="mid:CH2PR12MB4278F9759EF24F29A85D7D23F65D0@CH2PR12MB4278.namprd12.prod.outlook.com">
      <div class="WordSection1">
        <p><span style="font-size:12.0pt;color:#203864"><o:p></o:p></span></p>
        <p style="margin-left:.5in">Now, you had issues with BACO reset,
          while the test I ran on your system was mode1 reset, so I
          assumed that maybe BACO has some non-preemptible busy wait
          which doesn't give the second work item's thread a chance to
          run on that CPU before the first has finished. But from looking
          at the code I see smu_v11_0_baco_enter->msleep(10), so even
          in that case the first reset work item should yield the CPU
          after BACO ENTER is sent to the SMU and let the other reset
          work item do the same for the second card. So how can there
          be serial execution even in this case?<o:p></o:p></p>
        <p><span style="font-size:12.0pt;color:#203864">[Le]: VG20 uses
            the old powerplay framework (</span><span style="color:#203864">smu_v11_0_baco_enter->msleep(10)
            is in the swSMU framework</span><span style="font-size:12.0pt;color:#203864">), so there is no
            msleep and no CPU preemption. BACO reset has two phases,
            Enter and Exit. We expect all the XGMI nodes to enter BACO
            simultaneously, rather than one after another in serial
            execution, and then to exit BACO simultaneously.</span></p>
      </div>
    </blockquote>
    <p><br>
    </p>
    <p>Well, we can always add something like the change below to force
      each XGMI reset work item to let others run before going into BACO
      exit. We can also guarantee that all of the reset work items
      execute BACO ENTER before proceeding to BACO EXIT by using some
      kind of semaphore barrier along the lines of
<a class="moz-txt-link-freetext" href="https://stackoverflow.com/questions/47522174/reusable-barrier-implementation-using-posix-semaphores">https://stackoverflow.com/questions/47522174/reusable-barrier-implementation-using-posix-semaphores</a>.
      This will also solve the #XGMI_NODES > #CPUs use case.<br>
    </p>
    <p><br>
    </p>
    <p>diff --git a/drivers/gpu/drm/amd/amdgpu/soc15.c
      b/drivers/gpu/drm/amd/amdgpu/soc15.c<br>
      index 48649f5..3e91e54 100644<br>
      --- a/drivers/gpu/drm/amd/amdgpu/soc15.c<br>
      +++ b/drivers/gpu/drm/amd/amdgpu/soc15.c<br>
      @@ -531,6 +531,8 @@ static int soc15_asic_baco_reset(struct
      amdgpu_device *adev)<br>
                      if (pp_funcs->set_asic_baco_state(pp_handle,
      1))<br>
                              return -EIO;<br>
       <br>
      +               yield();<br>
      +<br>
                      /* exit BACO state */<br>
                      if (pp_funcs->set_asic_baco_state(pp_handle,
      0))<br>
                              return -EIO;<br>
    </p>
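    <p>The semaphore barrier idea can be sketched in user space as
      follows. This is only an illustration with pthreads and POSIX
      semaphores on Linux, not amdgpu code: the names run_hive,
      node_reset, hive_barrier and MAX_NODES are hypothetical, and the
      counter increments merely stand in for the real BACO ENTER and
      BACO EXIT calls. It shows that no thread reaches the "exit" step
      until every thread has completed the "enter" step, regardless of
      how many CPUs the threads are scheduled on.</p>

```c
#include <assert.h>
#include <pthread.h>
#include <semaphore.h>
#include <stdatomic.h>

#define MAX_NODES 16            /* hypothetical upper bound on hive size */

/* Reusable two-phase barrier built from POSIX semaphores. */
struct barrier {
    int n;                      /* number of participants */
    int count;                  /* arrivals in the current phase */
    sem_t mutex;                /* protects count */
    sem_t turnstile1;           /* opened once everyone has arrived */
    sem_t turnstile2;           /* opened once everyone has passed phase 1 */
};

static struct barrier hive_barrier;
static atomic_int entered;      /* nodes that finished the "enter" step */
static atomic_int violations;   /* nodes that reached "exit" too early */

static void barrier_init(struct barrier *b, int n)
{
    b->n = n;
    b->count = 0;
    sem_init(&b->mutex, 0, 1);
    sem_init(&b->turnstile1, 0, 0);
    sem_init(&b->turnstile2, 0, 0);
}

static void barrier_wait(struct barrier *b)
{
    int i;

    /* Phase 1: the last thread to arrive opens turnstile1 for all. */
    sem_wait(&b->mutex);
    if (++b->count == b->n)
        for (i = 0; i < b->n; i++)
            sem_post(&b->turnstile1);
    sem_post(&b->mutex);
    sem_wait(&b->turnstile1);

    /* Phase 2: the last thread through opens turnstile2, which
     * leaves the barrier ready to be used again. */
    sem_wait(&b->mutex);
    if (--b->count == 0)
        for (i = 0; i < b->n; i++)
            sem_post(&b->turnstile2);
    sem_post(&b->mutex);
    sem_wait(&b->turnstile2);
}

/* One thread per hive node; stands in for one reset work item. */
static void *node_reset(void *arg)
{
    (void)arg;
    atomic_fetch_add(&entered, 1);              /* stand-in for BACO ENTER */
    barrier_wait(&hive_barrier);                /* wait for the whole hive */
    if (atomic_load(&entered) != hive_barrier.n)
        atomic_fetch_add(&violations, 1);       /* would be an early EXIT */
    return NULL;                                /* stand-in for BACO EXIT */
}

/* Returns the number of early exits (0 if the barrier held), or -1
 * if n is out of range. */
int run_hive(int n)
{
    pthread_t threads[MAX_NODES];
    int i, rc;

    if (n < 1 || n > MAX_NODES)
        return -1;

    atomic_store(&entered, 0);
    atomic_store(&violations, 0);
    barrier_init(&hive_barrier, n);

    for (i = 0; i < n; i++) {
        rc = pthread_create(&threads[i], NULL, node_reset, NULL);
        assert(rc == 0);        /* sketch only: no recovery path */
        (void)rc;
    }
    for (i = 0; i < n; i++)
        pthread_join(threads[i], NULL);

    sem_destroy(&hive_barrier.mutex);
    sem_destroy(&hive_barrier.turnstile1);
    sem_destroy(&hive_barrier.turnstile2);
    return atomic_load(&violations);
}
```

    <p>Calling run_hive(4) from a small main() (built with -pthread)
      returns 0, meaning no node reached the exit step before all four
      had entered. A kernel version would need the kernel's own
      synchronization primitives instead of sem_t, and would have to
      handle a node that never arrives, which this sketch does not
      attempt.</p>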
    <p><br>
    </p>
    <blockquote type="cite" cite="mid:CH2PR12MB4278F9759EF24F29A85D7D23F65D0@CH2PR12MB4278.namprd12.prod.outlook.com">
      <div class="WordSection1">
        <p><span style="font-size:12.0pt;color:#203864"><o:p></o:p></span></p>
        <p style="margin-left:.5in">P.S. How does your solution handle
          the case where the XGMI hive is bigger than the number of CPUs
          on the system? Assuming that what you say is correct and there
          is serial execution on the same CPU, if the hive is bigger
          than the number of CPUs you will eventually get back to
          sending reset work to a CPU already executing BACO ENTER (or
          EXIT) for another device, and will get the serialization
          problem anyway.
          <o:p></o:p></p>
        <p><span style="font-size:12.0pt;color:#203864">[Le]: Yeah, I
            also considered the situation where the XGMI hive is bigger
            than the number of CPUs. I think it’s an extreme situation
            that should not exist. However, assuming it does exist, many
            work items scattered across several CPUs will be executed
            faster than if they were all bound to the same CPU, won’t
            they?</span></p>
      </div>
    </blockquote>
    <p><br>
    </p>
    <p>AFAIK it's enough for even a single node in the hive to fail to
      enter the BACO state on time to fail the entire hive reset
      procedure, no?</p>
    <p>Anyway, I see our discussion is blocking your entire patch set. I
      think you can go ahead and commit it your way (I think you got an
      RB from Hawking); I will then look at whether I can implement my
      method and, if it works, will just revert your patch.<br>
    </p>
    <p>Andrey</p>
    <p><br>
    </p>
    <blockquote type="cite" cite="mid:CH2PR12MB4278F9759EF24F29A85D7D23F65D0@CH2PR12MB4278.namprd12.prod.outlook.com">
      <div class="WordSection1">
        <p><span style="font-size:12.0pt;color:#203864"><o:p></o:p></span></p>
        <p>             cat-3002  [032] d... 33153.791829:
          workqueue_queue_work: work struct=00000000e43c1ebb
          function=amdgpu_device_xgmi_reset_func [amdgpu]
          workqueue=0000000080331d91 req_cpu=8192 cpu=32<br>
                       cat-3002  [032] d... 33153.791829:
          workqueue_activate_work: work struct 00000000e43c1ebb<br>
                       cat-3002  [032] dN.. 33153.791831:
          workqueue_queue_work: work struct=00000000e67113aa
          function=amdgpu_device_xgmi_reset_func [amdgpu]
          workqueue=0000000080331d91 req_cpu=8192 cpu=32<br>
                       cat-3002  [032] dN.. 33153.791832:
          workqueue_activate_work: work struct 00000000e67113aa<br>
             kworker/32:1H-551   [032] .... 33153.791834:
          workqueue_execute_start: work struct 00000000e43c1ebb:
          function amdgpu_device_xgmi_reset_func [amdgpu]<br>
             kworker/32:0H-175   [032] .... 33153.792087:
          workqueue_execute_start: work struct 00000000e67113aa:
          function amdgpu_device_xgmi_reset_func [amdgpu]<br>
             kworker/32:1H-551   [032] .... 33154.310948:
          workqueue_execute_end: work struct 00000000e43c1ebb<br>
             kworker/32:0H-175   [032] .... 33154.311043:
          workqueue_execute_end: work struct 00000000e67113aa<o:p></o:p></p>
        <p>Andrey<o:p></o:p></p>
        <p><o:p> </o:p></p>
        <p><o:p> </o:p></p>
        <div>
          <p class="MsoNormal">On 12/3/19 5:06 AM, Ma, Le wrote:<o:p></o:p></p>
        </div>
        <blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">
          <p class="MsoNormal"><span style="color:windowtext"> </span><o:p></o:p></p>
          <p class="MsoNormal"><span style="font-size:12.0pt;color:windowtext">Hi Andrey,</span><o:p></o:p></p>
          <p class="MsoNormal"><span style="font-size:12.0pt;color:windowtext"> </span><o:p></o:p></p>
          <p class="MsoNormal"><span style="font-size:12.0pt;color:windowtext">You can try the
              XGMI system below:</span><o:p></o:p></p>
          <p class="MsoNormal"><span style="font-size:12.0pt;color:windowtext">             
              IP: 10.67.69.53</span><o:p></o:p></p>
          <p class="MsoNormal"><span style="font-size:12.0pt;color:windowtext">             
              U/P: jenkins/0</span><o:p></o:p></p>
          <p class="MsoNormal"><span style="font-size:12.0pt;color:windowtext"> </span><o:p></o:p></p>
          <p class="MsoNormal"><span style="font-size:12.0pt;color:windowtext">The original
              drm-next kernel is installed.</span><o:p></o:p></p>
          <p class="MsoNormal"><span style="font-size:12.0pt;color:windowtext"> </span><o:p></o:p></p>
          <p class="MsoNormal"><span style="font-size:12.0pt;color:windowtext">Regards,</span><o:p></o:p></p>
          <p class="MsoNormal"><span style="font-size:12.0pt;color:windowtext">Ma Le</span><o:p></o:p></p>
          <p class="MsoNormal"><span style="font-size:12.0pt;color:windowtext"> </span><o:p></o:p></p>
          <div>
            <div style="border:none;border-top:solid #E1E1E1
              1.0pt;padding:3.0pt 0in 0in 0in">
              <p class="MsoNormal"><b><span style="color:windowtext">From:</span></b><span style="color:windowtext"> Grodzovsky, Andrey
                  <a href="mailto:Andrey.Grodzovsky@amd.com" moz-do-not-send="true"><Andrey.Grodzovsky@amd.com></a>
                  <br>
                  <b>Sent:</b> Tuesday, December 3, 2019 6:05 AM<br>
                  <b>To:</b> Ma, Le <a href="mailto:Le.Ma@amd.com" moz-do-not-send="true"><Le.Ma@amd.com></a>; <a href="mailto:amd-gfx@lists.freedesktop.org" moz-do-not-send="true">
                    amd-gfx@lists.freedesktop.org</a><br>
                  <b>Cc:</b> Chen, Guchun <a href="mailto:Guchun.Chen@amd.com" moz-do-not-send="true"><Guchun.Chen@amd.com></a>;
                  Zhou1, Tao
                  <a href="mailto:Tao.Zhou1@amd.com" moz-do-not-send="true"><Tao.Zhou1@amd.com></a>;
                  Deucher, Alexander <a href="mailto:Alexander.Deucher@amd.com" moz-do-not-send="true">
                    <Alexander.Deucher@amd.com></a>; Li, Dennis <a href="mailto:Dennis.Li@amd.com" moz-do-not-send="true"><Dennis.Li@amd.com></a>;
                  Zhang, Hawking
                  <a href="mailto:Hawking.Zhang@amd.com" moz-do-not-send="true"><Hawking.Zhang@amd.com></a><br>
                  <b>Subject:</b> Re: [PATCH 07/10] drm/amdgpu: add
                  concurrent baco reset support for XGMI</span><o:p></o:p></p>
            </div>
          </div>
          <p class="MsoNormal"> <o:p></o:p></p>
          <p> <o:p></o:p></p>
          <div>
            <p class="MsoNormal">On 12/2/19 6:42 AM, Ma, Le wrote:<o:p></o:p></p>
          </div>
          <blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">
            <p class="MsoNormal"><span style="color:windowtext"> </span><o:p></o:p></p>
            <p class="MsoNormal"><span style="font-size:12.0pt;color:windowtext"> </span><o:p></o:p></p>
            <p class="MsoNormal"><span style="font-size:12.0pt;color:windowtext"> </span><o:p></o:p></p>
            <div>
              <div style="border:none;border-top:solid #E1E1E1
                1.0pt;padding:3.0pt 0in 0in 0in">
                <p class="MsoNormal"><b><span style="color:windowtext">From:</span></b><span style="color:windowtext"> Grodzovsky, Andrey
                    <a href="mailto:Andrey.Grodzovsky@amd.com" moz-do-not-send="true"><Andrey.Grodzovsky@amd.com></a>
                    <br>
                    <b>Sent:</b> Saturday, November 30, 2019 12:22 AM<br>
                    <b>To:</b> Ma, Le <a href="mailto:Le.Ma@amd.com" moz-do-not-send="true"><Le.Ma@amd.com></a>;
                    <a href="mailto:amd-gfx@lists.freedesktop.org" moz-do-not-send="true">
                      amd-gfx@lists.freedesktop.org</a><br>
                    <b>Cc:</b> Chen, Guchun <a href="mailto:Guchun.Chen@amd.com" moz-do-not-send="true"><Guchun.Chen@amd.com></a>;
                    Zhou1, Tao
                    <a href="mailto:Tao.Zhou1@amd.com" moz-do-not-send="true"><Tao.Zhou1@amd.com></a>;
                    Deucher, Alexander <a href="mailto:Alexander.Deucher@amd.com" moz-do-not-send="true">
                      <Alexander.Deucher@amd.com></a>; Li, Dennis
                    <a href="mailto:Dennis.Li@amd.com" moz-do-not-send="true"><Dennis.Li@amd.com></a>;
                    Zhang, Hawking
                    <a href="mailto:Hawking.Zhang@amd.com" moz-do-not-send="true"><Hawking.Zhang@amd.com></a><br>
                    <b>Subject:</b> Re: [PATCH 07/10] drm/amdgpu: add
                    concurrent baco reset support for XGMI</span><o:p></o:p></p>
              </div>
            </div>
            <p class="MsoNormal"> <o:p></o:p></p>
            <p> <o:p></o:p></p>
            <div>
              <p class="MsoNormal">On 11/28/19 4:00 AM, Ma, Le wrote:<o:p></o:p></p>
            </div>
            <blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">
              <p class="MsoPlainText"> <o:p></o:p></p>
              <p class="MsoPlainText"> <o:p></o:p></p>
              <p class="MsoPlainText">-----Original Message-----<br>
                From: Grodzovsky, Andrey <a href="mailto:Andrey.Grodzovsky@amd.com" moz-do-not-send="true"><Andrey.Grodzovsky@amd.com></a>
                <br>
                Sent: Wednesday, November 27, 2019 11:46 PM<br>
                To: Ma, Le <a href="mailto:Le.Ma@amd.com" moz-do-not-send="true"><Le.Ma@amd.com></a>; <a href="mailto:amd-gfx@lists.freedesktop.org" moz-do-not-send="true">
                  amd-gfx@lists.freedesktop.org</a><br>
                Cc: Chen, Guchun <a href="mailto:Guchun.Chen@amd.com" moz-do-not-send="true"><Guchun.Chen@amd.com></a>;
                Zhou1, Tao
                <a href="mailto:Tao.Zhou1@amd.com" moz-do-not-send="true"><Tao.Zhou1@amd.com></a>;
                Deucher, Alexander <a href="mailto:Alexander.Deucher@amd.com" moz-do-not-send="true">
                  <Alexander.Deucher@amd.com></a>; Li, Dennis <a href="mailto:Dennis.Li@amd.com" moz-do-not-send="true"><Dennis.Li@amd.com></a>;
                Zhang, Hawking
                <a href="mailto:Hawking.Zhang@amd.com" moz-do-not-send="true"><Hawking.Zhang@amd.com></a><br>
                Subject: Re: [PATCH 07/10] drm/amdgpu: add concurrent
                baco reset support for XGMI<o:p></o:p></p>
              <p class="MsoPlainText"> <o:p></o:p></p>
              <p class="MsoPlainText"> <o:p></o:p></p>
              <p class="MsoPlainText">On 11/27/19 4:15 AM, Le Ma wrote:<o:p></o:p></p>
              <p class="MsoPlainText">> Currently each XGMI node
                reset wq does not run in parallel because
                <o:p></o:p></p>
              <p class="MsoPlainText">> same work item bound to same
                cpu runs in sequence. So change to bind
                <o:p></o:p></p>
              <p class="MsoPlainText">> the xgmi_reset_work item to
                different cpus.<o:p></o:p></p>
              <p class="MsoPlainText"> <o:p></o:p></p>
              <p class="MsoPlainText">It's not the same work item, see
                more below<o:p></o:p></p>
              <p class="MsoPlainText"> <o:p></o:p></p>
              <p class="MsoPlainText"> <o:p></o:p></p>
              <p class="MsoPlainText">> <o:p></o:p></p>
              <p class="MsoPlainText">> XGMI requires all nodes enter
                into baco within very close proximity
                <o:p></o:p></p>
              <p class="MsoPlainText">> before any node exit baco. So
                schedule the xgmi_reset_work wq twice
                <o:p></o:p></p>
              <p class="MsoPlainText">> for enter/exit baco
                respectively.<o:p></o:p></p>
              <p class="MsoPlainText">> <o:p></o:p></p>
              <p class="MsoPlainText">> The default reset code path
                and methods do not change for vega20 production:<o:p></o:p></p>
              <p class="MsoPlainText">>    - baco reset without
                xgmi/ras<o:p></o:p></p>
              <p class="MsoPlainText">>    - psp reset with xgmi/ras<o:p></o:p></p>
              <p class="MsoPlainText">> <o:p></o:p></p>
              <p class="MsoPlainText">> To enable baco for XGMI/RAS
                case, both 2 conditions below are needed:<o:p></o:p></p>
              <p class="MsoPlainText">>    - amdgpu_ras_enable=2<o:p></o:p></p>
              <p class="MsoPlainText">>    - baco-supported smu
                firmware<o:p></o:p></p>
              <p class="MsoPlainText">> <o:p></o:p></p>
              <p class="MsoPlainText">> The case that PSP reset and
                baco reset coexist within an XGMI hive is
                <o:p></o:p></p>
              <p class="MsoPlainText">> not in the consideration.<o:p></o:p></p>
              <p class="MsoPlainText">> <o:p></o:p></p>
              <p class="MsoPlainText">> Change-Id:
                I9c08cf90134f940b42e20d2129ff87fba761c532<o:p></o:p></p>
              <p class="MsoPlainText">> Signed-off-by: Le Ma <<a href="mailto:le.ma@amd.com" moz-do-not-send="true"><span style="color:windowtext;text-decoration:none">le.ma@amd.com</span></a>><o:p></o:p></p>
              <p class="MsoPlainText">> ---<o:p></o:p></p>
              <p class="MsoPlainText">>  
                drivers/gpu/drm/amd/amdgpu/amdgpu.h        |  2 +<o:p></o:p></p>
              <p class="MsoPlainText">>  
                drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 78
                ++++++++++++++++++++++++++----<o:p></o:p></p>
              <p class="MsoPlainText">>   2 files changed, 70
                insertions(+), 10 deletions(-)<o:p></o:p></p>
              <p class="MsoPlainText">> <o:p></o:p></p>
              <p class="MsoPlainText">> diff --git
                a/drivers/gpu/drm/amd/amdgpu/amdgpu.h <o:p></o:p></p>
              <p class="MsoPlainText">>
                b/drivers/gpu/drm/amd/amdgpu/amdgpu.h<o:p></o:p></p>
              <p class="MsoPlainText">> index d120fe5..08929e6 100644<o:p></o:p></p>
              <p class="MsoPlainText">> ---
                a/drivers/gpu/drm/amd/amdgpu/amdgpu.h<o:p></o:p></p>
              <p class="MsoPlainText">> +++
                b/drivers/gpu/drm/amd/amdgpu/amdgpu.h<o:p></o:p></p>
              <p class="MsoPlainText">> @@ -998,6 +998,8 @@ struct
                amdgpu_device {<o:p></o:p></p>
              <p class="MsoPlainText">>         
                int                                           pstate;<o:p></o:p></p>
              <p class="MsoPlainText">>          /* enable runtime pm
                on the device */<o:p></o:p></p>
              <p class="MsoPlainText">>         
                bool                            runpm;<o:p></o:p></p>
              <p class="MsoPlainText">> +<o:p></o:p></p>
              <p class="MsoPlainText">> +     
                bool                                        in_baco;<o:p></o:p></p>
              <p class="MsoPlainText">>   };<o:p></o:p></p>
              <p class="MsoPlainText">>   <o:p></o:p></p>
              <p class="MsoPlainText">>   static inline struct
                amdgpu_device *amdgpu_ttm_adev(struct
                <o:p></o:p></p>
              <p class="MsoPlainText">> ttm_bo_device *bdev) diff
                --git <o:p></o:p></p>
              <p class="MsoPlainText">>
                a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c <o:p></o:p></p>
              <p class="MsoPlainText">>
                b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c<o:p></o:p></p>
              <p class="MsoPlainText">> index bd387bb..71abfe9 100644<o:p></o:p></p>
              <p class="MsoPlainText">> ---
                a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c<o:p></o:p></p>
              <p class="MsoPlainText">> +++
                b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c<o:p></o:p></p>
              <p class="MsoPlainText">> @@ -2654,7 +2654,13 @@ static
                void amdgpu_device_xgmi_reset_func(struct work_struct
                *__work)<o:p></o:p></p>
              <p class="MsoPlainText">>          struct amdgpu_device
                *adev =<o:p></o:p></p>
              <p class="MsoPlainText">>                     
                container_of(__work, struct amdgpu_device,
                xgmi_reset_work);<o:p></o:p></p>
              <p class="MsoPlainText">>   <o:p></o:p></p>
              <p class="MsoPlainText">> -      
                adev->asic_reset_res =  amdgpu_asic_reset(adev);<o:p></o:p></p>
              <p class="MsoPlainText">> +      if
                (amdgpu_asic_reset_method(adev) ==
                AMD_RESET_METHOD_BACO)<o:p></o:p></p>
              <p class="MsoPlainText">> +                 
                adev->asic_reset_res = (adev->in_baco == false) ?<o:p></o:p></p>
              <p class="MsoPlainText">>
                +                                         
                amdgpu_device_baco_enter(adev->ddev) :<o:p></o:p></p>
              <p class="MsoPlainText">>
                +                                         
                amdgpu_device_baco_exit(adev->ddev);<o:p></o:p></p>
              <p class="MsoPlainText">> +      else<o:p></o:p></p>
              <p class="MsoPlainText">> +                 
                adev->asic_reset_res = amdgpu_asic_reset(adev);<o:p></o:p></p>
              <p class="MsoPlainText">> +<o:p></o:p></p>
              <p class="MsoPlainText">>          if
                (adev->asic_reset_res)<o:p></o:p></p>
              <p class="MsoPlainText">>                     
                DRM_WARN("ASIC reset failed with error, %d for drm dev,
                %s",<o:p></o:p></p>
              <p class="MsoPlainText">>  
                                                adev->asic_reset_res,
                adev->ddev->unique); @@ -3796,6 +3802,7 @@
                <o:p></o:p></p>
              <p class="MsoPlainText">> static int
                amdgpu_do_asic_reset(struct amdgpu_hive_info *hive,<o:p></o:p></p>
              <p class="MsoPlainText">>          struct amdgpu_device
                *tmp_adev = NULL;<o:p></o:p></p>
              <p class="MsoPlainText">>          bool need_full_reset
                = *need_full_reset_arg, vram_lost = false;<o:p></o:p></p>
              <p class="MsoPlainText">>          int r = 0;<o:p></o:p></p>
              <p class="MsoPlainText">> +      int cpu =
                smp_processor_id();<o:p></o:p></p>
              <p class="MsoPlainText">>   <o:p></o:p></p>
              <p class="MsoPlainText">>          /*<o:p></o:p></p>
              <p class="MsoPlainText">>           * ASIC reset has to
                be done on all HGMI hive nodes ASAP @@
                <o:p></o:p></p>
              <p class="MsoPlainText">> -3803,21 +3810,24 @@ static
                int amdgpu_do_asic_reset(struct amdgpu_hive_info *hive,<o:p></o:p></p>
              <p class="MsoPlainText">>           */<o:p></o:p></p>
              <p class="MsoPlainText">>          if (need_full_reset)
                {<o:p></o:p></p>
              <p class="MsoPlainText">>                     
                list_for_each_entry(tmp_adev, device_list_handle,
                gmc.xgmi.head) {<o:p></o:p></p>
              <p class="MsoPlainText">>
                -                               /* For XGMI run all
                resets in parallel to speed up the process */<o:p></o:p></p>
              <p class="MsoPlainText">>
                +                              /*<o:p></o:p></p>
              <p class="MsoPlainText">>
                +                              * For XGMI run all resets
                in parallel to speed up the<o:p></o:p></p>
              <p class="MsoPlainText">>
                +                              * process by scheduling
                the highpri wq on different<o:p></o:p></p>
              <p class="MsoPlainText">>
                +                              * cpus. For XGMI with
                baco reset, all nodes must enter<o:p></o:p></p>
              <p class="MsoPlainText">>
                +                              * baco within close
                proximity before anyone exit.<o:p></o:p></p>
              <p class="MsoPlainText">>
                +                              */<o:p></o:p></p>
              <p class="MsoPlainText">>  
                                               if
                (tmp_adev->gmc.xgmi.num_physical_nodes > 1) {<o:p></o:p></p>
              <p class="MsoPlainText">>
                -                                           if
                (!queue_work(system_highpri_wq,
                &tmp_adev->xgmi_reset_work))<o:p></o:p></p>
              <p class="MsoPlainText"> <o:p></o:p></p>
              <p class="MsoPlainText"> <o:p></o:p></p>
              <p class="MsoPlainText">Note that
                tmp_adev->xgmi_reset_work (the work item) is per
                device in the XGMI hive, so these are not the same work
                item. I don't see why you need to explicitly queue them
                on different CPUs; they should already run in parallel.<o:p></o:p></p>
              <p class="MsoPlainText"> <o:p></o:p></p>
              <p class="MsoPlainText">Andrey<o:p></o:p></p>
              <p class="MsoPlainText"> <o:p></o:p></p>
              <p class="MsoPlainText"><span style="color:#203864">[Le]:
                  I also don’t understand why the two per-node reset
                  work items, when scheduled on the same CPU, do not run
                  in parallel. But in my experiment, the 2nd work item
                  always ran only after the 1st work item had finished.
                  Based on this result, I changed the code to queue them
                  on different CPUs so that they run in parallel even
                  with more XGMI nodes, because BACO requires all nodes
                  to enter BACO within very close proximity.
                </span><o:p></o:p></p>
              <p class="MsoPlainText"><span style="color:#203864"> </span><o:p></o:p></p>
              <p class="MsoPlainText"><span style="color:#203864">The
                  experiment code is below for your reference. While the
                  card0 worker is running, the card1 worker is not
                  observed to run.</span><o:p></o:p></p>
            </blockquote>
            <p> <o:p></o:p></p>
            <p>The code below only tests that they don't run
              concurrently, but this doesn't mean they don't run on
              different CPUs and threads. I don't have an XGMI setup at
              hand to test this theory, but what if there is some locking
              dependency between them that serializes their execution?
              Can you just add a one-line print inside <span style="color:#203864">
                amdgpu_device_xgmi_reset_func </span>that prints the CPU
              id, thread name/id and card number?<o:p></o:p></p>
            <p>Andrey<o:p></o:p></p>
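            <p>A minimal sketch of the one-line trace requested above,
              assuming it is called at the top of
              amdgpu_device_xgmi_reset_func. The helper name
              xgmi_reset_trace is hypothetical and not part of the
              patch, and the PCI bus number stands in for the card
              number, as in the experiment code:</p>
<pre>
#include &lt;linux/printk.h&gt;   /* pr_info() */
#include &lt;linux/sched.h&gt;    /* current, task_pid_nr() */
#include &lt;linux/smp.h&gt;      /* raw_smp_processor_id() */
#include "amdgpu.h"         /* struct amdgpu_device */

/*
 * Hypothetical helper (assumption, not in the patch): logs which CPU and
 * which worker thread execute the reset work item for a given card.
 */
static void xgmi_reset_trace(struct amdgpu_device *adev)
{
	/* raw_ variant avoids a debug warning if the worker is preemptible */
	pr_info("xgmi reset: cpu %d, task %s/%d, bus 0x%x\n",
		raw_smp_processor_id(), current-&gt;comm,
		task_pid_nr(current), adev-&gt;pdev-&gt;bus-&gt;number);
}
</pre>
            <p>If both cards log the same CPU and the same kworker
              name, the items were serialized on one worker; different
              CPUs with non-overlapping timestamps would instead point
              to a locking dependency.</p>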
            <p><span style="color:#203864">[Le]: I checked: if
                queue_work() is called directly several times, the same
                CPU thread is used, and the per-CPU worker executes the
                items one by one. Our goal here is to make
                xgmi_reset_func run concurrently in the XGMI BACO case.
                That’s why I schedule them on different CPUs, so that
                they run in parallel. I can share the XGMI system with
                you if you’d like to verify more.</span><o:p></o:p></p>
          </blockquote>
          <p> <o:p></o:p></p>
          <p>I tried today to set up an XGMI 2P system to test this but
            wasn't able to load with the XGMI bridge in place (maybe a
            faulty bridge). So yes, maybe leave me your setup with the
            original code, before your changes, so I can try to collect
            some kernel traces that show the CPU id and thread id to
            check this. It's just so weird that system_highpri_wq, which
            is documented to be multi-CPU and multi-threaded, wouldn't
            queue those work items to different CPUs and worker threads.<o:p></o:p></p>
          <p>Andrey<o:p></o:p></p>
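          <p>One possible explanation, offered as an assumption and not
            verified on this hardware: system_highpri_wq is a per-CPU
            (bound) workqueue, so queue_work() normally places each item
            on the submitting CPU, and that CPU's worker pool starts
            another worker only when the running one sleeps. A work item
            that busy-waits, as in the experiment, would then hold off
            the second item. A minimal sketch of an alternative to
            picking CPUs by hand, using the kernel's system_unbound_wq;
            the helper name xgmi_queue_reset_unbound is hypothetical:</p>
<pre>
#include &lt;linux/errno.h&gt;      /* EALREADY */
#include &lt;linux/workqueue.h&gt;  /* queue_work(), system_unbound_wq */
#include "amdgpu.h"           /* struct amdgpu_device */

/*
 * Hypothetical helper (assumption, not in the patch): queue the per-device
 * reset work on the unbound workqueue, whose workers are not tied to the
 * submitting CPU and are not subject to per-CPU concurrency management.
 */
static int xgmi_queue_reset_unbound(struct amdgpu_device *adev)
{
	/* queue_work() returns false if the work item was already pending */
	if (!queue_work(system_unbound_wq, &amp;adev-&gt;xgmi_reset_work))
		return -EALREADY;

	return 0;
}
</pre>
          <p>This gives up the elevated priority of the highpri pool, so
            whether it meets the BACO timing requirement would still
            need to be measured on an XGMI system.</p>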
          <p> <o:p></o:p></p>
          <blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">
            <blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">
              <p class="MsoPlainText"><span style="color:#203864"> </span><o:p></o:p></p>
              <p class="MsoPlainText"><span style="color:#203864">+atomic_t
                  card0_in_baco = ATOMIC_INIT(0);</span><o:p></o:p></p>
              <p class="MsoPlainText"><span style="color:#203864">+atomic_t
                  card1_in_baco = ATOMIC_INIT(0);</span><o:p></o:p></p>
              <p class="MsoPlainText"><span style="color:#203864">+</span><o:p></o:p></p>
              <p class="MsoPlainText"><span style="color:#203864">static
                  void amdgpu_device_xgmi_reset_func(struct work_struct
                  *__work)</span><o:p></o:p></p>
              <p class="MsoPlainText"><span style="color:#203864">{</span><o:p></o:p></p>
              <p class="MsoPlainText"><span style="color:#203864">       
                  struct amdgpu_device *adev =</span><o:p></o:p></p>
              <p class="MsoPlainText"><span style="color:#203864">               
                  container_of(__work, struct amdgpu_device,
                  xgmi_reset_work);</span><o:p></o:p></p>
              <p class="MsoPlainText"><span style="color:#203864"> </span><o:p></o:p></p>
              <p class="MsoPlainText"><span style="color:#203864">+      
                  printk("lema1: card 0x%x goes into reset wq\n",
                  adev->pdev->bus->number);</span><o:p></o:p></p>
              <p class="MsoPlainText"><span style="color:#203864">+      
                  if (adev->pdev->bus->number == 0x7) {</span><o:p></o:p></p>
              <p class="MsoPlainText"><span style="color:#203864">+              
                  atomic_set(&card1_in_baco, 1);</span><o:p></o:p></p>
              <p class="MsoPlainText"><span style="color:#203864">+              
                  printk("lema1: card1 in baco from card1 view\n");</span><o:p></o:p></p>
              <p class="MsoPlainText"><span style="color:#203864">+      
                  }</span><o:p></o:p></p>
              <p class="MsoPlainText"><span style="color:#203864">+</span><o:p></o:p></p>
              <p class="MsoPlainText"><span style="color:#203864">       
                  if (amdgpu_asic_reset_method(adev) ==
                  AMD_RESET_METHOD_BACO)</span><o:p></o:p></p>
              <p class="MsoPlainText"><span style="color:#203864">               adev->asic_reset_res
                  = (adev->in_baco == false) ?</span><o:p></o:p></p>
              <p class="MsoPlainText"><span style="color:#203864">                               
                  amdgpu_device_baco_enter(adev->ddev) :</span><o:p></o:p></p>
              <p class="MsoPlainText"><span style="color:#203864">@@
                  -2664,6 +2673,23 @@ static void
                  amdgpu_device_xgmi_reset_func(struct work_struct
                  *__work)</span><o:p></o:p></p>
              <p class="MsoPlainText"><span style="color:#203864">       
                  if (adev->asic_reset_res)</span><o:p></o:p></p>
              <p class="MsoPlainText"><span style="color:#203864">               
                  DRM_WARN("ASIC reset failed with error, %d for drm
                  dev, %s",</span><o:p></o:p></p>
              <p class="MsoPlainText"><span style="color:#203864">                        
                  adev->asic_reset_res, adev->ddev->unique);</span><o:p></o:p></p>
              <p class="MsoPlainText"><span style="color:#203864">+</span><o:p></o:p></p>
              <p class="MsoPlainText"><span style="color:#203864">+      
                  if (adev->pdev->bus->number == 0x4) {</span><o:p></o:p></p>
              <p class="MsoPlainText"><span style="color:#203864">+              
                  atomic_set(&card0_in_baco, 1);</span><o:p></o:p></p>
              <p class="MsoPlainText"><span style="color:#203864">+       
                         printk("lema1: card0 in baco from card0
                  view\n");</span><o:p></o:p></p>
              <p class="MsoPlainText"><span style="color:#203864">+</span><o:p></o:p></p>
              <p class="MsoPlainText"><span style="color:#203864">+              
                  while (true)</span><o:p></o:p></p>
              <p class="MsoPlainText"><span style="color:#203864">+                      
                  if (!!atomic_read(&card1_in_baco))</span><o:p></o:p></p>
              <p class="MsoPlainText"><span style="color:#203864">+                              
                  break;</span><o:p></o:p></p>
              <p class="MsoPlainText"><span style="color:#203864">+              
                  printk("lema1: card1 in baco from card0 view\n");</span><o:p></o:p></p>
              <p class="MsoPlainText"><span style="color:#203864">+  
                      }</span><o:p></o:p></p>
              <p class="MsoPlainText"><span style="color:#203864">+</span><o:p></o:p></p>
              <p class="MsoPlainText"><span style="color:#203864">+      
                  if (adev->pdev->bus->number == 0x7) {</span><o:p></o:p></p>
              <p class="MsoPlainText"><span style="color:#203864">+              
                  while (true)</span><o:p></o:p></p>
              <p class="MsoPlainText"><span style="color:#203864">+                      
                  if (!!atomic_read(&card0_in_baco))</span><o:p></o:p></p>
              <p class="MsoPlainText"><span style="color:#203864">+                              
                  break;</span><o:p></o:p></p>
              <p class="MsoPlainText"><span style="color:#203864">+              
                  printk("lema1: card0 in baco from card1 view\n");</span><o:p></o:p></p>
              <p class="MsoPlainText"><span style="color:#203864">+      
                  }</span><o:p></o:p></p>
              <p class="MsoPlainText"> <o:p></o:p></p>
              <p class="MsoPlainText">>
                +                                          if
                (!queue_work_on(cpu, system_highpri_wq,<o:p></o:p></p>
              <p class="MsoPlainText">>
                +                                                                 
                   &tmp_adev->xgmi_reset_work))<o:p></o:p></p>
              <p class="MsoPlainText">>  
                                                                       r
                = -EALREADY;<o:p></o:p></p>
              <p class="MsoPlainText">>
                +                                          cpu =
                cpumask_next(cpu, cpu_online_mask);<o:p></o:p></p>
              <p class="MsoPlainText">>  
                                               } else<o:p></o:p></p>
              <p class="MsoPlainText">>  
                                                           r =
                amdgpu_asic_reset(tmp_adev);<o:p></o:p></p>
              <p class="MsoPlainText">> -<o:p></o:p></p>
              <p class="MsoPlainText">>
                -                               if (r) {<o:p></o:p></p>
              <p class="MsoPlainText">>
                -                                          
                DRM_ERROR("ASIC reset failed with error, %d for drm dev,
                %s",<o:p></o:p></p>
              <p class="MsoPlainText">>
                -                                                      
                r, tmp_adev->ddev->unique);<o:p></o:p></p>
              <p class="MsoPlainText">>
                +                              if (r)<o:p></o:p></p>
              <p class="MsoPlainText">>  
                                                           break;<o:p></o:p></p>
              <p class="MsoPlainText">>
                -                               }<o:p></o:p></p>
              <p class="MsoPlainText">>                      }<o:p></o:p></p>
              <p class="MsoPlainText">>   <o:p></o:p></p>
              <p class="MsoPlainText">> -                   /* For
                XGMI wait for all PSP resets to complete before proceed
                */<o:p></o:p></p>
              <p class="MsoPlainText">> +                  /* For
                XGMI wait for all work to complete before proceed */<o:p></o:p></p>
              <p class="MsoPlainText">>                      if (!r)
                {<o:p></o:p></p>
              <p class="MsoPlainText">>  
                                              
                list_for_each_entry(tmp_adev, device_list_handle,<o:p></o:p></p>
              <p class="MsoPlainText">>  
                                                                      
                    gmc.xgmi.head) {<o:p></o:p></p>
              <p class="MsoPlainText">> @@ -3826,11 +3836,59 @@
                static int amdgpu_do_asic_reset(struct amdgpu_hive_info
                *hive,<o:p></o:p></p>
              <p class="MsoPlainText">>  
                                                                       r
                = tmp_adev->asic_reset_res;<o:p></o:p></p>
              <p class="MsoPlainText">>  
                                                                      
                if (r)<o:p></o:p></p>
              <p class="MsoPlainText">>  
                                                                                  
                break;<o:p></o:p></p>
              <p class="MsoPlainText">>
                +                                                     
                if(AMD_RESET_METHOD_BACO ==<o:p></o:p></p>
              <p class="MsoPlainText">>
                +                                                     
                   amdgpu_asic_reset_method(tmp_adev))<o:p></o:p></p>
              <p class="MsoPlainText">>
                +                                                                 
                tmp_adev->in_baco = true;<o:p></o:p></p>
              <p class="MsoPlainText">>  
                                                           }<o:p></o:p></p>
              <p class="MsoPlainText">>  
                                               }<o:p></o:p></p>
              <p class="MsoPlainText">>                      }<o:p></o:p></p>
              <p class="MsoPlainText">> -       }<o:p></o:p></p>
              <p class="MsoPlainText">>   <o:p></o:p></p>
              <p class="MsoPlainText">> +                  /*<o:p></o:p></p>
              <p class="MsoPlainText">> +                  * For XGMI
                with baco reset, need exit baco phase by scheduling<o:p></o:p></p>
              <p class="MsoPlainText">> +                  *
                xgmi_reset_work one more time. PSP reset skips this
                phase.<o:p></o:p></p>
              <p class="MsoPlainText">> +                  * Not
                assume the situation that PSP reset and baco reset<o:p></o:p></p>
              <p class="MsoPlainText">> +                  * coexist
                within an XGMI hive.<o:p></o:p></p>
              <p class="MsoPlainText">> +                  */<o:p></o:p></p>
              <p class="MsoPlainText">> +<o:p></o:p></p>
              <p class="MsoPlainText">> +                  if (!r) {<o:p></o:p></p>
              <p class="MsoPlainText">>
                +                              cpu = smp_processor_id();<o:p></o:p></p>
              <p class="MsoPlainText">>
                +                             
                list_for_each_entry(tmp_adev, device_list_handle,<o:p></o:p></p>
              <p class="MsoPlainText">>
                +                                                     
                    gmc.xgmi.head) {<o:p></o:p></p>
              <p class="MsoPlainText">>
                +                                          if
                (tmp_adev->gmc.xgmi.num_physical_nodes > 1<o:p></o:p></p>
              <p class="MsoPlainText">>
                +                                             
                && AMD_RESET_METHOD_BACO ==<o:p></o:p></p>
              <p class="MsoPlainText">>
                +                                             
                amdgpu_asic_reset_method(tmp_adev)) {<o:p></o:p></p>
              <p class="MsoPlainText">>
                +                                                     
                if (!queue_work_on(cpu,<o:p></o:p></p>
              <p class="MsoPlainText">>
                +                                                                 
                system_highpri_wq,<o:p></o:p></p>
              <p class="MsoPlainText">>
                +                                                     
                            &tmp_adev->xgmi_reset_work))<o:p></o:p></p>
              <p class="MsoPlainText">>
                +                                                                 
                r = -EALREADY;<o:p></o:p></p>
              <p class="MsoPlainText">>
                +                                                     
                if (r)<o:p></o:p></p>
              <p class="MsoPlainText">>
                +                                                                 
                break;<o:p></o:p></p>
              <p class="MsoPlainText">>
                +                                                     
                cpu = cpumask_next(cpu, cpu_online_mask);<o:p></o:p></p>
              <p class="MsoPlainText">>
                +                                          }<o:p></o:p></p>
              <p class="MsoPlainText">>
                +                              }<o:p></o:p></p>
              <p class="MsoPlainText">> +                  }<o:p></o:p></p>
              <p class="MsoPlainText">> +<o:p></o:p></p>
              <p class="MsoPlainText">> +                  if (!r) {<o:p></o:p></p>
              <p class="MsoPlainText">>
                +                             
                list_for_each_entry(tmp_adev, device_list_handle,<o:p></o:p></p>
              <p class="MsoPlainText">>
                +                                                     
                    gmc.xgmi.head) {<o:p></o:p></p>
              <p class="MsoPlainText">>
                +                                          if
                (tmp_adev->gmc.xgmi.num_physical_nodes > 1<o:p></o:p></p>
              <p class="MsoPlainText">>
                +                                             
                && AMD_RESET_METHOD_BACO ==<o:p></o:p></p>
              <p class="MsoPlainText">>
                +                                             
                amdgpu_asic_reset_method(tmp_adev)) {<o:p></o:p></p>
              <p class="MsoPlainText">>
                +                                                     
                flush_work(&tmp_adev->xgmi_reset_work);<o:p></o:p></p>
              <p class="MsoPlainText">>
                +                                                      r
                = tmp_adev->asic_reset_res;<o:p></o:p></p>
              <p class="MsoPlainText">>
                +                                                     
                if (r)<o:p></o:p></p>
              <p class="MsoPlainText">>
                +                                                                 
                break;<o:p></o:p></p>
              <p class="MsoPlainText">>
                +                                                     
                tmp_adev->in_baco = false;<o:p></o:p></p>
              <p class="MsoPlainText">>
                +                                          }<o:p></o:p></p>
              <p class="MsoPlainText">>
                +                              }<o:p></o:p></p>
              <p class="MsoPlainText">> +                  }<o:p></o:p></p>
              <p class="MsoPlainText">> +<o:p></o:p></p>
              <p class="MsoPlainText">> +                  if (r) {<o:p></o:p></p>
              <p class="MsoPlainText">>
                +                              DRM_ERROR("ASIC reset
                failed with error, %d for drm dev, %s",<o:p></o:p></p>
              <p class="MsoPlainText">>
                +                                          r,
                tmp_adev->ddev->unique);<o:p></o:p></p>
              <p class="MsoPlainText">>
                +                              goto end;<o:p></o:p></p>
              <p class="MsoPlainText">> +                  }<o:p></o:p></p>
              <p class="MsoPlainText">> +      }<o:p></o:p></p>
              <p class="MsoPlainText">>   <o:p></o:p></p>
              <p class="MsoPlainText">>         
                list_for_each_entry(tmp_adev, device_list_handle,
                gmc.xgmi.head) {<o:p></o:p></p>
              <p class="MsoPlainText">>                      if
                (need_full_reset) {<o:p></o:p></p>
            </blockquote>
          </blockquote>
        </blockquote>
      </div>
    </blockquote>
  </body>
</html>