<html><head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
  </head>
  <body text="#000000" bgcolor="#FFFFFF">
    <p><br>
    </p>
    <div class="moz-cite-prefix">On 11/28/19 4:00 AM, Ma, Le wrote:<br>
    </div>
    <blockquote type="cite" cite="mid:MN2PR12MB42859443EA78D08B295AFE0DF6470@MN2PR12MB4285.namprd12.prod.outlook.com">
      
      <meta name="Generator" content="Microsoft Word 15 (filtered
        medium)">
      <style><!--
/* Font Definitions */
@font-face
        {font-family:"Cambria Math";
        panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
        {font-family:Calibri;
        panose-1:2 15 5 2 2 2 4 3 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
        {margin:0in;
        margin-bottom:.0001pt;
        font-size:11.0pt;
        font-family:"Calibri",sans-serif;}
a:link, span.MsoHyperlink
        {mso-style-priority:99;
        color:#0563C1;
        text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
        {mso-style-priority:99;
        color:#954F72;
        text-decoration:underline;}
p.MsoPlainText, li.MsoPlainText, div.MsoPlainText
        {mso-style-priority:99;
        mso-style-link:"Plain Text Char";
        margin:0in;
        margin-bottom:.0001pt;
        font-size:14.0pt;
        font-family:"Calibri",sans-serif;}
span.PlainTextChar
        {mso-style-name:"Plain Text Char";
        mso-style-priority:99;
        mso-style-link:"Plain Text";
        font-family:"Calibri",sans-serif;}
.MsoChpDefault
        {mso-style-type:export-only;
        font-family:"Calibri",sans-serif;}
@page WordSection1
        {size:8.5in 11.0in;
        margin:1.0in 1.0in 1.0in 1.0in;}
div.WordSection1
        {page:WordSection1;}
--></style>
      <div class="WordSection1">
        <p class="MsoPlainText"><o:p> </o:p></p>
        <p class="MsoPlainText"><o:p> </o:p></p>
        <p class="MsoPlainText">-----Original Message-----<br>
          From: Grodzovsky, Andrey <a class="moz-txt-link-rfc2396E" href="mailto:Andrey.Grodzovsky@amd.com"><Andrey.Grodzovsky@amd.com></a> <br>
          Sent: Wednesday, November 27, 2019 11:46 PM<br>
          To: Ma, Le <a class="moz-txt-link-rfc2396E" href="mailto:Le.Ma@amd.com"><Le.Ma@amd.com></a>;
          <a class="moz-txt-link-abbreviated" href="mailto:amd-gfx@lists.freedesktop.org">amd-gfx@lists.freedesktop.org</a><br>
          Cc: Chen, Guchun <a class="moz-txt-link-rfc2396E" href="mailto:Guchun.Chen@amd.com"><Guchun.Chen@amd.com></a>; Zhou1, Tao
          <a class="moz-txt-link-rfc2396E" href="mailto:Tao.Zhou1@amd.com"><Tao.Zhou1@amd.com></a>; Deucher, Alexander
          <a class="moz-txt-link-rfc2396E" href="mailto:Alexander.Deucher@amd.com"><Alexander.Deucher@amd.com></a>; Li, Dennis
          <a class="moz-txt-link-rfc2396E" href="mailto:Dennis.Li@amd.com"><Dennis.Li@amd.com></a>; Zhang, Hawking
          <a class="moz-txt-link-rfc2396E" href="mailto:Hawking.Zhang@amd.com"><Hawking.Zhang@amd.com></a><br>
          Subject: Re: [PATCH 07/10] drm/amdgpu: add concurrent baco
          reset support for XGMI</p>
        <p class="MsoPlainText"><o:p> </o:p></p>
        <p class="MsoPlainText"><o:p> </o:p></p>
        <p class="MsoPlainText">On 11/27/19 4:15 AM, Le Ma wrote:<o:p></o:p></p>
        <p class="MsoPlainText">> Currently each XGMI node reset wq
          does not run in parallel because
          <o:p></o:p></p>
        <p class="MsoPlainText">> same work item bound to same cpu
          runs in sequence. So change to bind
          <o:p></o:p></p>
        <p class="MsoPlainText">> the xgmi_reset_work item to
          different cpus.<o:p></o:p></p>
        <p class="MsoPlainText"><o:p> </o:p></p>
        <p class="MsoPlainText">It's not the same work item, see more
          below<o:p></o:p></p>
        <p class="MsoPlainText"><o:p> </o:p></p>
        <p class="MsoPlainText"><o:p> </o:p></p>
        <p class="MsoPlainText">><o:p> </o:p></p>
        <p class="MsoPlainText">> XGMI requires all nodes enter into
          baco within very close proximity
          <o:p></o:p></p>
        <p class="MsoPlainText">> before any node exit baco. So
          schedule the xgmi_reset_work wq twice
          <o:p></o:p></p>
        <p class="MsoPlainText">> for enter/exit baco respectively.<o:p></o:p></p>
        <p class="MsoPlainText">><o:p> </o:p></p>
        <p class="MsoPlainText">> The default reset code path and
          methods do not change for vega20 production:<o:p></o:p></p>
        <p class="MsoPlainText">>    - baco reset without xgmi/ras<o:p></o:p></p>
        <p class="MsoPlainText">>    - psp reset with xgmi/ras<o:p></o:p></p>
        <p class="MsoPlainText">><o:p> </o:p></p>
        <p class="MsoPlainText">> To enable baco for XGMI/RAS case,
          both 2 conditions below are needed:<o:p></o:p></p>
        <p class="MsoPlainText">>    - amdgpu_ras_enable=2<o:p></o:p></p>
        <p class="MsoPlainText">>    - baco-supported smu firmware<o:p></o:p></p>
        <p class="MsoPlainText">><o:p> </o:p></p>
        <p class="MsoPlainText">> The case that PSP reset and baco
          reset coexist within an XGMI hive is
          <o:p></o:p></p>
        <p class="MsoPlainText">> not in the consideration.<o:p></o:p></p>
        <p class="MsoPlainText">><o:p> </o:p></p>
        <p class="MsoPlainText">> Change-Id:
          I9c08cf90134f940b42e20d2129ff87fba761c532<o:p></o:p></p>
        <p class="MsoPlainText">> Signed-off-by: Le Ma <<a href="mailto:le.ma@amd.com" moz-do-not-send="true"><span style="color:windowtext;text-decoration:none">le.ma@amd.com</span></a>><o:p></o:p></p>
        <p class="MsoPlainText">> ---<o:p></o:p></p>
        <p class="MsoPlainText">>  
          drivers/gpu/drm/amd/amdgpu/amdgpu.h        |  2 +<o:p></o:p></p>
        <p class="MsoPlainText">>  
          drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 78
          ++++++++++++++++++++++++++----<o:p></o:p></p>
        <p class="MsoPlainText">>   2 files changed, 70
          insertions(+), 10 deletions(-)<o:p></o:p></p>
        <p class="MsoPlainText">><o:p> </o:p></p>
        <p class="MsoPlainText">> diff --git
          a/drivers/gpu/drm/amd/amdgpu/amdgpu.h <o:p></o:p></p>
        <p class="MsoPlainText">>
          b/drivers/gpu/drm/amd/amdgpu/amdgpu.h<o:p></o:p></p>
        <p class="MsoPlainText">> index d120fe5..08929e6 100644<o:p></o:p></p>
        <p class="MsoPlainText">> ---
          a/drivers/gpu/drm/amd/amdgpu/amdgpu.h<o:p></o:p></p>
        <p class="MsoPlainText">> +++
          b/drivers/gpu/drm/amd/amdgpu/amdgpu.h<o:p></o:p></p>
        <p class="MsoPlainText">> @@ -998,6 +998,8 @@ struct
          amdgpu_device {<o:p></o:p></p>
        <p class="MsoPlainText">>         
          int                                           pstate;<o:p></o:p></p>
        <p class="MsoPlainText">>          /* enable runtime pm on
          the device */<o:p></o:p></p>
        <p class="MsoPlainText">>         
          bool                            runpm;<o:p></o:p></p>
        <p class="MsoPlainText">> +<o:p></o:p></p>
        <p class="MsoPlainText">> +     
          bool                                        in_baco;<o:p></o:p></p>
        <p class="MsoPlainText">>   };<o:p></o:p></p>
        <p class="MsoPlainText">>   <o:p></o:p></p>
        <p class="MsoPlainText">>   static inline struct
          amdgpu_device *amdgpu_ttm_adev(struct
          <o:p></o:p></p>
        <p class="MsoPlainText">> ttm_bo_device *bdev) diff --git <o:p></o:p></p>
        <p class="MsoPlainText">>
          a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c <o:p></o:p></p>
        <p class="MsoPlainText">>
          b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c<o:p></o:p></p>
        <p class="MsoPlainText">> index bd387bb..71abfe9 100644<o:p></o:p></p>
        <p class="MsoPlainText">> ---
          a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c<o:p></o:p></p>
        <p class="MsoPlainText">> +++
          b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c<o:p></o:p></p>
        <p class="MsoPlainText">> @@ -2654,7 +2654,13 @@ static void
          amdgpu_device_xgmi_reset_func(struct work_struct *__work)<o:p></o:p></p>
        <p class="MsoPlainText">>          struct amdgpu_device *adev
          =<o:p></o:p></p>
        <p class="MsoPlainText">>                     
          container_of(__work, struct amdgpu_device, xgmi_reset_work);<o:p></o:p></p>
        <p class="MsoPlainText">>   <o:p></o:p></p>
        <p class="MsoPlainText">> -       adev->asic_reset_res = 
          amdgpu_asic_reset(adev);<o:p></o:p></p>
        <p class="MsoPlainText">> +      if
          (amdgpu_asic_reset_method(adev) == AMD_RESET_METHOD_BACO)<o:p></o:p></p>
        <p class="MsoPlainText">> +                 
          adev->asic_reset_res = (adev->in_baco == false) ?<o:p></o:p></p>
        <p class="MsoPlainText">> +                             
                      amdgpu_device_baco_enter(adev->ddev) :<o:p></o:p></p>
        <p class="MsoPlainText">> +                             
                      amdgpu_device_baco_exit(adev->ddev);<o:p></o:p></p>
        <p class="MsoPlainText">> +      else<o:p></o:p></p>
        <p class="MsoPlainText">> +                 
          adev->asic_reset_res = amdgpu_asic_reset(adev);<o:p></o:p></p>
        <p class="MsoPlainText">> +<o:p></o:p></p>
        <p class="MsoPlainText">>          if
          (adev->asic_reset_res)<o:p></o:p></p>
        <p class="MsoPlainText">>                      DRM_WARN("ASIC
          reset failed with error, %d for drm dev, %s",<o:p></o:p></p>
        <p class="MsoPlainText">>                                 
           adev->asic_reset_res, adev->ddev->unique); @@
          -3796,6 +3802,7 @@
          <o:p></o:p></p>
        <p class="MsoPlainText">> static int
          amdgpu_do_asic_reset(struct amdgpu_hive_info *hive,<o:p></o:p></p>
        <p class="MsoPlainText">>          struct amdgpu_device
          *tmp_adev = NULL;<o:p></o:p></p>
        <p class="MsoPlainText">>          bool need_full_reset =
          *need_full_reset_arg, vram_lost = false;<o:p></o:p></p>
        <p class="MsoPlainText">>          int r = 0;<o:p></o:p></p>
        <p class="MsoPlainText">> +      int cpu =
          smp_processor_id();<o:p></o:p></p>
        <p class="MsoPlainText">>   <o:p></o:p></p>
        <p class="MsoPlainText">>          /*<o:p></o:p></p>
        <p class="MsoPlainText">>           * ASIC reset has to be
          done on all HGMI hive nodes ASAP @@
          <o:p></o:p></p>
        <p class="MsoPlainText">> -3803,21 +3810,24 @@ static int
          amdgpu_do_asic_reset(struct amdgpu_hive_info *hive,<o:p></o:p></p>
        <p class="MsoPlainText">>           */<o:p></o:p></p>
        <p class="MsoPlainText">>          if (need_full_reset) {<o:p></o:p></p>
        <p class="MsoPlainText">>                     
          list_for_each_entry(tmp_adev, device_list_handle,
          gmc.xgmi.head) {<o:p></o:p></p>
        <p class="MsoPlainText">> -                               /*
          For XGMI run all resets in parallel to speed up the process */<o:p></o:p></p>
        <p class="MsoPlainText">> +                              /*<o:p></o:p></p>
        <p class="MsoPlainText">> +                              *
          For XGMI run all resets in parallel to speed up the<o:p></o:p></p>
        <p class="MsoPlainText">> +                              *
          process by scheduling the highpri wq on different<o:p></o:p></p>
        <p class="MsoPlainText">> +                              *
          cpus. For XGMI with baco reset, all nodes must enter<o:p></o:p></p>
        <p class="MsoPlainText">> +                              *
          baco within close proximity before anyone exit.<o:p></o:p></p>
        <p class="MsoPlainText">> +                              */<o:p></o:p></p>
        <p class="MsoPlainText">>                                  if
          (tmp_adev->gmc.xgmi.num_physical_nodes > 1) {<o:p></o:p></p>
        <p class="MsoPlainText">>
          -                                           if
          (!queue_work(system_highpri_wq,
          &tmp_adev->xgmi_reset_work))<o:p></o:p></p>
        <p class="MsoPlainText"><o:p> </o:p></p>
        <p class="MsoPlainText"><o:p> </o:p></p>
        <p class="MsoPlainText">Note that tmp_adev->xgmi_reset_work
          (the work item) is per device in the XGMI hive, so these are
          not the same work item. I don't see why you need to explicitly
          queue them on different CPUs; they should run in parallel already.<o:p></o:p></p>
        <p class="MsoPlainText"><o:p> </o:p></p>
        <p class="MsoPlainText">Andrey<o:p></o:p></p>
        <p class="MsoPlainText"><o:p> </o:p></p>
        <p class="MsoPlainText"><span style="color:#203864;mso-style-textfill-fill-color:#203864;mso-style-textfill-fill-alpha:100.0%">[Le]:
            I also don't understand why the 2 node reset work items,
            when scheduled on the same cpu, do not run in parallel. But
            in my experiments the 2nd work item always runs after the
            1st work item has finished. Based on this result, I changed
            the code to queue them on different CPUs so that cases with
            more XGMI nodes also run in parallel, because baco requires
            all nodes to enter baco within very close proximity. <o:p></o:p></span></p>
        <p class="MsoPlainText"><span style="color:#203864;mso-style-textfill-fill-color:#203864;mso-style-textfill-fill-alpha:100.0%"><o:p> </o:p></span></p>
        <p class="MsoPlainText"><span style="color:#203864;mso-style-textfill-fill-color:#203864;mso-style-textfill-fill-alpha:100.0%">The
            experiment code follows for your reference. While the
            card0 worker is running, the card1 worker is not observed to run.</span></p>
      </div>
    </blockquote>
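    <p>The patch picks a CPU for each node by repeatedly calling
      cpumask_next(cpu, cpu_online_mask). A minimal user-space sketch of
      that selection follows, assuming a 64-bit bitmask stands in for the
      online mask. The names next_online and next_online_wrap are
      hypothetical. The first mimics the kernel contract that
      cpumask_next() returns a value greater than or equal to the number
      of CPU ids when no further CPU is set; the second shows one way to
      wrap around, which the patch as posted does not do. Whether the
      patch needs this is not discussed in the thread.</p>

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for cpumask_next(): returns the first online CPU
 * strictly greater than `cpu`, or `nr_cpus` if there is none.
 * Passing cpu == -1 starts the search at CPU 0.
 */
unsigned int next_online(int cpu, uint64_t online, unsigned int nr_cpus)
{
    unsigned int i;

    assert(nr_cpus <= 64);
    for (i = (unsigned int)(cpu + 1); i < nr_cpus; i++)
        if (online & (UINT64_C(1) << i))
            return i;
    return nr_cpus; /* no further online CPU */
}

/*
 * Round-robin variant: when the search runs off the end of the mask,
 * restart from the first online CPU instead of returning nr_cpus.
 */
unsigned int next_online_wrap(unsigned int cpu, uint64_t online,
                              unsigned int nr_cpus)
{
    unsigned int next = next_online((int)cpu, online, nr_cpus);

    if (next >= nr_cpus)
        next = next_online(-1, online, nr_cpus);
    return next;
}
```

    <p>With more nodes in the hive than online CPUs after the starting
      one, the non-wrapping form eventually yields an out-of-range CPU
      number, so a wrapping step like the one sketched here may be worth
      considering.</p>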
    <p><br>
    </p>
    <p>The code below only tests that they don't run concurrently,
      but this doesn't mean they don't run on different CPUs and
      threads. I don't have an XGMI setup at hand to test this theory,
      but what if there is some locking dependency between them that
      serializes their execution? Can you just add a one-line print
      inside <span style="color:#203864;mso-style-textfill-fill-color:#203864;mso-style-textfill-fill-alpha:100.0%">amdgpu_device_xgmi_reset_func
      </span>that prints the CPU id, thread name/id and card number?</p>
    <p>Andrey</p>
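    <p>A user-space sketch of the kind of one-line diagnostic requested
      above, assuming Linux. It is not driver code: the helper name
      format_worker_id is hypothetical, and inside the kernel the
      analogous values would presumably come from smp_processor_id(),
      current->comm and current->pid rather than from system calls.</p>

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* for syscall() under strict -std modes */
#endif
#include <assert.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/syscall.h>

/*
 * Hypothetical helper: writes "card 0x<N>: cpu <C>, tid <T>" into buf,
 * describing where the calling thread is running right now.
 * Returns the snprintf() result, or -1 if the CPU could not be queried.
 */
int format_worker_id(char *buf, size_t len, unsigned int card)
{
    unsigned int cpu = 0;
    long tid;

    assert(buf != NULL && len > 0);

    /* Raw system calls keep the sketch independent of the glibc version. */
    if (syscall(SYS_getcpu, &cpu, NULL, NULL) != 0)
        return -1;
    tid = syscall(SYS_gettid);

    return snprintf(buf, len, "card 0x%x: cpu %u, tid %ld", card, cpu, tid);
}
```

    <p>Calling such a helper at the top of each worker and printing the
      result would show whether the two work items land on different
      CPUs and threads, independently of whether they overlap in time.</p>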
    <p><br>
    </p>
    <blockquote type="cite" cite="mid:MN2PR12MB42859443EA78D08B295AFE0DF6470@MN2PR12MB4285.namprd12.prod.outlook.com">
      <div class="WordSection1">
        <p class="MsoPlainText"><span style="color:#203864;mso-style-textfill-fill-color:#203864;mso-style-textfill-fill-alpha:100.0%"><o:p></o:p></span></p>
        <p class="MsoPlainText"><span style="color:#203864;mso-style-textfill-fill-color:#203864;mso-style-textfill-fill-alpha:100.0%"><o:p> </o:p></span></p>
        <p class="MsoPlainText"><span style="color:#203864;mso-style-textfill-fill-color:#203864;mso-style-textfill-fill-alpha:100.0%">+atomic_t
            card0_in_baco = ATOMIC_INIT(0);<o:p></o:p></span></p>
        <p class="MsoPlainText"><span style="color:#203864;mso-style-textfill-fill-color:#203864;mso-style-textfill-fill-alpha:100.0%">+atomic_t
            card1_in_baco = ATOMIC_INIT(0);<o:p></o:p></span></p>
        <p class="MsoPlainText"><span style="color:#203864;mso-style-textfill-fill-color:#203864;mso-style-textfill-fill-alpha:100.0%">+<o:p></o:p></span></p>
        <p class="MsoPlainText"><span style="color:#203864;mso-style-textfill-fill-color:#203864;mso-style-textfill-fill-alpha:100.0%">static
            void amdgpu_device_xgmi_reset_func(struct work_struct
            *__work)<o:p></o:p></span></p>
        <p class="MsoPlainText"><span style="color:#203864;mso-style-textfill-fill-color:#203864;mso-style-textfill-fill-alpha:100.0%">{<o:p></o:p></span></p>
        <p class="MsoPlainText"><span style="color:#203864;mso-style-textfill-fill-color:#203864;mso-style-textfill-fill-alpha:100.0%">       
            struct amdgpu_device *adev =<o:p></o:p></span></p>
        <p class="MsoPlainText"><span style="color:#203864;mso-style-textfill-fill-color:#203864;mso-style-textfill-fill-alpha:100.0%">               
            container_of(__work, struct amdgpu_device, xgmi_reset_work);<o:p></o:p></span></p>
        <p class="MsoPlainText"><span style="color:#203864;mso-style-textfill-fill-color:#203864;mso-style-textfill-fill-alpha:100.0%"><o:p> </o:p></span></p>
        <p class="MsoPlainText"><span style="color:#203864;mso-style-textfill-fill-color:#203864;mso-style-textfill-fill-alpha:100.0%">+      
            printk("lema1: card 0x%x goes into reset wq\n",
            adev->pdev->bus->number);<o:p></o:p></span></p>
        <p class="MsoPlainText"><span style="color:#203864;mso-style-textfill-fill-color:#203864;mso-style-textfill-fill-alpha:100.0%">+      
            if (adev->pdev->bus->number == 0x7) {<o:p></o:p></span></p>
        <p class="MsoPlainText"><span style="color:#203864;mso-style-textfill-fill-color:#203864;mso-style-textfill-fill-alpha:100.0%">+              
            atomic_set(&card1_in_baco, 1);<o:p></o:p></span></p>
        <p class="MsoPlainText"><span style="color:#203864;mso-style-textfill-fill-color:#203864;mso-style-textfill-fill-alpha:100.0%">+              
            printk("lema1: card1 in baco from card1 view\n");<o:p></o:p></span></p>
        <p class="MsoPlainText"><span style="color:#203864;mso-style-textfill-fill-color:#203864;mso-style-textfill-fill-alpha:100.0%">+      
            }<o:p></o:p></span></p>
        <p class="MsoPlainText"><span style="color:#203864;mso-style-textfill-fill-color:#203864;mso-style-textfill-fill-alpha:100.0%">+<o:p></o:p></span></p>
        <p class="MsoPlainText"><span style="color:#203864;mso-style-textfill-fill-color:#203864;mso-style-textfill-fill-alpha:100.0%">       
            if (amdgpu_asic_reset_method(adev) == AMD_RESET_METHOD_BACO)<o:p></o:p></span></p>
        <p class="MsoPlainText"><span style="color:#203864;mso-style-textfill-fill-color:#203864;mso-style-textfill-fill-alpha:100.0%">               adev->asic_reset_res
            = (adev->in_baco == false) ?<o:p></o:p></span></p>
        <p class="MsoPlainText"><span style="color:#203864;mso-style-textfill-fill-color:#203864;mso-style-textfill-fill-alpha:100.0%">                               
            amdgpu_device_baco_enter(adev->ddev) :<o:p></o:p></span></p>
        <p class="MsoPlainText"><span style="color:#203864;mso-style-textfill-fill-color:#203864;mso-style-textfill-fill-alpha:100.0%">@@
            -2664,6 +2673,23 @@ static void
            amdgpu_device_xgmi_reset_func(struct work_struct *__work)<o:p></o:p></span></p>
        <p class="MsoPlainText"><span style="color:#203864;mso-style-textfill-fill-color:#203864;mso-style-textfill-fill-alpha:100.0%">       
            if (adev->asic_reset_res)<o:p></o:p></span></p>
        <p class="MsoPlainText"><span style="color:#203864;mso-style-textfill-fill-color:#203864;mso-style-textfill-fill-alpha:100.0%">               
            DRM_WARN("ASIC reset failed with error, %d for drm dev, %s",<o:p></o:p></span></p>
        <p class="MsoPlainText"><span style="color:#203864;mso-style-textfill-fill-color:#203864;mso-style-textfill-fill-alpha:100.0%">                        
            adev->asic_reset_res, adev->ddev->unique);<o:p></o:p></span></p>
        <p class="MsoPlainText"><span style="color:#203864;mso-style-textfill-fill-color:#203864;mso-style-textfill-fill-alpha:100.0%">+<o:p></o:p></span></p>
        <p class="MsoPlainText"><span style="color:#203864;mso-style-textfill-fill-color:#203864;mso-style-textfill-fill-alpha:100.0%">+      
            if (adev->pdev->bus->number == 0x4) {<o:p></o:p></span></p>
        <p class="MsoPlainText"><span style="color:#203864;mso-style-textfill-fill-color:#203864;mso-style-textfill-fill-alpha:100.0%">+              
            atomic_set(&card0_in_baco, 1);<o:p></o:p></span></p>
        <p class="MsoPlainText"><span style="color:#203864;mso-style-textfill-fill-color:#203864;mso-style-textfill-fill-alpha:100.0%">+       
                   printk("lema1: card0 in baco from card0 view\n");<o:p></o:p></span></p>
        <p class="MsoPlainText"><span style="color:#203864;mso-style-textfill-fill-color:#203864;mso-style-textfill-fill-alpha:100.0%">+<o:p></o:p></span></p>
        <p class="MsoPlainText"><span style="color:#203864;mso-style-textfill-fill-color:#203864;mso-style-textfill-fill-alpha:100.0%">+              
            while (true)<o:p></o:p></span></p>
        <p class="MsoPlainText"><span style="color:#203864;mso-style-textfill-fill-color:#203864;mso-style-textfill-fill-alpha:100.0%">+                      
            if (!!atomic_read(&card1_in_baco))<o:p></o:p></span></p>
        <p class="MsoPlainText"><span style="color:#203864;mso-style-textfill-fill-color:#203864;mso-style-textfill-fill-alpha:100.0%">+                              
            break;<o:p></o:p></span></p>
        <p class="MsoPlainText"><span style="color:#203864;mso-style-textfill-fill-color:#203864;mso-style-textfill-fill-alpha:100.0%">+              
            printk("lema1: card1 in baco from card0 view\n");<o:p></o:p></span></p>
        <p class="MsoPlainText"><span style="color:#203864;mso-style-textfill-fill-color:#203864;mso-style-textfill-fill-alpha:100.0%">+  
                }<o:p></o:p></span></p>
        <p class="MsoPlainText"><span style="color:#203864;mso-style-textfill-fill-color:#203864;mso-style-textfill-fill-alpha:100.0%">+<o:p></o:p></span></p>
        <p class="MsoPlainText"><span style="color:#203864;mso-style-textfill-fill-color:#203864;mso-style-textfill-fill-alpha:100.0%">+      
            if (adev->pdev->bus->number == 0x7) {<o:p></o:p></span></p>
        <p class="MsoPlainText"><span style="color:#203864;mso-style-textfill-fill-color:#203864;mso-style-textfill-fill-alpha:100.0%">+              
            while (true)<o:p></o:p></span></p>
        <p class="MsoPlainText"><span style="color:#203864;mso-style-textfill-fill-color:#203864;mso-style-textfill-fill-alpha:100.0%">+                      
            if (!!atomic_read(&card0_in_baco))<o:p></o:p></span></p>
        <p class="MsoPlainText"><span style="color:#203864;mso-style-textfill-fill-color:#203864;mso-style-textfill-fill-alpha:100.0%">+                              
            break;<o:p></o:p></span></p>
        <p class="MsoPlainText"><span style="color:#203864;mso-style-textfill-fill-color:#203864;mso-style-textfill-fill-alpha:100.0%">+              
            printk("lema1: card0 in baco from card1 view\n");<o:p></o:p></span></p>
        <p class="MsoPlainText"><span style="color:#203864;mso-style-textfill-fill-color:#203864;mso-style-textfill-fill-alpha:100.0%">+      
            }<o:p></o:p></span></p>
        <p class="MsoPlainText"><o:p> </o:p></p>
        <p class="MsoPlainText">>
          +                                          if
          (!queue_work_on(cpu, system_highpri_wq,<o:p></o:p></p>
        <p class="MsoPlainText">>
          +                                                                 
             &tmp_adev->xgmi_reset_work))<o:p></o:p></p>
        <p class="MsoPlainText">>  
                                                                 r =
          -EALREADY;<o:p></o:p></p>
        <p class="MsoPlainText">>
          +                                          cpu =
          cpumask_next(cpu, cpu_online_mask);<o:p></o:p></p>
        <p class="MsoPlainText">>                                  }
          else<o:p></o:p></p>
        <p class="MsoPlainText">>  
                                                     r =
          amdgpu_asic_reset(tmp_adev);<o:p></o:p></p>
        <p class="MsoPlainText">> -<o:p></o:p></p>
        <p class="MsoPlainText">> -                               if
          (r) {<o:p></o:p></p>
        <p class="MsoPlainText">>
          -                                           DRM_ERROR("ASIC
          reset failed with error, %d for drm dev, %s",<o:p></o:p></p>
        <p class="MsoPlainText">>
          -                                                       r,
          tmp_adev->ddev->unique);<o:p></o:p></p>
        <p class="MsoPlainText">> +                              if
          (r)<o:p></o:p></p>
        <p class="MsoPlainText">>  
                                                     break;<o:p></o:p></p>
        <p class="MsoPlainText">> -                               }<o:p></o:p></p>
        <p class="MsoPlainText">>                      }<o:p></o:p></p>
        <p class="MsoPlainText">>   <o:p></o:p></p>
        <p class="MsoPlainText">> -                   /* For XGMI
          wait for all PSP resets to complete before proceed */<o:p></o:p></p>
        <p class="MsoPlainText">> +                  /* For XGMI wait
          for all work to complete before proceed */<o:p></o:p></p>
        <p class="MsoPlainText">>                      if (!r) {<o:p></o:p></p>
        <p class="MsoPlainText">>                                 
          list_for_each_entry(tmp_adev, device_list_handle,<o:p></o:p></p>
        <p class="MsoPlainText">>  
                                                                
              gmc.xgmi.head) {<o:p></o:p></p>
        <p class="MsoPlainText">> @@ -3826,11 +3836,59 @@ static int
          amdgpu_do_asic_reset(struct amdgpu_hive_info *hive,<o:p></o:p></p>
        <p class="MsoPlainText">>  
                                                                 r =
          tmp_adev->asic_reset_res;<o:p></o:p></p>
        <p class="MsoPlainText">>  
                                                                 if (r)<o:p></o:p></p>
        <p class="MsoPlainText">>  
                                                                            
          break;<o:p></o:p></p>
        <p class="MsoPlainText">>
          +                                                     
          if(AMD_RESET_METHOD_BACO ==<o:p></o:p></p>
        <p class="MsoPlainText">>
          +                                                        
          amdgpu_asic_reset_method(tmp_adev))<o:p></o:p></p>
        <p class="MsoPlainText">>
          +                                                                 
          tmp_adev->in_baco = true;<o:p></o:p></p>
        <p class="MsoPlainText">>  
                                                     }<o:p></o:p></p>
        <p class="MsoPlainText">>                                  }<o:p></o:p></p>
        <p class="MsoPlainText">>                      }<o:p></o:p></p>
        <p class="MsoPlainText">> -       }<o:p></o:p></p>
        <p class="MsoPlainText">>   <o:p></o:p></p>
        <p class="MsoPlainText">> +                  /*<o:p></o:p></p>
        <p class="MsoPlainText">> +                  * For XGMI with
          baco reset, need exit baco phase by scheduling<o:p></o:p></p>
        <p class="MsoPlainText">> +                  *
          xgmi_reset_work one more time. PSP reset skips this phase.<o:p></o:p></p>
        <p class="MsoPlainText">> +                  * Not assume the
          situation that PSP reset and baco reset<o:p></o:p></p>
        <p class="MsoPlainText">> +                  * coexist within
          an XGMI hive.<o:p></o:p></p>
        <p class="MsoPlainText">> +                  */<o:p></o:p></p>
        <p class="MsoPlainText">> +<o:p></o:p></p>
        <p class="MsoPlainText">> +                  if (!r) {<o:p></o:p></p>
        <p class="MsoPlainText">> +                              cpu
          = smp_processor_id();<o:p></o:p></p>
        <p class="MsoPlainText">> +                             
          list_for_each_entry(tmp_adev, device_list_handle,<o:p></o:p></p>
        <p class="MsoPlainText">>
          +                                                         
          gmc.xgmi.head) {<o:p></o:p></p>
        <p class="MsoPlainText">>
          +                                          if
          (tmp_adev->gmc.xgmi.num_physical_nodes > 1<o:p></o:p></p>
        <p class="MsoPlainText">>
          +                                              &&
          AMD_RESET_METHOD_BACO ==<o:p></o:p></p>
        <p class="MsoPlainText">>
          +                                             
          amdgpu_asic_reset_method(tmp_adev)) {<o:p></o:p></p>
        <p class="MsoPlainText">>
          +                                                      if
          (!queue_work_on(cpu,<o:p></o:p></p>
        <p class="MsoPlainText">>
          +                                                                 
          system_highpri_wq,<o:p></o:p></p>
        <p class="MsoPlainText">>
          +                                                     
                      &tmp_adev->xgmi_reset_work))<o:p></o:p></p>
        <p class="MsoPlainText">>
          +                                                                 
          r = -EALREADY;<o:p></o:p></p>
        <p class="MsoPlainText">>
          +                                                      if (r)<o:p></o:p></p>
        <p class="MsoPlainText">>
          +                                                                 
          break;<o:p></o:p></p>
        <p class="MsoPlainText">>
          +                                                      cpu =
          cpumask_next(cpu, cpu_online_mask);<o:p></o:p></p>
        <p class="MsoPlainText">>
          +                                          }<o:p></o:p></p>
        <p class="MsoPlainText">> +                              }<o:p></o:p></p>
        <p class="MsoPlainText">> +                  }<o:p></o:p></p>
        <p class="MsoPlainText">> +<o:p></o:p></p>
        <p class="MsoPlainText">> +                  if (!r) {<o:p></o:p></p>
        <p class="MsoPlainText">> +                             
          list_for_each_entry(tmp_adev, device_list_handle,<o:p></o:p></p>
        <p class="MsoPlainText">>
          +                                                         
          gmc.xgmi.head) {<o:p></o:p></p>
        <p class="MsoPlainText">>
          +                                          if
          (tmp_adev->gmc.xgmi.num_physical_nodes > 1<o:p></o:p></p>
        <p class="MsoPlainText">>
          +                                              &&
          AMD_RESET_METHOD_BACO ==<o:p></o:p></p>
        <p class="MsoPlainText">>
          +                                             
          amdgpu_asic_reset_method(tmp_adev)) {<o:p></o:p></p>
        <p class="MsoPlainText">>
          +                                                     
          flush_work(&tmp_adev->xgmi_reset_work);<o:p></o:p></p>
        <p class="MsoPlainText">>
          +                                                      r =
          tmp_adev->asic_reset_res;<o:p></o:p></p>
        <p class="MsoPlainText">>
          +                                                      if (r)<o:p></o:p></p>
        <p class="MsoPlainText">>
          +                                                                 
          break;<o:p></o:p></p>
        <p class="MsoPlainText">>
          +                                                     
          tmp_adev->in_baco = false;<o:p></o:p></p>
        <p class="MsoPlainText">>
          +                                          }<o:p></o:p></p>
        <p class="MsoPlainText">> +                              }<o:p></o:p></p>
        <p class="MsoPlainText">> +                  }<o:p></o:p></p>
        <p class="MsoPlainText">> +<o:p></o:p></p>
        <p class="MsoPlainText">> +                  if (r) {<o:p></o:p></p>
        <p class="MsoPlainText">> +                             
          DRM_ERROR("ASIC reset failed with error, %d for drm dev, %s",<o:p></o:p></p>
        <p class="MsoPlainText">>
          +                                          r,
          tmp_adev->ddev->unique);<o:p></o:p></p>
        <p class="MsoPlainText">> +                              goto
          end;<o:p></o:p></p>
        <p class="MsoPlainText">> +                  }<o:p></o:p></p>
        <p class="MsoPlainText">> +      }<o:p></o:p></p>
        <p class="MsoPlainText">>   <o:p></o:p></p>
        <p class="MsoPlainText">>         
          list_for_each_entry(tmp_adev, device_list_handle,
          gmc.xgmi.head) {<o:p></o:p></p>
        <p class="MsoPlainText">>                      if
          (need_full_reset) {<o:p></o:p></p>
      </div>
    </blockquote>
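    <p>The two-phase sequence described in the patch (every node enters
      baco, and only then does any node exit) can be sketched in plain C
      as follows. This is a sequential model under stated assumptions,
      not the driver code: struct node, struct event_trace, reset_work
      and hive_baco_reset are hypothetical names, each call to
      reset_work stands in for one queued xgmi_reset_work item, and the
      loop that checks results stands in for the flush_work() pass.</p>

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_EVENTS 16

enum baco_event { BACO_ENTER, BACO_EXIT };

struct node {
    bool in_baco;   /* mirrors the in_baco flag added by the patch */
    int reset_res;  /* mirrors asic_reset_res */
};

struct event_trace {
    enum baco_event ev[MAX_EVENTS];
    size_t count;
};

/* Stand-in for the work function: enter if not in baco, otherwise exit. */
void reset_work(struct node *n, struct event_trace *t)
{
    assert(t->count < MAX_EVENTS);
    t->ev[t->count++] = n->in_baco ? BACO_EXIT : BACO_ENTER;
    n->reset_res = 0; /* the model never fails */
}

/* Runs the work once per node for enter, flips the flag, then once for exit. */
int hive_baco_reset(struct node *nodes, size_t n, struct event_trace *t)
{
    size_t i;

    for (i = 0; i < n; i++)          /* phase 1: queue enter on all nodes */
        reset_work(&nodes[i], t);
    for (i = 0; i < n; i++) {        /* "flush": collect results */
        if (nodes[i].reset_res)
            return nodes[i].reset_res;
        nodes[i].in_baco = true;
    }

    for (i = 0; i < n; i++)          /* phase 2: queue exit on all nodes */
        reset_work(&nodes[i], t);
    for (i = 0; i < n; i++) {
        if (nodes[i].reset_res)
            return nodes[i].reset_res;
        nodes[i].in_baco = false;
    }
    return 0;
}
```

    <p>The model only captures the ordering guarantee: because the flag
      is flipped after all first-round work has completed, no exit can
      be recorded before every enter. It says nothing about how closely
      in time the enters happen, which is the point under discussion in
      the thread.</p>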
  </body>
</html>