<!DOCTYPE html><html><head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
  </head>
  <body>
    Yeah, just wanted to point out the unused variable as well.<br>
    <br>
    With that fixed, the patch is Reviewed-by: Christian König
    <a class="moz-txt-link-rfc2396E" href="mailto:christian.koenig@amd.com"><christian.koenig@amd.com></a><br>
    <br>
    Regards,<br>
    Christian.<br>
    <br>
    <div class="moz-cite-prefix">On 21.11.24 at 07:49, Fan, Shikang
      wrote:<br>
    </div>
    <blockquote type="cite" cite="mid:SA1PR12MB7343558D8C9CB6A72BB898FFEB222@SA1PR12MB7343.namprd12.prod.outlook.com">
      
      <div>
        <div class="elementToProof" style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 11pt; color: rgb(0, 0, 0);">
          I forgot to delete the unused counter "j" from the patch; I'll
          remove it when I submit the patch to the branch.<br>
          <br>
          Thanks,</div>
        <div class="elementToProof" style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 11pt; color: rgb(0, 0, 0);">
          Shikang</div>
        <div style="font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif; font-size:11pt; color:rgb(0,0,0)">
          <br>
        </div>
        <hr tabindex="-1" style="display:inline-block; width:98%">
        <div id="divRplyFwdMsg" dir="ltr"><font style="font-size:11pt" face="Calibri, sans-serif" color="#000000"><b>From:</b> Fan,
            Shikang <a class="moz-txt-link-rfc2396E" href="mailto:Shikang.Fan@amd.com"><Shikang.Fan@amd.com></a><br>
            <b>Sent:</b> Thursday, November 21, 2024 2:47 PM<br>
            <b>To:</b> <a class="moz-txt-link-abbreviated" href="mailto:amd-gfx@lists.freedesktop.org">amd-gfx@lists.freedesktop.org</a>
            <a class="moz-txt-link-rfc2396E" href="mailto:amd-gfx@lists.freedesktop.org"><amd-gfx@lists.freedesktop.org></a>; Koenig, Christian
            <a class="moz-txt-link-rfc2396E" href="mailto:Christian.Koenig@amd.com"><Christian.Koenig@amd.com></a><br>
            <b>Cc:</b> Deng, Emily <a class="moz-txt-link-rfc2396E" href="mailto:Emily.Deng@amd.com"><Emily.Deng@amd.com></a><br>
            <b>Subject:</b> Re: [PATCH v3] drm/amdgpu: Check fence
            emitted count to identify bad jobs</font>
          <div> </div>
        </div>
        <div dir="ltr">
          <div class="x_elementToProof" style="font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif; font-size:11pt; color:rgb(0,0,0)">
            +<a href="mailto:Christian.Koenig@amd.com" id="OWAAM340083" class="x_tWKOu x_mention x_ms-bgc-nlr x_ms-fcl-b" moz-do-not-send="true">@Koenig, Christian</a><br>
            <br>
            Hi Christian,<br>
            Could you please help review this patch? I removed the
            timeout wait in the function.<br>
            <br>
            Thanks,</div>
          <div class="x_elementToProof" style="font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif; font-size:11pt; color:rgb(0,0,0)">
            Shikang</div>
          <div style="font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif; font-size:11pt; color:rgb(0,0,0)">
            <br>
          </div>
          <hr style="display:inline-block; width:98%">
          <div dir="ltr" id="x_divRplyFwdMsg"><span style="font-family:Calibri,sans-serif; font-size:11pt; color:rgb(0,0,0)"><b>From:</b> Shikang
              Fan <a class="moz-txt-link-rfc2396E" href="mailto:shikang.fan@amd.com"><shikang.fan@amd.com></a><br>
              <b>Sent:</b> Thursday, November 21, 2024 11:48 AM<br>
              <b>To:</b> <a class="moz-txt-link-abbreviated" href="mailto:amd-gfx@lists.freedesktop.org">amd-gfx@lists.freedesktop.org</a>
              <a class="moz-txt-link-rfc2396E" href="mailto:amd-gfx@lists.freedesktop.org"><amd-gfx@lists.freedesktop.org></a><br>
              <b>Cc:</b> Fan, Shikang <a class="moz-txt-link-rfc2396E" href="mailto:Shikang.Fan@amd.com"><Shikang.Fan@amd.com></a>; Deng,
              Emily <a class="moz-txt-link-rfc2396E" href="mailto:Emily.Deng@amd.com"><Emily.Deng@amd.com></a><br>
              <b>Subject:</b> [PATCH v3] drm/amdgpu: Check fence emitted
              count to identify bad jobs</span>
            <div> </div>
          </div>
          <div style="font-size:11pt">In SRIOV, when the host driver
            performs a MODE 1 reset and notifies the guest driver of the FLR,<br>
            there is a small chance that no job is running on the hw<br>
            but the driver has not updated the pending list yet, causing
            the driver<br>
            to not respond to the FLR request. Modify the has_job_running
            function to<br>
            check whether any job is actually still running.<br>
            <br>
            v2: Use amdgpu_fence_count_emitted to determine job running
            status.<br>
            v3: Remove the timeout wait in has_job_running<br>
            <br>
            Signed-off-by: Emily Deng <a class="moz-txt-link-rfc2396E" href="mailto:Emily.Deng@amd.com"><Emily.Deng@amd.com></a><br>
            Signed-off-by: Shikang Fan <a class="moz-txt-link-rfc2396E" href="mailto:shikang.fan@amd.com"><shikang.fan@amd.com></a><br>
            ---<br>
             drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 15
            +++++++--------<br>
             1 file changed, 7 insertions(+), 8 deletions(-)<br>
            <br>
            diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
            b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c<br>
            index b3ca911e55d6..f53889ce71a8 100644<br>
            --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c<br>
            +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c<br>
            @@ -5222,15 +5222,18 @@ static int
            amdgpu_device_reset_sriov(struct amdgpu_device *adev,<br>
             }<br>
             <br>
             /**<br>
            - * amdgpu_device_has_job_running - check if there is any
            job in mirror list<br>
            + * amdgpu_device_has_job_running - check if there is any
            unfinished job<br>
              *<br>
              * @adev: amdgpu_device pointer<br>
              *<br>
            - * check if there is any job in mirror list<br>
            + * check if there is any job running on the device when
            the guest driver receives<br>
            + * an FLR notification from the host driver. If there are still
            jobs running, then<br>
            + * the guest driver will not respond to the FLR reset.
            Instead, let the job hit<br>
            + * the timeout, and the guest driver then issues the reset
            request.<br>
              */<br>
             bool amdgpu_device_has_job_running(struct amdgpu_device
            *adev)<br>
             {<br>
            -       int i;<br>
            +       int i, j;<br>
                     struct drm_sched_job *job;<br>
             <br>
                     for (i = 0; i < AMDGPU_MAX_RINGS; ++i) {<br>
            @@ -5239,11 +5242,7 @@ bool
            amdgpu_device_has_job_running(struct amdgpu_device *adev)<br>
                             if (!amdgpu_ring_sched_ready(ring))<br>
                                     continue;<br>
             <br>
            -              
            spin_lock(&ring->sched.job_list_lock);<br>
            -               job =
            list_first_entry_or_null(&ring->sched.pending_list,<br>
            -                                              struct
            drm_sched_job, list);<br>
            -              
            spin_unlock(&ring->sched.job_list_lock);<br>
            -               if (job)<br>
            +               if (amdgpu_fence_count_emitted(ring))<br>
                                     return true;<br>
                     }<br>
                     return false;<br>
            --<br>
            2.34.1<br>
            <br>
          </div>
        </div>
      </div>
    </blockquote>
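    <br>
    For readers following the thread, the difference between the old and
    new check can be illustrated with a minimal, self-contained sketch.<br>
    This is not kernel code: struct mock_ring, its fields, and all mock_*
    helpers are hypothetical stand-ins for the real ring, scheduler, and
    fence driver state,<br>
    and the fence count is simplified to "emitted minus signaled" sequence
    numbers. It only models the stale pending list case described in the
    commit message.<br>

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MOCK_MAX_RINGS 4 /* hypothetical stand-in for AMDGPU_MAX_RINGS */

/* Hypothetical, simplified ring state; not the real struct amdgpu_ring. */
struct mock_ring {
	bool sched_ready;    /* models the result of amdgpu_ring_sched_ready() */
	uint32_t sync_seq;   /* sequence number of the last fence emitted */
	uint32_t last_seq;   /* sequence number of the last fence signaled */
	size_t pending_jobs; /* models the length of the sched pending list */
};

/* Fences emitted but not yet signaled; unsigned math tolerates wraparound. */
static uint32_t mock_fence_count_emitted(const struct mock_ring *ring)
{
	return ring->sync_seq - ring->last_seq;
}

/* Old check: any entry left on the pending list counts as a running job. */
static bool mock_has_job_running_old(const struct mock_ring *rings)
{
	for (size_t i = 0; i < MOCK_MAX_RINGS; ++i) {
		if (!rings[i].sched_ready)
			continue;
		if (rings[i].pending_jobs > 0)
			return true;
	}
	return false;
}

/* New check: only fences still outstanding on the hw count as running. */
static bool mock_has_job_running_new(const struct mock_ring *rings)
{
	for (size_t i = 0; i < MOCK_MAX_RINGS; ++i) {
		if (!rings[i].sched_ready)
			continue;
		if (mock_fence_count_emitted(&rings[i]))
			return true;
	}
	return false;
}

/* Scenario from the commit message: hw is idle, pending list is stale. */
void mock_demo_stale_pending_list(void)
{
	struct mock_ring rings[MOCK_MAX_RINGS] = {0};

	rings[0].sched_ready = true;
	rings[0].sync_seq = 7;
	rings[0].last_seq = 7;     /* every emitted fence has signaled */
	rings[0].pending_jobs = 1; /* list has not been cleaned up yet */

	assert(mock_has_job_running_old(rings));  /* FLR would go unanswered */
	assert(!mock_has_job_running_new(rings)); /* FLR can be handled */
}
```

    In this model the old check reports a running job purely because the
    list entry has not been removed yet, while the fence-based check
    reports none once every emitted fence has signaled.<br>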
    <br>
  </body>
</html>