<html><head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
  </head>
  <body>
    Andrey needs to review the reste, but essentially I don't see the
    reason why you need this drm_sched_resubmit_jobs2().<br>
    <br>
    Christian.<br>
    <br>
    <div class="moz-cite-prefix">Am 10.03.21 um 14:36 schrieb Zhang,
      Jack (Jian):<br>
    </div>
    <blockquote type="cite" cite="mid:DM5PR1201MB02044274BD6BB85B3828074FBB919@DM5PR1201MB0204.namprd12.prod.outlook.com">
      
      <p style="font-family:Arial;font-size:11pt;color:#0078D7;margin:5pt;" align="Left">
        [AMD Official Use Only - Internal Distribution Only]<br>
      </p>
      <br>
      <div>
        <div dir="auto" style="direction: ltr; margin: 0; padding: 0;
          font-family: sans-serif; font-size: 11pt; color: black; ">
          Thanks Christian, just did the checkpatch scan.  Please see
          the V6 patch<br>
        </div>
        <div dir="auto" style="direction: ltr; margin: 0; padding: 0;
          font-family: sans-serif; font-size: 11pt; color: black; ">
          <br>
        </div>
        <div dir="auto" style="direction: ltr; margin: 0; padding: 0;
          font-family: sans-serif; font-size: 11pt; color: black; ">
          /Jack<br>
        </div>
        <div dir="auto" style="direction: ltr; margin: 0px; padding:
          0px; font-family: sans-serif; font-size: 11pt; color: black;
          text-align: left;">
          <br>
        </div>
        <div dir="auto" style="direction: ltr; margin: 0; padding: 0;
          font-family: sans-serif; font-size: 11pt; color: black; ">
          <br>
        </div>
        <hr style="display:inline-block;width:98%" tabindex="-1">
        <div id="divRplyFwdMsg" dir="ltr"><font style="font-size:11pt" face="Calibri, sans-serif" color="#000000"><b>From:</b>
            Koenig, Christian <a class="moz-txt-link-rfc2396E" href="mailto:Christian.Koenig@amd.com"><Christian.Koenig@amd.com></a><br>
            <b>Sent:</b> Wednesday, March 10, 2021 8:54:53 PM<br>
            <b>To:</b> Zhang, Jack (Jian) <a class="moz-txt-link-rfc2396E" href="mailto:Jack.Zhang1@amd.com"><Jack.Zhang1@amd.com></a>;
            <a class="moz-txt-link-abbreviated" href="mailto:amd-gfx@lists.freedesktop.org">amd-gfx@lists.freedesktop.org</a>
            <a class="moz-txt-link-rfc2396E" href="mailto:amd-gfx@lists.freedesktop.org"><amd-gfx@lists.freedesktop.org></a>; Grodzovsky, Andrey
            <a class="moz-txt-link-rfc2396E" href="mailto:Andrey.Grodzovsky@amd.com"><Andrey.Grodzovsky@amd.com></a>; Liu, Monk
            <a class="moz-txt-link-rfc2396E" href="mailto:Monk.Liu@amd.com"><Monk.Liu@amd.com></a>; Deng, Emily
            <a class="moz-txt-link-rfc2396E" href="mailto:Emily.Deng@amd.com"><Emily.Deng@amd.com></a><br>
            <b>Subject:</b> Re: [PATCH v4] drm/amd/amdgpu implement tdr
            advanced mode</font>
          <div> </div>
        </div>
        <div class="BodyFragment"><font size="2"><span style="font-size:11pt;">
              <div class="PlainText">Am 10.03.21 um 12:19 schrieb Jack
                Zhang:<br>
                > [Why]<br>
                > Previous tdr design treats the first job in
                job_timeout as the bad job.<br>
                > But sometimes a later bad compute job can block a
                good gfx job and<br>
                > cause an unexpected gfx job timeout because gfx and
                compute ring share<br>
                > internal GC HW mutually.<br>
                ><br>
                > [How]<br>
                > This patch implements an advanced tdr mode.It
                involves an additinal<br>
                > synchronous pre-resubmit step(Step0 Resubmit)
                before normal resubmit<br>
                > step in order to find the real bad job.<br>
                ><br>
                > 1. At Step0 Resubmit stage, it synchronously
                submits and pends for the<br>
                > first job being signaled. If it gets timeout, we
                identify it as guilty<br>
                > and do hw reset. After that, we would do the normal
                resubmit step to<br>
                > resubmit left jobs.<br>
                ><br>
                > 2. Re-insert Bailing job to mirror_list, and leave
                it to be handled by<br>
                > the main reset thread.<br>
                ><br>
                > 3. For whole gpu reset(vram lost), do resubmit as
                the old way.<br>
                ><br>
                > Signed-off-by: Jack Zhang
                <a class="moz-txt-link-rfc2396E" href="mailto:Jack.Zhang1@amd.com"><Jack.Zhang1@amd.com></a><br>
                > Change-Id:
                I408357f10b9034caaa1b83610e19e514c5fbaaf2<br>
                > ---<br>
                >   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 73
                ++++++++++++++++++++-<br>
                >   drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c    |  2
                +-<br>
                >   drivers/gpu/drm/scheduler/sched_main.c     | 75
                ++++++++++++++++++++++<br>
                >   include/drm/gpu_scheduler.h                |  2 +<br>
                >   4 files changed, 148 insertions(+), 4
                deletions(-)<br>
                ><br>
                > diff --git
                a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
                b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c<br>
                > index e247c3a2ec08..02056355a055 100644<br>
                > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c<br>
                > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c<br>
                > @@ -4639,7 +4639,7 @@ int
                amdgpu_device_gpu_recover(struct amdgpu_device *adev,<br>
                >        int i, r = 0;<br>
                >        bool need_emergency_restart = false;<br>
                >        bool audio_suspended = false;<br>
                > -<br>
                > +     int     tmp_vram_lost_counter;<br>
                <br>
                Please keep the empoty line between declaration and
                code.<br>
                <br>
                In general give that patch a pass with checkpath.pl
                since there are a <br>
                couple of other place which needs fixing at first
                glance.<br>
                <br>
                Christian.<br>
                <br>
                <br>
                >        /*<br>
                >         * Special case: RAS triggered and full
                reset isn't supported<br>
                >         */<br>
                > @@ -4689,9 +4689,14 @@ int
                amdgpu_device_gpu_recover(struct amdgpu_device *adev,<br>
                >                dev_info(adev->dev, "Bailing on
                TDR for s_job:%llx, as another already in progress",<br>
                >                                        job ?
                job->base.id : -1);<br>
                >   <br>
                > -             /* even we skipped this reset, still
                need to set the job to guilty */<br>
                > -             if (job)<br>
                > +             if (job) {<br>
                > +                     /* re-insert node to avoid
                memory leak */<br>
                > +                    
                spin_lock(&job->base.sched->job_list_lock);<br>
                > +                    
                list_add(&job->base.node,
                &job->base.sched->pending_list);<br>
                > +                    
                spin_unlock(&job->base.sched->job_list_lock);<br>
                > +                     /* even we skipped this
                reset, still need to set the job to guilty */<br>
                >                       
                drm_sched_increase_karma(&job->base);<br>
                > +             }<br>
                >                goto skip_recovery;<br>
                >        }<br>
                >   <br>
                > @@ -4788,6 +4793,7 @@ int
                amdgpu_device_gpu_recover(struct amdgpu_device *adev,<br>
                >                }<br>
                >        }<br>
                >   <br>
                > +     tmp_vram_lost_counter =
                atomic_read(&((adev)->vram_lost_counter));<br>
                >        /* Actual ASIC resets if needed.*/<br>
                >        /* TODO Implement XGMI hive reset logic for
                SRIOV */<br>
                >        if (amdgpu_sriov_vf(adev)) {<br>
                > @@ -4805,6 +4811,67 @@ int
                amdgpu_device_gpu_recover(struct amdgpu_device *adev,<br>
                >        /* Post ASIC reset for all devs .*/<br>
                >        list_for_each_entry(tmp_adev,
                device_list_handle, gmc.xgmi.head) {<br>
                >   <br>
                > +             if (amdgpu_gpu_recovery == 2
                &&<br>
                > +                     !(tmp_vram_lost_counter <
                atomic_read(&adev->vram_lost_counter))) {<br>
                > +<br>
                > +                     for (i = 0; i <
                AMDGPU_MAX_RINGS; ++i) {<br>
                > +                             struct amdgpu_ring
                *ring = tmp_adev->rings[i];<br>
                > +                             int ret = 0;<br>
                > +                             struct drm_sched_job
                *s_job;<br>
                > +<br>
                > +                             if (!ring ||
                !ring->sched.thread)<br>
                > +                                     continue;<br>
                > +<br>
                > +                             /* No point to
                resubmit jobs if we didn't HW reset*/<br>
                > +                             if
                (!tmp_adev->asic_reset_res && !job_signaled)
                {<br>
                > +<br>
                > +                                     s_job =
                list_first_entry_or_null(&ring->sched.pending_list,
                struct drm_sched_job, list);<br>
                > +                                     if (s_job ==
                NULL)<br>
                > +                                            
                continue;<br>
                > +<br>
                > +                                     /* clear
                job's guilty and depend the folowing step to decide the
                real one */<br>
                > +                                    
                drm_sched_reset_karma(s_job);<br>
                > +                                    
                drm_sched_resubmit_jobs2(&ring->sched, 1);<br>
                > +                                     ret =
                dma_fence_wait_timeout(s_job->s_fence->parent,
                false, ring->sched.timeout);<br>
                > +<br>
                > +                                     if (ret == 0)
                { /* timeout */<br>
                > +                                            
                DRM_ERROR("Found the real bad job! ring:%s,
                job_id:%llx\n", ring->sched.name, s_job->id);<br>
                > +                                             /*
                set guilty */<br>
                > +                                            
                drm_sched_increase_karma(s_job);<br>
                > +<br>
                > +                                             /* do
                hw reset */<br>
                > +                                             if
                (amdgpu_sriov_vf(adev)) {<br>
                >
                +                                                    
                amdgpu_virt_fini_data_exchange(adev);<br>
                >
                +                                                     r
                = amdgpu_device_reset_sriov(adev, false);<br>
                >
                +                                                     if
                (r)<br>
                >
                +                                                            
                adev->asic_reset_res = r;<br>
                > +                                             }
                else {<br>
                >
                +                                                     r 
                = amdgpu_do_asic_reset(hive, device_list_handle,
                &need_full_reset, false);<br>
                >
                +                                                     if
                (r && r == -EAGAIN)<br>
                >
                +                                                            
                goto retry;<br>
                > +                                             }<br>
                > +<br>
                > +                                             /*
                add reset counter so that the following resubmitted job
                could flush vmid */<br>
                > +                                            
                atomic_inc(&tmp_adev->gpu_reset_counter);<br>
                > +                                            
                continue;<br>
                > +                                     }<br>
                > +<br>
                > +                                     /* got the hw
                fence, signal finished fence */<br>
                > +                                    
                atomic_dec(&ring->sched.num_jobs);<br>
                > +                                    
                dma_fence_get(&s_job->s_fence->finished);<br>
                > +                                    
                dma_fence_signal(&s_job->s_fence->finished);<br>
                > +                                    
                dma_fence_put(&s_job->s_fence->finished);<br>
                > +<br>
                > +<br>
                > +                                     /* remove
                node from list and free the job */<br>
                > +                                    
                spin_lock(&ring->sched.job_list_lock);<br>
                > +                                    
                list_del_init(&s_job->node);<br>
                > +                                    
                spin_unlock(&ring->sched.job_list_lock);<br>
                > +                                    
                ring->sched.ops->free_job(s_job);<br>
                > +                             }<br>
                > +                     }<br>
                > +             }<br>
                > +<br>
                >                for (i = 0; i < AMDGPU_MAX_RINGS;
                ++i) {<br>
                >                        struct amdgpu_ring *ring =
                tmp_adev->rings[i];<br>
                >   <br>
                > diff --git
                a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
                b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c<br>
                > index 865f924772b0..9c3f4edb7532 100644<br>
                > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c<br>
                > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c<br>
                > @@ -509,7 +509,7 @@
                module_param_named(compute_multipipe,
                amdgpu_compute_multipipe, int, 0444);<br>
                >    * DOC: gpu_recovery (int)<br>
                >    * Set to enable GPU recovery mechanism (1 =
                enable, 0 = disable). The default is -1 (auto, disabled
                except SRIOV).<br>
                >    */<br>
                > -MODULE_PARM_DESC(gpu_recovery, "Enable GPU
                recovery mechanism, (1 = enable, 0 = disable, -1 =
                auto)");<br>
                > +MODULE_PARM_DESC(gpu_recovery, "Enable GPU
                recovery mechanism, (2 = advanced tdr mode, 1 = enable,
                0 = disable, -1 = auto)");<br>
                >   module_param_named(gpu_recovery,
                amdgpu_gpu_recovery, int, 0444);<br>
                >   <br>
                >   /**<br>
                > diff --git a/drivers/gpu/drm/scheduler/sched_main.c
                b/drivers/gpu/drm/scheduler/sched_main.c<br>
                > index d82a7ebf6099..99a6a8bddd6f 100644<br>
                > --- a/drivers/gpu/drm/scheduler/sched_main.c<br>
                > +++ b/drivers/gpu/drm/scheduler/sched_main.c<br>
                > @@ -395,6 +395,81 @@ void
                drm_sched_increase_karma(struct drm_sched_job *bad)<br>
                >   }<br>
                >   EXPORT_SYMBOL(drm_sched_increase_karma);<br>
                >   <br>
                > +<br>
                > +void drm_sched_resubmit_jobs2(struct
                drm_gpu_scheduler *sched, int max)<br>
                > +{<br>
                > +     struct drm_sched_job *s_job, *tmp;<br>
                > +     uint64_t guilty_context;<br>
                > +     bool found_guilty = false;<br>
                > +     struct dma_fence *fence;<br>
                > +     int i = 0;<br>
                > +<br>
                > +     list_for_each_entry_safe(s_job, tmp,
                &sched->pending_list, list) {<br>
                > +             struct drm_sched_fence *s_fence =
                s_job->s_fence;<br>
                > +<br>
                > +             if (i>=max)<br>
                > +                     break;<br>
                > +<br>
                > +             if (!found_guilty &&
                atomic_read(&s_job->karma) >
                sched->hang_limit) {<br>
                > +                     found_guilty = true;<br>
                > +                     guilty_context =
                s_job->s_fence->scheduled.context;<br>
                > +             }<br>
                > +<br>
                > +             if (found_guilty &&
                s_job->s_fence->scheduled.context ==
                guilty_context)<br>
                > +                    
                dma_fence_set_error(&s_fence->finished,
                -ECANCELED);<br>
                > +<br>
                > +            
                dma_fence_put(s_job->s_fence->parent);<br>
                > +             fence =
                sched->ops->run_job(s_job);<br>
                > +             i++;<br>
                > +<br>
                > +             if (IS_ERR_OR_NULL(fence)) {<br>
                > +                     if (IS_ERR(fence))<br>
                > +                            
                dma_fence_set_error(&s_fence->finished,
                PTR_ERR(fence));<br>
                > +<br>
                > +                     s_job->s_fence->parent
                = NULL;<br>
                > +             } else {<br>
                > +                     s_job->s_fence->parent
                = fence;<br>
                > +             }<br>
                > +     }<br>
                > +}<br>
                > +EXPORT_SYMBOL(drm_sched_resubmit_jobs2);<br>
                > +<br>
                > +<br>
                > +<br>
                > +void drm_sched_reset_karma(struct drm_sched_job
                *bad)<br>
                > +{<br>
                > +     int i;<br>
                > +     struct drm_sched_entity *tmp;<br>
                > +     struct drm_sched_entity *entity;<br>
                > +     struct drm_gpu_scheduler *sched =
                bad->sched;<br>
                > +<br>
                > +     /* don't reset @bad's karma if it's from
                KERNEL RQ,<br>
                > +      * because sometimes GPU hang would cause
                kernel jobs (like VM updating jobs)<br>
                > +      * corrupt but keep in mind that kernel jobs
                always considered good.<br>
                > +      */<br>
                > +     if (bad->s_priority !=
                DRM_SCHED_PRIORITY_KERNEL) {<br>
                > +             atomic_set(&bad->karma, 0);<br>
                > +             for (i = DRM_SCHED_PRIORITY_MIN; i
                < DRM_SCHED_PRIORITY_KERNEL;<br>
                > +                  i++) {<br>
                > +                     struct drm_sched_rq *rq =
                &sched->sched_rq[i];<br>
                > +<br>
                > +                     spin_lock(&rq->lock);<br>
                > +                    
                list_for_each_entry_safe(entity, tmp,
                &rq->entities, list) {<br>
                > +                             if
                (bad->s_fence->scheduled.context ==<br>
                > +                                
                entity->fence_context) {<br>
                > +                                             if
                (entity->guilty)<br>
                >
                +                                                    
                atomic_set(entity->guilty, 0);<br>
                > +                                     break;<br>
                > +                             }<br>
                > +                     }<br>
                > +                    
                spin_unlock(&rq->lock);<br>
                > +                     if (&entity->list !=
                &rq->entities)<br>
                > +                             break;<br>
                > +             }<br>
                > +     }<br>
                > +}<br>
                > +EXPORT_SYMBOL(drm_sched_reset_karma);<br>
                > +<br>
                >   /**<br>
                >    * drm_sched_stop - stop the scheduler<br>
                >    *<br>
                > diff --git a/include/drm/gpu_scheduler.h
                b/include/drm/gpu_scheduler.h<br>
                > index 1c815e0a14ed..01c609149731 100644<br>
                > --- a/include/drm/gpu_scheduler.h<br>
                > +++ b/include/drm/gpu_scheduler.h<br>
                > @@ -321,7 +321,9 @@ void drm_sched_wakeup(struct
                drm_gpu_scheduler *sched);<br>
                >   void drm_sched_stop(struct drm_gpu_scheduler
                *sched, struct drm_sched_job *bad);<br>
                >   void drm_sched_start(struct drm_gpu_scheduler
                *sched, bool full_recovery);<br>
                >   void drm_sched_resubmit_jobs(struct
                drm_gpu_scheduler *sched);<br>
                > +void drm_sched_resubmit_jobs2(struct
                drm_gpu_scheduler *sched, int max);<br>
                >   void drm_sched_increase_karma(struct
                drm_sched_job *bad);<br>
                > +void drm_sched_reset_karma(struct drm_sched_job
                *bad);<br>
                >   bool drm_sched_dependency_optimized(struct
                dma_fence* fence,<br>
                >                                    struct
                drm_sched_entity *entity);<br>
                >   void drm_sched_fault(struct drm_gpu_scheduler
                *sched);<br>
                <br>
              </div>
            </span></font></div>
      </div>
    </blockquote>
    <br>
  </body>
</html>