<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<p><br>
</p>
<div class="moz-cite-prefix">On 10/25/19 11:57 AM, Koenig, Christian wrote:<br>
</div>
<blockquote type="cite" cite="mid:2e2ebf73-9a25-5ad2-78e7-07c8b1db1b37@amd.com">
<div class="moz-cite-prefix">Am 25.10.19 um 17:35 schrieb Grodzovsky, Andrey:<br>
</div>
<blockquote type="cite" cite="mid:971115b1-6208-1dd5-d99f-c9377663a80b@amd.com">
<p><br>
</p>
<div class="moz-cite-prefix">On 10/25/19 5:26 AM, Koenig, Christian wrote:<br>
</div>
<blockquote type="cite" cite="mid:a9789f76-3ba5-ad71-1507-5eac6f589b82@amd.com">
<div class="moz-cite-prefix">Am 25.10.19 um 11:22 schrieb S, Shirish:<br>
</div>
<blockquote type="cite" cite="mid:a1c31f37-128f-51b1-f747-fe75d78d4214@amd.com">
<p><br>
</p>
<div class="moz-cite-prefix">On 10/25/2019 2:23 PM, Koenig, Christian wrote:<br>
</div>
<blockquote type="cite" cite="mid:2505c476-9e10-f70e-355c-33765a37d607@amd.com">
<div class="moz-cite-prefix">
<blockquote type="cite">
<p>amdgpu_do_asic_reset starting to resume blocks<br>
</p>
<p>...<br>
</p>
<p>amdgpu 0000:03:00.0: couldn't schedule ib on ring <sdma0><br>
[drm:amdgpu_job_run] *ERROR* Error scheduling IBs (-22)<br>
...</p>
<p>amdgpu_device_ip_resume_phase2 resumed gfx_v9_0<br>
</p>
<p>amdgpu_device_ip_resume_phase2 resumed sdma_v4_0</p>
amdgpu_device_ip_resume_phase2 resumed powerplay</blockquote>
<br>
This is the root of the problem.<br>
<br>
The scheduler should never be resumed before we are done bringing the hardware back into a usable state.<br>
</div>
</blockquote>
<p>I don't see the scheduler being resumed when the IB is scheduled; it is resumed well after the hardware is ready in the reset code path.<br>
</p>
<p>Below are the logs:</p>
<blockquote>amdgpu 0000:03:00.0: GPU reset begin!<br>
amdgpu_device_gpu_recover calling drm_sched_stop             <==<br>
...<br>
amdgpu 0000:03:00.0: couldn't schedule ib on ring <sdma0><br>
[drm:amdgpu_job_run] *ERROR* Error scheduling IBs (-22)<br>
...<br>
amdgpu_device_ip_resume_phase2 resumed sdma_v4_0<br>
amdgpu_device_ip_resume_phase2 resumed powerplay<br>
amdgpu_device_ip_resume_phase2 resumed dm<br>
...<br>
[drm] recover vram bo from shadow done<br>
amdgpu_device_gpu_recover calling  drm_sched_start         <==<br>
...</blockquote>
<p>As shown in the call trace, drm_sched_main() is responsible for this job_run, which seems to be called during cleanup.</p>
</blockquote>
<br>
Then the scheduler isn't stopped for some reason and we need to investigate why.<br>
<br>
We used to have another kthread_park()/unpark() in drm_sched_entity_fini(), maybe an application is crashing while we are trying to reset the GPU?<br>
</blockquote>
<p><br>
</p>
<p>We still have it, and doesn't doing kthread_park()/unpark() from drm_sched_entity_fini() while a GPU reset is in progress defeat the whole purpose of drm_sched_stop()->kthread_park()? If drm_sched_entity_fini()->kthread_unpark() happens AFTER drm_sched_stop()->kthread_park(),
 nothing prevents another (third) thread from continuing to submit jobs to the HW; those jobs will be picked up by the unparked scheduler thread, which will try to submit them to the HW but fail because the HW ring is deactivated.
<br>
</p>
<p>If so, maybe we should serialize calls to kthread_park()/unpark(sched->thread)?<br>
</p>
</blockquote>
<br>
Yeah, that was my thinking as well. Probably best to just grab the reset lock before calling drm_sched_entity_fini().<br>
</blockquote>
<p><br>
</p>
<p>Shirish - please try locking &adev->lock_reset around calls to drm_sched_entity_fini as Christian suggests and see if this actually helps the issue.</p>
<p>Andrey</p>
<p><br>
</p>
<blockquote type="cite" cite="mid:2e2ebf73-9a25-5ad2-78e7-07c8b1db1b37@amd.com"><br>
Alternatively, I think we could change the kthread_park()/unpark() to a wait_event_.... in drm_sched_entity_fini().<br>
<br>
Regards,<br>
Christian.<br>
<br>
<blockquote type="cite" cite="mid:971115b1-6208-1dd5-d99f-c9377663a80b@amd.com">
<p>Andrey</p>
<p><br>
</p>
<blockquote type="cite" cite="mid:a9789f76-3ba5-ad71-1507-5eac6f589b82@amd.com"><br>
That would be rather unlikely, and it would be rather hard to reproduce, but it's currently my best bet for what's going wrong here.<br>
<br>
Regards,<br>
Christian.<br>
<br>
<blockquote type="cite" cite="mid:a1c31f37-128f-51b1-f747-fe75d78d4214@amd.com">
<p>Regards,</p>
<p>Shirish S<br>
</p>
<blockquote type="cite" cite="mid:2505c476-9e10-f70e-355c-33765a37d607@amd.com">
<div class="moz-cite-prefix"><br>
Regards,<br>
Christian.<br>
<br>
Am 25.10.19 um 10:50 schrieb S, Shirish:<br>
</div>
<blockquote type="cite" cite="mid:b5e99dc3-5658-7e48-63f7-bf9533f582f8@amd.com">
<p>Here is the call trace:</p>
<blockquote>
<p>Call Trace:<br>
 dump_stack+0x4d/0x63<br>
 amdgpu_ib_schedule+0x86/0x4b7<br>
 ? __mod_timer+0x21e/0x244<br>
 amdgpu_job_run+0x108/0x178<br>
 drm_sched_main+0x253/0x2fa<br>
 ? remove_wait_queue+0x51/0x51<br>
 ? drm_sched_cleanup_jobs.part.12+0xda/0xda<br>
 kthread+0x14f/0x157<br>
 ? kthread_park+0x86/0x86<br>
 ret_from_fork+0x22/0x40<br>
amdgpu 0000:03:00.0: couldn't schedule ib on ring <sdma0><br>
[drm:amdgpu_job_run] *ERROR* Error scheduling IBs (-22)<br>
</p>
</blockquote>
<br>
printed via the change below:
<blockquote>
<p>@@ -151,6 +152,10 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, unsigned num_ibs,<br>
        }<br>
<br>
        if (!ring->sched.ready) {<br>
+              dump_stack();<br>
                dev_err(adev->dev, "couldn't schedule ib on ring <%s>\n", ring->name);<br>
                return -EINVAL;<br>
</p>
</blockquote>
<br>
<div class="moz-cite-prefix">On 10/24/2019 10:00 PM, Christian König wrote:<br>
</div>
<blockquote type="cite" cite="mid:d573688c-0997-1928-0c56-b60a29ff7fde@gmail.com">
<div class="moz-cite-prefix">Am 24.10.19 um 17:06 schrieb Grodzovsky, Andrey:<br>
</div>
<blockquote type="cite" cite="mid:f3be329d-d350-c821-00b7-d94858335796@amd.com">
<p><br>
</p>
<div class="moz-cite-prefix">On 10/24/19 7:01 AM, Christian König wrote:<br>
</div>
<blockquote type="cite" cite="mid:23ea615d-5ef4-d0b3-a0ec-6fae67b102f2@gmail.com">
Am 24.10.19 um 12:58 schrieb S, Shirish: <br>
<blockquote type="cite">[Why] <br>
Upon GPU reset, the kernel cleans up already-submitted jobs <br>
via drm_sched_cleanup_jobs. <br>
This schedules IBs via drm_sched_main()->run_job, leading to a <br>
race condition on whether the rings are ready, since rings <br>
may be suspended during reset. <br>
</blockquote>
<br>
NAK, exactly that's what should not happen. <br>
<br>
The scheduler should be suspended while a GPU reset is in progress. <br>
<br>
So you are running into a completely different race here. <br>
</blockquote>
</blockquote>
</blockquote>
<p>Below is the series of events when the issue occurs.</p>
<p>(Note that, as you & Andrey mentioned, the scheduler has been suspended, but the job is scheduled nonetheless.)<br>
</p>
<blockquote>
<p>amdgpu 0000:03:00.0: GPU reset begin!</p>
<p>...</p>
<p>amdgpu_device_gpu_recover stopping ring sdma0 via drm_sched_stop</p>
<p>...</p>
amdgpu 0000:03:00.0: GPU reset succeeded, trying to resume
<p>amdgpu_do_asic_reset starting to resume blocks<br>
</p>
<p>...<br>
</p>
<p>amdgpu 0000:03:00.0: couldn't schedule ib on ring <sdma0><br>
[drm:amdgpu_job_run] *ERROR* Error scheduling IBs (-22)<br>
...</p>
<p>amdgpu_device_ip_resume_phase2 resumed gfx_v9_0<br>
</p>
<p>amdgpu_device_ip_resume_phase2 resumed sdma_v4_0</p>
<p>amdgpu_device_ip_resume_phase2 resumed powerplay<br>
</p>
<p>...<br>
</p>
</blockquote>
<p>FWIW, since job is always NULL when "drm_sched_stop(&ring->sched, job ? &job->base : NULL);" is called during reset, all drm_sched_stop() does</p>
<p>is cancel the delayed work and park sched->thread. There is no job list to iterate over to deactivate, remove, or update fences.</p>
<p>Based on all this analysis, adding a mutex is more failsafe and less intrusive in the current code flow, and it also seems logical, hence this approach.</p>
<br>
<blockquote type="cite" cite="mid:d573688c-0997-1928-0c56-b60a29ff7fde@gmail.com">
<blockquote type="cite" cite="mid:f3be329d-d350-c821-00b7-d94858335796@amd.com">
<blockquote type="cite" cite="mid:23ea615d-5ef4-d0b3-a0ec-6fae67b102f2@gmail.com">
<br>
Please sync up with Andrey on how this was able to happen. <br>
<br>
Regards, <br>
Christian. <br>
</blockquote>
<p><br>
</p>
<p>Shirish - Christian makes a good point - note that in amdgpu_device_gpu_recover, drm_sched_stop, which stops all the scheduler threads, is called way before we suspend the HW in amdgpu_device_pre_asic_reset->amdgpu_device_ip_suspend, where the SDMA suspension
 happens and where the HW ring is marked as not ready - please provide the call stack for when you hit
[drm:amdgpu_job_run] *ERROR* Error scheduling IBs (-22) to identify the code path which tried to submit the SDMA IB<br>
</p>
</blockquote>
<br>
Well the most likely cause of this is that the hardware failed to resume after the reset.<br>
</blockquote>
<p>In fact, hardware resume has not yet started when the job is scheduled, which is the race I am trying to address with this patch.</p>
<p>Regards,</p>
<p>Shirish S<br>
</p>
<blockquote type="cite" cite="mid:d573688c-0997-1928-0c56-b60a29ff7fde@gmail.com">
<br>
Christian.<br>
<br>
<blockquote type="cite" cite="mid:f3be329d-d350-c821-00b7-d94858335796@amd.com">
<p>Andrey</p>
<p><br>
</p>
<blockquote type="cite" cite="mid:23ea615d-5ef4-d0b3-a0ec-6fae67b102f2@gmail.com">
<br>
<blockquote type="cite"><br>
[How] <br>
make GPU reset's amdgpu_device_ip_resume_phase2() & <br>
amdgpu_ib_schedule() in amdgpu_job_run() mutually exclusive. <br>
<br>
Signed-off-by: Shirish S <a class="moz-txt-link-rfc2396E" href="mailto:shirish.s@amd.com" moz-do-not-send="true">
<shirish.s@amd.com></a> <br>
--- <br>
  drivers/gpu/drm/amd/amdgpu/amdgpu.h        | 1 + <br>
  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 3 +++ <br>
  drivers/gpu/drm/amd/amdgpu/amdgpu_job.c    | 2 ++ <br>
  3 files changed, 6 insertions(+) <br>
<br>
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
<br>
index f4d9041..7b07a47b 100644 <br>
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h <br>
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h <br>
@@ -973,6 +973,7 @@ struct amdgpu_device { <br>
      bool                            in_gpu_reset; <br>
      enum pp_mp1_state               mp1_state; <br>
      struct mutex  lock_reset; <br>
+    struct mutex  lock_ib_sched; <br>
      struct amdgpu_doorbell_index doorbell_index; <br>
        int asic_reset_res; <br>
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
<br>
index 676cad1..63cad74 100644 <br>
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c <br>
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c <br>
@@ -2759,6 +2759,7 @@ int amdgpu_device_init(struct amdgpu_device *adev, <br>
      mutex_init(&adev->virt.vf_errors.lock); <br>
      hash_init(adev->mn_hash); <br>
      mutex_init(&adev->lock_reset); <br>
+    mutex_init(&adev->lock_ib_sched); <br>
      mutex_init(&adev->virt.dpm_mutex); <br>
      mutex_init(&adev->psp.mutex); <br>
  @@ -3795,7 +3796,9 @@ static int amdgpu_do_asic_reset(struct amdgpu_hive_info *hive,
<br>
                  if (r) <br>
                      return r; <br>
  +                mutex_lock(&tmp_adev->lock_ib_sched); <br>
                  r = amdgpu_device_ip_resume_phase2(tmp_adev); <br>
+                mutex_unlock(&tmp_adev->lock_ib_sched); <br>
                  if (r) <br>
                      goto out; <br>
  diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
<br>
index e1bad99..cd6082d 100644 <br>
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c <br>
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c <br>
@@ -233,8 +233,10 @@ static struct dma_fence *amdgpu_job_run(struct drm_sched_job *sched_job)
<br>
      if (finished->error < 0) { <br>
          DRM_INFO("Skip scheduling IBs!\n"); <br>
      } else { <br>
+        mutex_lock(&ring->adev->lock_ib_sched); <br>
          r = amdgpu_ib_schedule(ring, job->num_ibs, job->ibs, job, <br>
                         &fence); <br>
+        mutex_unlock(&ring->adev->lock_ib_sched); <br>
          if (r) <br>
              DRM_ERROR("Error scheduling IBs (%d)\n", r); <br>
      } <br>
</blockquote>
<br>
</blockquote>
<br>
<fieldset class="mimeAttachmentHeader"></fieldset>
<pre class="moz-quote-pre" wrap="">_______________________________________________
amd-gfx mailing list
<a class="moz-txt-link-abbreviated" href="mailto:amd-gfx@lists.freedesktop.org" moz-do-not-send="true">amd-gfx@lists.freedesktop.org</a>
<a class="moz-txt-link-freetext" href="https://lists.freedesktop.org/mailman/listinfo/amd-gfx" moz-do-not-send="true">https://lists.freedesktop.org/mailman/listinfo/amd-gfx</a></pre>
</blockquote>
<br>
</blockquote>
</blockquote>
<br>
</blockquote>
</blockquote>
<br>
</blockquote>
</blockquote>
<br>
</blockquote>
</body>
</html>