<!DOCTYPE html><html><head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
  </head>
  <body>
    On 12.03.25 at 09:15, Zhang, Jesse(Jie) wrote:<br>
    <blockquote type="cite" cite="mid:DM4PR12MB515277590EC11D3757BB448FE3D02@DM4PR12MB5152.namprd12.prod.outlook.com">[SNIP]<span style="white-space: pre-wrap">
</span>
      <blockquote type="cite">
        <pre class="moz-quote-pre" wrap="">-
+     gfx_ring->funcs->stop_queue(adev, instance_id);
</pre>
      </blockquote>
      <pre class="moz-quote-pre" wrap="">
Yeah that starts to look good. Question here is who is calling amdgpu_sdma_reset_engine()?

If this call comes from engine specific code we might not need the start/stop_queue callbacks altogether.

    KFD and sdma v4/v5/v5_2 will call amdgpu_sdma_reset_engine, and the start/stop_queue callbacks are only implemented in sdma v4/v5/v5_2.</pre>
    </blockquote>
    <br>
    Why would the KFD call this as well? Because it detects an issue
    with an SDMA user queue? If so, I would rather suggest that the KFD
    calls the reset function of the paging queue.<br>
    <br>
    Since this reset function is specific to the SDMA HW generation
    anyway, you don't need those extra callbacks to abstract starting
    and stopping the queue for each HW generation.<br>
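    <br>
    Roughly what I have in mind, as an untested sketch and not actual
    driver code: the struct and every sdma_vX_* / common_* name below
    are made up for illustration. The generation-specific reset
    function calls its own stop/start helpers directly around the
    common engine reset, so no per-ring callbacks are needed:<br>
    <pre>
```c
/* Hypothetical stand-in for one SDMA instance; not the real amdgpu structs. */
struct sdma_instance {
	int queue_running;	/* 1 while the queue is active */
	int reset_count;	/* number of engine resets performed */
};

/* Generation-specific helpers, called directly instead of via ring funcs. */
static void sdma_vX_stop_queue(struct sdma_instance *inst)
{
	inst->queue_running = 0;
}

static void sdma_vX_start_queue(struct sdma_instance *inst)
{
	inst->queue_running = 1;
}

/* Stand-in for the common, generation-independent engine reset step. */
static int common_engine_reset(struct sdma_instance *inst)
{
	if (inst->queue_running)
		return -1;	/* refuse to reset while the queue is running */
	inst->reset_count++;
	return 0;
}

/* Generation-specific entry point: stop, reset, then restart on success. */
static int sdma_vX_reset_queue(struct sdma_instance *inst)
{
	int ret;

	sdma_vX_stop_queue(inst);
	ret = common_engine_reset(inst);
	if (ret)
		return ret;
	sdma_vX_start_queue(inst);
	return 0;
}
```
</pre>
    With that layout the common code only has to provide the engine
    reset itself, and each HW generation decides on its own how to
    quiesce and restore its queues.<br>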
    <br>
    Regards,<br>
    Christian.<br>
    <br>
    <blockquote type="cite" cite="mid:DM4PR12MB515277590EC11D3757BB448FE3D02@DM4PR12MB5152.namprd12.prod.outlook.com">
      <pre class="moz-quote-pre" wrap="">

Thanks
Jesse

Regards,
Christian.

</pre>
      <blockquote type="cite">
        <pre class="moz-quote-pre" wrap="">      /* Perform the SDMA reset for the specified instance */
      ret = amdgpu_dpm_reset_sdma(adev, 1 << instance_id);
      if (ret) {
@@ -591,18 +573,7 @@ int amdgpu_sdma_reset_engine(struct amdgpu_device *adev, uint32_t instance_id, b
              goto exit;
      }

-     /* Invoke all registered post_reset callbacks */
-     list_for_each_entry(funcs, &adev->sdma.reset_callback_list, list) {
-             if (funcs->post_reset) {
-                     ret = funcs->post_reset(adev, instance_id);
-                     if (ret) {
-                             dev_err(adev->dev,
-                             "afterReset callback failed for instance %u: %d\n",
-                                     instance_id, ret);
-                             goto exit;
-                     }
-             }
-     }
+     gfx_ring->funcs->start_queue(adev, instance_id);

 exit:
      /* Restart the scheduler's work queue for the GFX and page rings
diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c b/drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c
index fd34dc138081..c1f7ccff9c4e 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c
@@ -2132,6 +2132,8 @@ static const struct amdgpu_ring_funcs sdma_v4_4_2_ring_funcs = {
      .emit_reg_wait = sdma_v4_4_2_ring_emit_reg_wait,
      .emit_reg_write_reg_wait = amdgpu_ring_emit_reg_write_reg_wait_helper,
      .reset = sdma_v4_4_2_reset_queue,
+     .stop_queue = sdma_v4_4_2_stop_queue,
+     .start_queue = sdma_v4_4_2_restore_queue,
      .is_guilty = sdma_v4_4_2_ring_is_guilty,
 };

</pre>
      </blockquote>
      <pre class="moz-quote-pre" wrap="">
</pre>
    </blockquote>
    <br>
  </body>
</html>