<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;" class=""><br class=""><div><br class=""><blockquote type="cite" class=""><div class="">2025年1月8日 18:02,Lazar, Lijo <<a href="mailto:lijo.lazar@amd.com" class="">lijo.lazar@amd.com</a>> 写道:</div><br class="Apple-interchange-newline"><div class=""><meta charset="UTF-8" class=""><br style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 18px; font-style: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;" class=""><br style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 18px; font-style: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;" class=""><span style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 18px; font-style: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none; float: none; display: inline !important;" class="">On 1/8/2025 2:26 PM, Jiang Liu wrote:</span><br style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 18px; font-style: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;" class=""><blockquote type="cite" style="font-family: Helvetica; font-size: 18px; font-style: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; text-decoration: none;" class="">If error happens before amdgpu_fence_driver_hw_init() gets called during<br class="">device probe, it will trigger a false warning in amdgpu_irq_put() as<br class="">below:<br class="">[ 1209.300996] ------------[ cut here ]------------<br class="">[ 1209.301061] WARNING: CPU: 48 PID: 293 at /tmp/amd.Rc9jFrl7/amd/amdgpu/amdgpu_irq.c:633 amdgpu_irq_put+0x45/0x70 [amdgpu]<br class="">[ 1209.301062] Modules linked in: ...<br class="">[ 1209.301093] CPU: 48 PID: 293 Comm: kworker/48:1 Kdump: loaded Tainted: G        W  OE     5.10.134-17.2.al8.x86_64 #1<br class="">[ 1209.301094] Hardware name: Alibaba Alibaba Cloud ECS/Alibaba Cloud ECS, BIOS 3.0.ES.AL.P.087.05 04/07/2024<br class="">[ 1209.301095] Workqueue: events work_for_cpu_fn<br class="">[ 1209.301159] RIP: 0010:amdgpu_irq_put+0x45/0x70 [amdgpu]<br class="">[ 1209.301160] Code: 48 8b 4e 10 48 83 39 00 74 2c 89 d1 48 8d 04 88 8b 08 85 c9 74 14 f0 ff 08 b8 00 00 00 00 74 05 c3 cc cc cc cc e9 8b fd ff ff <0f> 0b b8 ea ff ff ff c3 cc cc cc cc b8 ea ff ff ff c3 cc cc cc cc<br class="">[ 1209.301162] RSP: 0018:ffffb08a99c8fd88 EFLAGS: 00010246<br class="">[ 1209.301162] RAX: ffff9efe1bcbf500 RBX: ffff9efe1cc3e400 RCX: 0000000000000000<br class="">[ 1209.301163] RDX: 0000000000000000 RSI: ffff9efe1cc3b108 RDI: ffff9efe1cc00000<br class="">[ 1209.301163] RBP: ffff9efe1cc10818 R08: 0000000000000001 R09: 000000000000000d<br class="">[ 1209.301164] R10: ffffb08a99c8fb48 R11: ffffffffa2068018 R12: ffff9efe1cc109d0<br class="">[ 1209.301164] R13: ffff9efe1cc00010 R14: ffff9efe1cc00000 R15: ffff9efe1cc3b108<br class="">[ 1209.301165] FS:  0000000000000000(0000) GS:ffff9ff9fce00000(0000) knlGS:0000000000000000<br class="">[ 1209.301165] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033<br class="">[ 1209.301165] CR2: 00007fd0f6e860d0 CR3: 0000010092baa003 CR4: 0000000002770ee0<br class="">[ 1209.301166] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000<br class="">[ 1209.301166] DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400<br class="">[ 1209.301167] PKRU: 55555554<br class="">[ 1209.301167] Call Trace:<br class="">[ 1209.301225]  amdgpu_fence_driver_hw_fini+0xda/0x110 [amdgpu]<br class="">[ 1209.301284]  amdgpu_device_fini_hw+0xaf/0x200 [amdgpu]<br class="">[ 1209.301342]  amdgpu_driver_load_kms+0x7f/0xc0 [amdgpu]<br class="">[ 1209.301400]  amdgpu_pci_probe+0x1cd/0x4a0 [amdgpu]<br class="">[ 1209.301401]  local_pci_probe+0x40/0xa0<br class="">[ 1209.301402]  work_for_cpu_fn+0x13/0x20<br class="">[ 1209.301403]  process_one_work+0x1ad/0x380<br class="">[ 1209.301404]  worker_thread+0x1c8/0x310<br class="">[ 1209.301405]  ? process_one_work+0x380/0x380<br class="">[ 1209.301406]  kthread+0x118/0x140<br class="">[ 1209.301407]  ? __kthread_bind_mask+0x60/0x60<br class="">[ 1209.301408]  ret_from_fork+0x1f/0x30<br class="">[ 1209.301410] ---[ end trace 733f120fe2ab13e5 ]---<br class="">[ 1209.301418] ------------[ cut here ]------------<br class=""><br class="">Signed-off-by: Jiang Liu <<a href="mailto:gerry@linux.alibaba.com" class="">gerry@linux.alibaba.com</a>><br class="">---<br class="">drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 9 +++++++--<br class="">drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h  | 1 +<br class="">2 files changed, 8 insertions(+), 2 deletions(-)<br class=""><br class="">diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c<br class="">index b5e87b515139..0e41a535e05f 100644<br class="">--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c<br class="">+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c<br class="">@@ -614,9 +614,11 @@ void amdgpu_fence_driver_hw_fini(struct amdgpu_device *adev)<br class=""><br class=""><span class="Apple-tab-span" style="white-space: pre;">      </span><span class="Apple-tab-span" style="white-space: pre;">  </span>if (!drm_dev_is_unplugged(adev_to_drm(adev)) &&<br class=""><span class="Apple-tab-span" style="white-space: pre;">      </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-converted-space"> </span>   ring->fence_drv.irq_src &&<br class="">-<span class="Apple-tab-span" style="white-space: pre;">      </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-converted-space"> </span>   amdgpu_fence_need_ring_interrupt_restore(ring))<br class="">+<span class="Apple-tab-span" style="white-space: pre;">    </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-converted-space"> </span>   ring->fence_drv.irq_enabled) {<br class=""><span class="Apple-tab-span" style="white-space: pre;">   </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span>amdgpu_irq_put(adev, ring->fence_drv.irq_src,<br class=""><span class="Apple-tab-span" style="white-space: pre;">     </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-converted-space"> </span>      ring->fence_drv.irq_type);<br class="">+<span class="Apple-tab-span" style="white-space: pre;">    </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-converted-space"> </span>       ring->fence_drv.irq_enabled = false;<br class="">+<span class="Apple-tab-span" style="white-space: pre;">    </span><span class="Apple-tab-span" style="white-space: pre;">  </span>}<br class=""><br class=""><span class="Apple-tab-span" style="white-space: pre;">       </span><span class="Apple-tab-span" style="white-space: pre;">  </span>del_timer_sync(&ring->fence_drv.fallback_timer);<br class=""><span class="Apple-tab-span" style="white-space: pre;">      </span>}<br class="">@@ -693,9 +695,12 @@ void amdgpu_fence_driver_hw_init(struct amdgpu_device *adev)<br class=""><br class=""><span class="Apple-tab-span" style="white-space: pre;"> </span><span class="Apple-tab-span" style="white-space: pre;">  </span>/* enable the interrupt */<br class=""><span class="Apple-tab-span" style="white-space: pre;">   </span><span class="Apple-tab-span" style="white-space: pre;">  </span>if (ring->fence_drv.irq_src &&<br class="">-<span class="Apple-tab-span" style="white-space: pre;">   </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-converted-space"> </span>   amdgpu_fence_need_ring_interrupt_restore(ring))<br class="">+<span class="Apple-tab-span" style="white-space: pre;">    </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-converted-space"> </span>   !ring->fence_drv.irq_enabled &&<br class="">+<span class="Apple-tab-span" style="white-space: pre;"> </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-converted-space"> </span>   amdgpu_fence_need_ring_interrupt_restore(ring)) {<br class=""><span class="Apple-tab-span" style="white-space: pre;">   </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span>amdgpu_irq_get(adev, ring->fence_drv.irq_src,<br class=""><span class="Apple-tab-span" style="white-space: pre;">     </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-converted-space"> </span>      ring->fence_drv.irq_type);<br class="">+<span class="Apple-tab-span" style="white-space: pre;">    </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-converted-space"> </span>       ring->fence_drv.irq_enabled = true;<br class="">+<span class="Apple-tab-span" style="white-space: pre;">     </span><span class="Apple-tab-span" style="white-space: pre;">  </span>}<br class=""></blockquote><br style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 18px; font-style: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;" class=""><span style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 18px; font-style: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none; float: none; display: inline !important;" class="">I guess the problem is more generic like calling fence driver hw_fini()</span><br style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 18px; font-style: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;" class=""><span style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 18px; font-style: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none; float: none; display: inline !important;" class="">when hw_init is not called.</span><br style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 18px; font-style: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;" class=""><br style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 18px; font-style: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;" class=""></div></blockquote>You are so smart:)<br class=""><div>I’m working on another patch set to fix these generic issues by tweaking the ip block and ras block state machine.</div><div>Thanks,</div><div>Gerry</div><br class=""><blockquote type="cite" class=""><div class=""><span style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 18px; font-style: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none; float: none; display: inline !important;" class="">Thanks,</span><br style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 18px; font-style: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;" class=""><span style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 18px; font-style: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none; float: none; display: inline !important;" class="">Lijo</span><br style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 18px; font-style: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;" class=""><br style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 18px; font-style: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;" class=""><blockquote type="cite" style="font-family: Helvetica; font-size: 18px; font-style: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; text-decoration: none;" class=""><span class="Apple-tab-span" style="white-space: pre;">   </span>}<br class="">}<br class=""><br class="">diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h<br class="">index dee5a1b4e572..959d474a0516 100644<br class="">--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h<br class="">+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h<br class="">@@ -118,6 +118,7 @@ struct amdgpu_fence_driver {<br class=""><span class="Apple-tab-span" style="white-space: pre;">     </span>uint32_t<span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span>sync_seq;<br class=""><span class="Apple-tab-span" style="white-space: pre;">    </span>atomic_t<span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span>last_seq;<br class=""><span class="Apple-tab-span" style="white-space: pre;">    </span>bool<span class="Apple-tab-span" style="white-space: pre;">      </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span>initialized;<br class="">+<span class="Apple-tab-span" style="white-space: pre;">        </span>bool<span class="Apple-tab-span" style="white-space: pre;">      </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span>irq_enabled;<br class=""><span class="Apple-tab-span" style="white-space: pre;"> </span>struct amdgpu_irq_src<span class="Apple-tab-span" style="white-space: pre;">     </span><span class="Apple-tab-span" style="white-space: pre;">  </span>*irq_src;<br class=""><span class="Apple-tab-span" style="white-space: pre;">    </span>unsigned<span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span><span class="Apple-tab-span" style="white-space: pre;">  </span>irq_type;<br class=""><span class="Apple-tab-span" style="white-space: pre;">    </span>struct timer_list<span class="Apple-tab-span" style="white-space: pre;"> </span><span class="Apple-tab-span" style="white-space: pre;">  </span>fallback_timer;</blockquote></div></blockquote></div><br class=""></body></html>