<html><head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body>
Am 25.12.21 um 03:56 schrieb 周雪梅:<br>
<blockquote type="cite" cite="mid:AFYAowAGB9CyGd23omXNsqrq.1.1640400988663.Hmail.zhouxuemei@wxiat.com">
<div style="">
<div style="">
<div style="">Although the radeon driver emits fences and waits for the GPU to
finish processing the current batch on each ring,</div>
<div style="">there is still a corner case in which the radeon lockup
work queue has not been fully flushed</div>
<div style="">by the time radeon_suspend_kms() has called
pci_set_power_state() to</div>
<div style="">put the device into the D3hot state.</div>
<div style=""><br>
</div>
<div style="">Per PCI spec rev 4.0, section 5.3.1.4.1 (D3hot State):</div>
<div style="">> Configuration and Message requests are the
only TLPs accepted by a Function in</div>
<div style="">> the D3hot state. All other received
Requests must be handled as Unsupported Requests,</div>
<div style="">> and all received Completions may optionally
be handled as Unexpected Completions.</div>
</div>
</div>
</blockquote>
<br>
Well, first of all, this is completely the wrong place for this. The
flush belongs in the fence code, not here.<br>
<br>
Second, I don't think this is a good idea, since it might cause
deadlocks.<br>
<br>
Christian.<br>
<br>
<br>
<blockquote type="cite" cite="mid:AFYAowAGB9CyGd23omXNsqrq.1.1640400988663.Hmail.zhouxuemei@wxiat.com">
<div style="">
<div style="">
<div style=""><br>
</div>
<div style="">The issue shows up in the following log:</div>
<div style=""><br>
</div>
<div style="">1Unable to handle kernel paging request at
virtual address 00008800e0008010</div>
<div style="">CPU 0 kworker/0:3(131): Oops 0</div>
<div style="">pc = [<ffffffff811bea5c>] ra =
[<ffffffff81240844>] ps = 0000 Tainted: G W</div>
<div style="">pc is at si_gpu_check_soft_reset+0x3c/0x240</div>
<div style="">ra is at si_dma_is_lockup+0x34/0xd0</div>
<div style="">v0 = 0000000000000000 t0 = fff08800e0008010 t1
= 0000000000010000</div>
<div style="">t2 = 0000000000008010 t3 = fff00007e3c00000 t4
= fff00007e3c00258</div>
<div style="">t5 = 000000000000ffff t6 = 0000000000000001 t7
= fff00007ef078000</div>
<div style="">s0 = fff00007e3c016e8 s1 = fff00007e3c00000 s2
= fff00007e3c00018</div>
<div style="">s3 = fff00007e3c00000 s4 = fff00007fff59d80 s5
= 0000000000000000</div>
<div style="">s6 = fff00007ef07bd98</div>
<div style="">a0 = fff00007e3c00000 a1 = fff00007e3c016e8 a2
= 0000000000000008</div>
<div style="">a3 = 0000000000000001 a4 = 8f5c28f5c28f5c29 a5
= ffffffff810f4338</div>
<div style="">t8 = 0000000000000275 t9 = ffffffff809b66f8
t10 = ff6769c5d964b800</div>
<div style="">t11= 000000000000b886 pv = ffffffff811bea20 at
= 0000000000000000</div>
<div style="">gp = ffffffff81d89690 sp = 00000000aa814126</div>
<div style="">4Disabling lock debugging due to kernel taint</div>
<div style="">Trace:</div>
<div style="">[<ffffffff81240844>]
si_dma_is_lockup+0x34/0xd0</div>
<div style="">[<ffffffff81119610>]
radeon_fence_check_lockup+0xd0/0x290</div>
<div style="">[<ffffffff80977010>]
process_one_work+0x280/0x550</div>
<div style="">[<ffffffff80977350>]
worker_thread+0x70/0x7c0</div>
<div style="">[<ffffffff80977410>]
worker_thread+0x130/0x7c0</div>
<div style="">[<ffffffff80982040>] kthread+0x200/0x210</div>
<div style="">[<ffffffff809772e0>]
worker_thread+0x0/0x7c0</div>
<div style="">[<ffffffff80981f8c>] kthread+0x14c/0x210</div>
<div style="">[<ffffffff80911658>]
ret_from_kernel_thread+0x18/0x20</div>
<div style="">[<ffffffff80981e40>] kthread+0x0/0x210</div>
<div style=""><br>
</div>
<div style=""> Code: ad3e0008 43f0074a ad7e0018 ad9e0020
8c3001e8 40230101</div>
<div style=""> <88210000> 4821ed21</div>
<div style=""><br>
</div>
<div style="">So force a flush of the lockup work queue to fix this
problem.</div>
<div style=""><br>
</div>
<div style="">Reviewed-by: Su Weiqiang
<a class="moz-txt-link-rfc2396E" href="mailto:suweiqiang@wxiat.com"><suweiqiang@wxiat.com></a></div>
<div style="">Reviewed-by: Zhou Xuemei
<a class="moz-txt-link-rfc2396E" href="mailto:zhouxuemei@wxiat.com"><zhouxuemei@wxiat.com></a></div>
<div style="">Signed-off-by: Xu Chenjiao
<a class="moz-txt-link-rfc2396E" href="mailto:xuchenjiao@wxiat.com"><xuchenjiao@wxiat.com></a></div>
<div style="">---</div>
<div style=""> drivers/gpu/drm/radeon/radeon_device.c | 3 +++</div>
<div style=""> 1 file changed, 3 insertions(+)</div>
<div style=""><br>
</div>
<div style="">diff --git
a/drivers/gpu/drm/radeon/radeon_device.c
b/drivers/gpu/drm/radeon/radeon_device.c</div>
<div style="">index 59c8a6647ff2..cc1c07963116 100644</div>
<div style="">--- a/drivers/gpu/drm/radeon/radeon_device.c</div>
<div style="">+++ b/drivers/gpu/drm/radeon/radeon_device.c</div>
<div style="">@@ -1625,6 +1625,9 @@ int
radeon_suspend_kms(struct drm_device *dev, bool suspend,</div>
<div style=""> <span style="white-space:pre"> </span>if (r) {</div>
<div style=""> <span style="white-space:pre"> </span>/*
delay GPU reset to resume */</div>
<div style=""> <span style="white-space:pre"> </span>radeon_fence_driver_force_completion(rdev,
i);</div>
<div style="">+<span style="white-space:pre"> </span>} else {</div>
<div style="">+<span style="white-space:pre"> </span>/*
finish executing delayed work */</div>
<div style="">+<span style="white-space:pre"> </span>flush_delayed_work(&rdev->fence_drv[i].lockup_work);</div>
<div style=""> <span style="white-space:pre"> </span>}</div>
<div style=""> <span style="white-space:pre"> </span>}</div>
<div style=""> </div>
<div style="">-- </div>
<div style="">2.17.1</div>
</div>
</div>
</blockquote>
<br>
</body>
</html>