<html><head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body>
<p>Then we forgot the proposal I provided before.</p>
<p>I think the below seq may fixed the race condition issue that we
are facing.<br>
</p>
<p>1. stop scheduling new jobs <br>
</p>
<p> for (i = 0; i < adev->vcn.num_vcn_inst; ++i) {<br>
if (adev->vcn.harvest_config & (1 << i))<br>
continue;<br>
<br>
ring = &adev->vcn.inst[i].ring_dec;<br>
ring->sched.ready = false;<br>
<br>
for (j = 0; j < adev->vcn.num_enc_rings; ++j) {<br>
ring = &adev->vcn.inst[i].ring_enc[j];<br>
ring->sched.ready = false;<br>
}<br>
}</p>
<p>2. cancel_delayed_work_sync(&adev->vcn.idle_work);</p>
<p>3. <font size="2"><span style="font-size:11pt;">SOC15_WAIT_ON_RREG(VCN,
inst_idx, mmUVD_POWER_STATUS, 1,<br>
UVD_POWER_STATUS__UVD_POWER_STATUS_MASK);</span></font></p>
<p><font size="2"><span style="font-size:11pt;">4. </span></font>amdgpu_device_ip_set_powergating_state(adev,
AMD_IP_BLOCK_TYPE_VCN, AMD_PG_STATE_GATE);</p>
<p>5. saved_bo</p>
<p>Best Regards!</p>
<p>James<br>
</p>
<div class="moz-cite-prefix">On 2021-05-17 1:43 p.m., Leo Liu wrote:<br>
</div>
<blockquote type="cite" cite="mid:3c6a0bf3-b4b4-0a93-573e-fd9ae02f16a8@amd.com">
<br>
On 2021-05-17 12:54 p.m., James Zhu wrote:
<br>
<blockquote type="cite">I am wondering if there are still some
jobs kept in the queue, it is lucky to check
<br>
</blockquote>
<br>
Yes it's possible, in this case delayed handler is set, so
cancelling once is enough.
<br>
<br>
<br>
<blockquote type="cite">
<br>
UVD_POWER_STATUS done, but after, fw start a new job that list
in the queue.
<br>
<br>
To handle this situation perfectly, we need add mechanism to
suspend fw first.
<br>
</blockquote>
<br>
I think that should be handled by the sequence from
vcn_v3_0_stop_dpg_mode().
<br>
<br>
<br>
<blockquote type="cite">
<br>
Another case, if it is unlucky, that vcn fw hung at that time,
UVD_POWER_STATUS
<br>
<br>
always keeps busy. then it needs force powering gate the vcn
hw after certain time waiting.
<br>
</blockquote>
<br>
Yep, we still need to gate VCN power after certain timeout.
<br>
<br>
<br>
Regards,
<br>
<br>
Leo
<br>
<br>
<br>
<br>
<blockquote type="cite">
<br>
Best Regards!
<br>
<br>
James
<br>
<br>
On 2021-05-17 12:34 p.m., Leo Liu wrote:
<br>
<blockquote type="cite">
<br>
On 2021-05-17 11:52 a.m., James Zhu wrote:
<br>
<blockquote type="cite">During vcn suspends, stop ring
continue to receive new requests,
<br>
and try to wait for all vcn jobs to finish gracefully.
<br>
<br>
v2: Forced powering gate vcn hardware after few wainting
retry.
<br>
<br>
Signed-off-by: James Zhu <a class="moz-txt-link-rfc2396E" href="mailto:James.Zhu@amd.com"><James.Zhu@amd.com></a>
<br>
---
<br>
drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c | 22
+++++++++++++++++++++-
<br>
1 file changed, 21 insertions(+), 1 deletion(-)
<br>
<br>
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
<br>
index 2016459..9f3a6e7 100644
<br>
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
<br>
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
<br>
@@ -275,9 +275,29 @@ int amdgpu_vcn_suspend(struct
amdgpu_device *adev)
<br>
{
<br>
unsigned size;
<br>
void *ptr;
<br>
+ int retry_max = 6;
<br>
int i;
<br>
-
cancel_delayed_work_sync(&adev->vcn.idle_work);
<br>
+ for (i = 0; i < adev->vcn.num_vcn_inst; ++i) {
<br>
+ if (adev->vcn.harvest_config & (1 <<
i))
<br>
+ continue;
<br>
+ ring = &adev->vcn.inst[i].ring_dec;
<br>
+ ring->sched.ready = false;
<br>
+
<br>
+ for (j = 0; j < adev->vcn.num_enc_rings; ++j)
{
<br>
+ ring = &adev->vcn.inst[i].ring_enc[j];
<br>
+ ring->sched.ready = false;
<br>
+ }
<br>
+ }
<br>
+
<br>
+ while (retry_max-- &&
cancel_delayed_work_sync(&adev->vcn.idle_work))
<br>
+ mdelay(5);
<br>
</blockquote>
<br>
I think it's possible to have one pending job unprocessed with
VCN when suspend sequence getting here, but it shouldn't be
more than one, cancel_delayed_work_sync probably return false
after the first time, so calling cancel_delayed_work_sync once
should be enough here. we probably need to wait longer from:
<br>
<br>
SOC15_WAIT_ON_RREG(VCN, inst_idx, mmUVD_POWER_STATUS, 1,
<br>
UVD_POWER_STATUS__UVD_POWER_STATUS_MASK);
<br>
<br>
to make sure the unprocessed job get done.
<br>
<br>
<br>
Regards,
<br>
<br>
Leo
<br>
<br>
<br>
<blockquote type="cite">+ if (!retry_max &&
!amdgpu_sriov_vf(adev)) {
<br>
+ if (RREG32_SOC15(VCN, i, mmUVD_STATUS)) {
<br>
+ dev_warn(adev->dev, "Forced powering gate
vcn hardware!");
<br>
+ vcn_v3_0_set_powergating_state(adev,
AMD_PG_STATE_GATE);
<br>
+ }
<br>
+ }
<br>
for (i = 0; i < adev->vcn.num_vcn_inst; ++i) {
<br>
if (adev->vcn.harvest_config & (1 <<
i))
<br>
</blockquote>
</blockquote>
</blockquote>
</blockquote>
</body>
</html>