[PATCH v2 1/2] drm/amdgpu: enhance amdgpu_vcn_suspend
James Zhu
jamesz at amd.com
Mon May 17 17:59:33 UTC 2021
Then let's forget the proposal I provided before.
I think the sequence below may fix the race condition issue that we are
facing (a rough sketch combining the steps follows the list):
1. Stop scheduling new jobs:

   for (i = 0; i < adev->vcn.num_vcn_inst; ++i) {
           if (adev->vcn.harvest_config & (1 << i))
                   continue;
           ring = &adev->vcn.inst[i].ring_dec;
           ring->sched.ready = false;
           for (j = 0; j < adev->vcn.num_enc_rings; ++j) {
                   ring = &adev->vcn.inst[i].ring_enc[j];
                   ring->sched.ready = false;
           }
   }

2. cancel_delayed_work_sync(&adev->vcn.idle_work);

3. SOC15_WAIT_ON_RREG(VCN, inst_idx, mmUVD_POWER_STATUS, 1,
                      UVD_POWER_STATUS__UVD_POWER_STATUS_MASK);

4. amdgpu_device_ip_set_powergating_state(adev, AMD_IP_BLOCK_TYPE_VCN,
                                          AMD_PG_STATE_GATE);
5. Save the VCN BO contents (the existing saved_bo step in
   amdgpu_vcn_suspend).
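Putting the five steps together, a minimal sketch of how they could sit
in amdgpu_vcn_suspend() (illustrative only: declarations are gathered at
the top, error handling and the existing saved_bo copy are abbreviated,
and header/file placement is ignored):

    int amdgpu_vcn_suspend(struct amdgpu_device *adev)
    {
            struct amdgpu_ring *ring;
            int i, j;

            /* 1. Stop the scheduler from pushing new jobs to the rings. */
            for (i = 0; i < adev->vcn.num_vcn_inst; ++i) {
                    if (adev->vcn.harvest_config & (1 << i))
                            continue;
                    ring = &adev->vcn.inst[i].ring_dec;
                    ring->sched.ready = false;
                    for (j = 0; j < adev->vcn.num_enc_rings; ++j) {
                            ring = &adev->vcn.inst[i].ring_enc[j];
                            ring->sched.ready = false;
                    }
            }

            /* 2. Make sure the idle worker is not running and cannot
             * re-arm itself afterwards. */
            cancel_delayed_work_sync(&adev->vcn.idle_work);

            /* 3. Wait for the firmware to drain the jobs that were
             * already submitted. */
            for (i = 0; i < adev->vcn.num_vcn_inst; ++i) {
                    if (adev->vcn.harvest_config & (1 << i))
                            continue;
                    SOC15_WAIT_ON_RREG(VCN, i, mmUVD_POWER_STATUS, 1,
                                       UVD_POWER_STATUS__UVD_POWER_STATUS_MASK);
            }

            /* 4. Gate VCN power. */
            amdgpu_device_ip_set_powergating_state(adev, AMD_IP_BLOCK_TYPE_VCN,
                                                   AMD_PG_STATE_GATE);

            /* 5. Save the VCN BO contents (existing saved_bo logic). */
            /* ... unchanged remainder of amdgpu_vcn_suspend() ... */

            return 0;
    }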
Best Regards!
James
On 2021-05-17 1:43 p.m., Leo Liu wrote:
>
> On 2021-05-17 12:54 p.m., James Zhu wrote:
>> I am wondering about the case where some jobs are still kept in the
>> queue: if we are lucky, we see
>
> Yes, it's possible; in this case the delayed handler is set, so
> cancelling once is enough.
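For context, cancel_delayed_work_sync() (see its kerneldoc in
kernel/workqueue.c) waits for a running handler to finish, prevents the
work from re-queueing itself during the cancel, and returns true only if
the work was still pending, so a single call is sufficient once nothing
can schedule new idle work. A minimal illustration:

    /* Returns true iff idle_work was pending when cancelled; any
     * handler that was mid-flight has completed by the time this
     * returns. */
    if (cancel_delayed_work_sync(&adev->vcn.idle_work))
            dev_dbg(adev->dev, "cancelled pending VCN idle work\n");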
>
>
>>
>> UVD_POWER_STATUS report done, but afterwards the firmware starts a
>> new job that was waiting in the queue.
>>
>> To handle this situation properly, we need to add a mechanism to
>> suspend the firmware first.
>
> I think that should be handled by the sequence from
> vcn_v3_0_stop_dpg_mode().
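For reference, that sequence waits for the firmware power status and for
the ring read pointers to catch up with the write pointers before
disabling dynamic power gating. A paraphrased sketch (from memory, not a
verbatim copy of vcn_v3_0.c; only one ring wait is shown):

    static int vcn_v3_0_stop_dpg_mode_sketch(struct amdgpu_device *adev,
                                             int inst_idx)
    {
            uint32_t tmp;

            /* Wait for the power status to report "done". */
            SOC15_WAIT_ON_RREG(VCN, inst_idx, mmUVD_POWER_STATUS, 1,
                               UVD_POWER_STATUS__UVD_POWER_STATUS_MASK);

            /* Wait for the read pointer to reach the write pointer,
             * i.e. for queued work on the ring to drain. */
            tmp = RREG32_SOC15(VCN, inst_idx, mmUVD_RB_WPTR);
            SOC15_WAIT_ON_RREG(VCN, inst_idx, mmUVD_RB_RPTR, tmp, 0xFFFFFFFF);

            /* Disable dynamic power gating mode. */
            WREG32_P(SOC15_REG_OFFSET(VCN, inst_idx, mmUVD_POWER_STATUS), 0,
                     ~UVD_POWER_STATUS__UVD_PG_MODE_MASK);

            return 0;
    }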
>
>
>>
>> Another case: if we are unlucky and the VCN firmware hangs at that
>> time, UVD_POWER_STATUS always stays busy. Then we need to force
>> power-gating of the VCN hardware after waiting a certain amount of
>> time.
>
> Yep, we still need to gate VCN power after a certain timeout.
>
>
> Regards,
>
> Leo
>
>
>
>>
>> Best Regards!
>>
>> James
>>
>> On 2021-05-17 12:34 p.m., Leo Liu wrote:
>>>
>>> On 2021-05-17 11:52 a.m., James Zhu wrote:
>>>> During VCN suspend, stop the rings from receiving new requests,
>>>> and try to wait for all VCN jobs to finish gracefully.
>>>>
>>>> v2: Force power-gating of the VCN hardware after a few waiting
>>>> retries.
>>>>
>>>> Signed-off-by: James Zhu <James.Zhu at amd.com>
>>>> ---
>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c | 22 +++++++++++++++++++++-
>>>> 1 file changed, 21 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
>>>> index 2016459..9f3a6e7 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
>>>> @@ -275,9 +275,29 @@ int amdgpu_vcn_suspend(struct amdgpu_device *adev)
>>>> {
>>>>      unsigned size;
>>>>      void *ptr;
>>>> +    int retry_max = 6;
>>>>      int i;
>>>> -    cancel_delayed_work_sync(&adev->vcn.idle_work);
>>>> +    for (i = 0; i < adev->vcn.num_vcn_inst; ++i) {
>>>> +        if (adev->vcn.harvest_config & (1 << i))
>>>> +            continue;
>>>> +        ring = &adev->vcn.inst[i].ring_dec;
>>>> +        ring->sched.ready = false;
>>>> +
>>>> +        for (j = 0; j < adev->vcn.num_enc_rings; ++j) {
>>>> +            ring = &adev->vcn.inst[i].ring_enc[j];
>>>> +            ring->sched.ready = false;
>>>> +        }
>>>> +    }
>>>> +
>>>> +    while (retry_max-- && cancel_delayed_work_sync(&adev->vcn.idle_work))
>>>> +        mdelay(5);
>>>
>>> I think it's possible to have one pending job left unprocessed by
>>> the VCN when the suspend sequence gets here, but it shouldn't be more
>>> than one; cancel_delayed_work_sync probably returns false after the
>>> first call, so calling cancel_delayed_work_sync once should be enough
>>> here. We probably need to wait longer in:
>>>
>>> SOC15_WAIT_ON_RREG(VCN, inst_idx, mmUVD_POWER_STATUS, 1,
>>>                    UVD_POWER_STATUS__UVD_POWER_STATUS_MASK);
>>>
>>> to make sure the unprocessed job gets done.
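If the wait needs to be longer than the adev->usec_timeout-bounded
window that SOC15_WAIT_ON_RREG() uses, one option is an explicit retry
loop; a hypothetical sketch (helper name, retry count, and delay are
made up for illustration):

    static int vcn_wait_power_status_gated(struct amdgpu_device *adev,
                                           int inst_idx)
    {
            int retries = 100;

            while (retries--) {
                    uint32_t status = RREG32_SOC15(VCN, inst_idx,
                                                   mmUVD_POWER_STATUS);

                    /* 1 is the expected "done" value used in the
                     * waits discussed above. */
                    if ((status & UVD_POWER_STATUS__UVD_POWER_STATUS_MASK) == 1)
                            return 0;
                    msleep(5);
            }
            return -ETIMEDOUT;
    }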
>>>
>>>
>>> Regards,
>>>
>>> Leo
>>>
>>>
>>>> +    if (!retry_max && !amdgpu_sriov_vf(adev)) {
>>>> +        if (RREG32_SOC15(VCN, i, mmUVD_STATUS)) {
>>>> +            dev_warn(adev->dev, "Forced powering gate vcn hardware!");
>>>> +            vcn_v3_0_set_powergating_state(adev, AMD_PG_STATE_GATE);
>>>> +        }
>>>> +    }
>>>>      for (i = 0; i < adev->vcn.num_vcn_inst; ++i) {
>>>>          if (adev->vcn.harvest_config & (1 << i))